Journal Club

The IAIFI Journal Club is open to IAIFI members and affiliates.

Upcoming Journal Clubs

Past Journal Clubs

Spring 2025

Aishik Ghosh, Postdoc, University of California Irvine
- Tuesday, May 13, 2025, 1:00pm–2:00pm, IAIFI Penthouse
- Probing High-Dimensional Spaces: From Theory Design in Phenomenology to Parameter Inference in ATLAS
- When confronted with extremely high-dimensional problems, physicists traditionally reduce the challenge to a lower dimensional representation where they can build intuition. For example, experimental particle physicists may condense hundreds of millions of dimensions from detector readouts into a one-dimensional histogram of reconstructed particle energy. I will demonstrate that such dramatic data reduction makes it impossible to capture all the relevant information necessary for optimal statistical inference for the Higgs width measurement at the LHC. We design a generalisation of traditional methodology in ATLAS to perform statistical inference directly on high-dimensional data, enabled by powerful uncertainty quantification and propagation tools. This leads to the most precise measurement of the Higgs width by the experiment to date. Similarly, a significant challenge in theoretical physics is the vast mathematical space of potential theories to describe our Universe. I lead the design of a more efficient framework for model building in neutrino physics, leveraging newly available computational and AI tools to uncover new avenues for neutrino theory model building.
- Talk Slides (for IAIFI members only)
Shannon Greco, Science Education Senior Program Leader, Princeton Plasma Physics Laboratory
- Tuesday, May 6, 2025, 1:00pm–2:00pm, IAIFI Penthouse
- Bridging the Gap: Culturally Responsive Science Communication for Early Career Scientists
- Effective science communication is essential for fostering trust, relevance, and impact across diverse communities. This interactive workshop equips early career scientists with practical tools to communicate their research clearly and compellingly to a variety of audiences, from policymakers to the public. With a focus on two-way, culturally responsive engagement, participants will explore how to listen actively, adapt messages to different contexts, and co-create understanding with their audiences. The workshop models two-way engagement throughout, responding to participant needs and incorporating their experiences into real-time practice. Attendees will leave with enhanced communication skills and a framework for building authentic, inclusive connections through their scientific work. Note: The speaker will be on Zoom, but we will still serve lunch in the Penthouse for those who would like to participate in person.
- Talk Slides (for IAIFI members only)
Michael Winer, Scholar, Institute for Advanced Study
- Tuesday, April 29, 2025, 1:00pm–2:00pm, IAIFI Penthouse
- Deep Neural Nets as Hamiltonians
- Neural networks are complex functions of both their inputs and parameters. Much prior work in deep learning theory analyzes the distribution of network outputs at a fixed set of inputs (e.g. a training dataset) over random initializations of the network parameters. The purpose of this article is to consider the opposite situation: we view a randomly initialized Multi-Layer Perceptron (MLP) as a Hamiltonian over its inputs. For typical realizations of the network parameters, we study the properties of the energy landscape induced by this Hamiltonian, focusing on the structure of near-global minima in the limit of infinite width. Specifically, we use the replica trick to perform an exact analytic calculation giving the entropy (log volume of space) at a given energy. We further derive saddle point equations that describe the overlaps between inputs sampled iid from the Gibbs distribution induced by the random MLP. For linear activations we solve these saddle point equations exactly. But we also solve them numerically for a variety of depths and activation functions, including tanh, sin, ReLU, and shaped non-linearities. We find even at infinite width a rich range of behaviors. For some non-linearities, such as sin, we find that the landscapes of random MLPs exhibit full replica symmetry breaking, while shallow tanh and ReLU networks or deep shaped MLPs are instead replica symmetric.
- Talk Slides (for IAIFI members only)
Ahmed Youssef, PhD Candidate, The University of Cincinnati
- Tuesday, April 15, 2025, 1:00pm–2:00pm, IAIFI Penthouse
- Learning to Hadronize: Machine Learning at the Frontier of Particle Physics
- Hadronization—the process by which quarks and gluons combine into observable hadrons—remains one of the most complex and least understood aspects of high-energy physics. Traditionally modeled through phenomenological frameworks, this non-perturbative transition introduces major uncertainties in collider simulations. This talk presents a machine learning perspective on the hadronization problem, highlighting recent advances in MLHad, a generative modeling framework for learning data-driven parton-to-hadron mappings. I will also introduce VISTAS, an interactive visualization tool for exploring the internal structure of simulated events, including parton showers and hadron formation. Together, these tools demonstrate how modern ML can enhance the flexibility and precision of simulations, offering new capabilities at the intersection of theory, data, and computation.
- Talk Slides (for IAIFI members only)
Denis Boyda (Meta AI, Former IAIFI Fellow), Emmanouil Theodosis (Grad Student, Harvard), Steven Eulig (Research Scientist, EMD Electronics and Harvard University), organized by the Industry Partnership Committee
- Tuesday, April 1, 2025, 1:00pm–2:00pm, IAIFI Penthouse
- Career Advice Panel
- Three panelists from the IAIFI community will share their experiences in academia and industry and offer advice and tips for pursuing career goals. This will be followed by Q&A with attendees. Lunch and refreshments will be provided!
- Talk Slides (for IAIFI members only)
Dennis Noll, Postdoctoral Researcher, Berkeley Lab
- Tuesday, March 18, 2025, 1:00pm–2:00pm, IAIFI Penthouse
- Machine Learning-Driven Anomaly Detection in Dijet Events with ATLAS
- This talk will explore an anomaly detection search for narrow-width resonances beyond the Standard Model that decay into a pair of jets. Using 139 fb⁻¹ of proton-proton collision data at sqrt(s) = 13 TeV, recorded from 2015 to 2018 with the ATLAS detector at the Large Hadron Collider, we aim to identify new physics without relying on a specific signal model. The analysis employs two machine learning strategies to estimate the background in different signal regions, with weakly supervised classifiers trained to differentiate this background estimate from actual data. We focus on high transverse momentum jets reconstructed as large-radius jets, using their mass and substructure as classifier inputs. After a classifier-based selection, we analyze the invariant mass distribution of the jet pairs for potential local excesses. Our model-independent results indicate no significant local excesses, and we inject a representative set of signal models into the data to evaluate the sensitivity of our methods. This contribution discusses the methods used and the latest results, and highlights the potential of machine learning in enhancing the search for new physics in fundamental particle interactions.
- Talk Slides (for IAIFI members only)
Claudio Battiloro, Postdoctoral Fellow, Harvard T.H. Chan School of Public Health
- Tuesday, March 11, 2025, 1:00pm–2:00pm, IAIFI Penthouse
- E(n) Equivariant Topological Neural Networks
- In this journal club presentation, we review a paper that examines the limitations of conventional graph neural networks (GNNs) in modeling higher-order interactions and introduces topological deep learning (TDL) as a promising alternative. While GNNs excel at modeling pairwise interactions, they struggle to flexibly accommodate arbitrary multi-way, hierarchical interactions and features. TDL addresses this challenge by operating on combinatorial topological spaces, such as simplicial or cell complexes, instead of traditional graphs. However, little is known about how to leverage geometric features—such as positions and velocities—within TDL frameworks. The paper introduces E(n)-Equivariant Topological Neural Networks (ETNNs), which are E(n)-equivariant message-passing networks operating on combinatorial complexes that unify graphs, hypergraphs, simplicial, path, and cell complexes. ETNNs incorporate geometric node features while respecting rotation, reflection, and translation equivariance, and as TDL models, they are naturally suited for settings with heterogeneous interactions. The authors provide a theoretical analysis demonstrating the improved expressiveness of ETNNs over architectures for geometric graphs, and they derive E(n)-equivariant variants of TDL models directly from their framework. The broad applicability of ETNNs is showcased through two tasks: (i) molecular property prediction on the QM9 benchmark and (ii) land-use regression for hyper-local estimation of air pollution using multi-resolution irregular geospatial data. The results indicate that ETNNs are an effective tool for learning from diverse types of richly structured data, matching or surpassing state-of-the-art equivariant TDL models with a significantly smaller computational burden—thus highlighting the benefits of a principled geometric inductive bias. We will discuss these findings and their implications in today’s session.
- Talk Slides (for IAIFI members only)
Liu Ziyin, Postdoctoral Fellow, Research Laboratory of Electronics (MIT) and NTT Research
- Tuesday, March 4, 2025, 1:00pm–2:00pm, IAIFI Penthouse
- Symmetry and Hierarchies of Learning in AI Systems
- The dynamics of learning in modern large AI systems is hierarchical, often characterized by abrupt, qualitative shifts akin to phase transitions observed in physical systems. While these phenomena hold promise for uncovering the mechanisms behind neural networks and language models, existing theories remain fragmented, addressing specific cases. In this paper, we posit that parameter symmetry breaking and restoration serve as a unifying mechanism underlying these behaviors. We synthesize prior observations and show how this mechanism explains three distinct hierarchies in neural networks: learning dynamics, model complexity, and representation formation. By connecting these hierarchies, we highlight symmetry – a cornerstone of theoretical physics – as a potential fundamental principle in modern AI.
- Talk Slides (for IAIFI members only)
Nathan Suri, PhD Researcher, Yale University
- Tuesday, February 18, 2025, 1:00pm–2:00pm, IAIFI Penthouse
- WOTAN: Weakly-supervised Optimal Transport Attention-based Noise Mitigation
- We improve upon the existing literature on pileup mitigation techniques studied at Large Hadron Collider (LHC) experiments for disentangling proton-proton collisions. Pileup presents a salient problem that, if not checked, hinders the search for new physics and Standard Model precision measurements such as jet energy, jet substructure, missing momentum, and lepton isolation. The primary technique that serves as the foundation for this work is known as Training Optimal Transport using Attention Learning (TOTAL). The TOTAL methodology compares matched samples with and without pileup interactions present to robustly learn an accurate description of pileup as a transport function without any need for assumptions of pileup nature derived from simulations. In this work, we develop an improved version of TOTAL known as Weakly-supervised Optimal Transport Attention-based Noise Mitigation (WOTAN) by reducing the degree of TOTAL’s self-supervision. The reduction in self-supervision allows us to demonstrate the power of optimal transport-based pileup mitigation in being able to use data for particle classification instead of solely simulations. Despite its reduced supervision, our work still outperforms existing conventional pileup mitigation approaches by improving the resolution of key observables relevant for both precision measurements and BSM searches in events with pileup interaction counts up to 200. WOTAN is the first fully data-driven machine learning pileup mitigation strategy capable of operating at LHC experiments.
- Talk Slides (for IAIFI members only)
Ezequiel Alvarez, Professor, UNSAM (National University of General San Martín)
- Friday, January 31, 2025, 1:00pm–2:00pm, IAIFI Penthouse
- The Bayesian tack for the Information Frontier at the LHC
- Given the balance between upcoming and available luminosity at the LHC, extracting more information from the data than is currently being extracted has become a pivotal challenge in High Energy Physics. Although Machine Learning (ML) tools are widely adopted to address this challenge, many of these tools rely heavily on learning patterns from simulations. Simulations align well with the data in terms of individual distributions, but—as expected—their performance can be improved when it comes to reproducing correlations. Since Neural Networks are the perfect machines to capture correlations, they may erroneously learn artificial patterns as genuine and subsequently seek them out in real data. The Bayesian approach proposes an alternative set of ML algorithms that primarily leverage recent advancements from the field of Statistical Machine Learning. In this talk, we will explore how Bayesian ML tools and techniques offer a novel and promising avenue for extracting information from data. In this approach, there is a trade-off between simulations and modeling, emphasizing probabilistic modeling within a Bayesian ML framework. We will present the principles and methodologies of this Bayesian craft, highlighting how to unfold a physical system as a probabilistic model to inject relevant prior information in each corresponding latent variable, thereby improving the performance of the model. We will instantiate all the above on the expected advancements in processes such as pp > hh > bbbb, where Bayesian techniques demonstrate significant potential for enhanced results.
- Talk Slides (for IAIFI members only)

Fall 2024

Nate Woodward, Undergraduate Student, MIT
- Tuesday, December 10, 2024, 1:00pm–2:00pm, IAIFI Penthouse
- Product Manifold Machine Learning for Physics
- Physical data are representations of the fundamental laws governing the Universe, hiding complex compositional structures often well captured by hierarchical graphs. Hyperbolic spaces are endowed with a non-Euclidean geometry that naturally embeds those structures. To leverage the benefits of non-Euclidean geometries in representing natural data, we develop machine learning on product manifold spaces, Cartesian products of constant curvature Riemannian manifolds. As a use case we consider the classification of ‘jets’, sprays of hadrons and other subatomic particles produced by the hadronization of quarks and gluons in collider experiments. We compare the performance of PM-MLP and PM-Transformer models across several possible representations. Our experiments show that product manifold representations generally perform as well as or better than fully Euclidean models of similar size, with the most significant gains found for highly hierarchical jets and small models. These results reinforce the view of geometric representation as a key parameter in maximizing both performance and efficiency of machine learning on natural data.
- Talk Slides (for IAIFI members only)
Ibrahim Elsharkawy, Physics PhD Candidate, UIUC
- Tuesday, December 3, 2024, 1:00pm–2:00pm, IAIFI Penthouse
- Uncertainty Quantification from Scaling Laws in Deep Neural Networks
- Deep Learning algorithms, such as Neural Networks, have revolutionized function approximation tasks. However, it is not clear how to quantify neural-network-induced uncertainty on predictions as a function of network architecture, training algorithm, and initialization. We propose, in conjunction with infinite-width networks, to exploit scaling laws - an apparently ubiquitous phenomenon in deep learning where the test loss of a Neural Network (with a task-dependent scaling exponent) follows a power law with training set size. We find a potential and exciting “invariant” in Neural Network Ensemble statistics in both infinite-width and finite-width networks which may be directly useful for Uncertainty Quantification.
- Talk Slides (for IAIFI members only)
Mike Toomey, Postdoctoral Fellow, MIT
- Tuesday, November 26, 2024, 1:00pm–2:00pm, IAIFI Penthouse
- Learning Theory-Informed Priors for Bayesian Inference: A Case Study with Early Dark Energy
- Cosmological models are often formulated in the language of particle physics, using quantities like the axion decay constant, but tested against data using physical quantities such as energy density ratios, with uniform priors assumed on these quantities. This standard approach overlooks important theory-driven priors, including constraints from fundamental physics, like particle physics and string theory, which often favor sub-Planckian axion decay constants. In this talk, I will present a novel method for learning theory-informed priors for Bayesian inference using normalizing flows (NFs), a flexible generative machine learning technique. NFs allow us to generate priors on model parameters in cases where analytic expressions are unavailable or difficult to compute. I’ll demonstrate this technique with an application to early dark energy (EDE), a model that has gained attention in the context of the Hubble tension. First, I’ll validate our NF-based approach using the limited theory-based constraints available for EDE, and then, leveraging the computational efficiency of NFs, I’ll showcase how we achieve some of the most stringent constraints on EDE when incorporating large-scale structure likelihoods. This talk will highlight the versatility of NFs in Bayesian inference in cosmology (and beyond) and how NFs and other generative machine learning techniques can help bridge the gap between theoretical models and data analysis.
- Talk Slides (for IAIFI members only)
Neill Warrington, Postdoc, MIT
- Tuesday, November 19, 2024, 1:00pm–2:00pm, IAIFI Penthouse
- Opportunities for QFT and optimization methods in superconducting quantum devices
- I will present recent work applying methods from quantum field theory and numerical optimization to superconducting quantum devices. I will argue that tools more typically used in particle physics - Feynman diagrams, path integrals, and quantum field theory - are useful for making better qubits.
- Talk Slides (for IAIFI members only)
Thomas Helfer, Research Fellow, Institute for Advanced Computational Science at Stony Brook
- Tuesday, November 12, 2024, 1:00pm–2:00pm, IAIFI Penthouse
- Super-Resolution without High-Resolution Labels for Black Hole Simulations
- Generating high-resolution simulations is key for advancing our understanding of one of the universe’s most violent events: Black Hole mergers. However, generating Black Hole simulations is limited by prohibitive computational costs and scalability issues, reducing the simulation’s fidelity and resolution achievable within reasonable time frames and resources. In this work, we introduce a novel method that circumvents these limitations by applying a super-resolution technique without directly needing high-resolution labels, leveraging the Hamiltonian and momentum constraints—fundamental equations in general relativity that govern the dynamics of spacetime. We demonstrate that our method achieves a reduction in constraint violation by one to two orders of magnitude and generalizes effectively to out-of-distribution simulations.
- Talk Slides (for IAIFI members only)
Konstantin Leyde, Research Fellow, University of Portsmouth
- Tuesday, October 29, 2024, 1:00pm–2:00pm, IAIFI Penthouse
- Gravitational wave populations and cosmology with neural posterior estimation
- Compact binary coalescences that are observed through gravitational waves (GWs) provide an independent probe to constrain the current expansion rate of the Universe, H0. In addition to the information on the luminosity distance that is directly provided from the GW, redshift information is also needed for the H0 measurement. All GW events observed thus far (with the exception of the binary neutron star merger GW170817) do not have electromagnetic counterparts, and other methods are needed to provide this redshift information. I will summarize a method that uses the (redshifted) source frame mass distribution, also known as the mass spectrum method. Since this approach requires no additional electromagnetic redshift, the mass spectrum method allows one to constrain H0 from all GW sources observed so far. In my talk, I will summarize how one can apply neural posterior estimation for fast and accurate hierarchical Bayesian inference of gravitational wave populations. We use a normalizing flow to directly estimate the population hyper-parameters from a collection of individual source observations. This approach provides complete freedom in event representation and automatic inclusion of selection effects, and (in contrast to likelihood estimation) it does not need stochastic samplers to obtain posterior samples. Since the number of events may be unknown when the network is trained, we split our analysis into sub-population analyses that we later recombine; this allows for fast sequential analyses as additional events are observed. We demonstrate our method on a toy problem within the mass spectrum method described above, and show that inference takes just a few minutes and scales to 600 events before performance degrades. I will argue that neural posterior estimation therefore represents a promising avenue for population inference with large numbers of GW events.
- Talk Slides (for IAIFI members only)
Kayla DeHolton, Penn State University
- Tuesday, October 22, 2024, 1:00pm–2:00pm, IAIFI Penthouse
- Graph Neural Networks for GeV neutrinos in IceCube
- The IceCube Neutrino Observatory, located at the South Pole, is a multi-purpose detector for particle physics and multimessenger astrophysics. It has been collecting data for more than a decade, and in recent years several major discoveries and measurements have utilized neural networks. The convolutional neural network frameworks used in these analyses are limited by the irregular geometry of the detector and the sparsity of detected photons in GeV-scale neutrino events. These limitations will become even more difficult to work around with future detectors like the IceCube Upgrade, a low-energy extension to be constructed in the 2025-2026 Antarctic season. New efforts with graph neural networks have already shown promising improvements, unlocking new opportunities for future analyses and paving the way for exciting new discoveries.
- Talk Slides (for IAIFI members only)
Keiya Hirashima, University of Tokyo
- Tuesday, October 15, 2024, 1:00pm–2:00pm, IAIFI Penthouse
- Surrogate Modeling for Supernova Feedback toward Star-by-star Simulations of Milky-Way-sized Galaxies
- In recent decades, improvements in galaxy simulations have revealed the interdependence of multiscale gas physics, such as star formation and feedback processes. Still, so-called sub-grid models have been widely used due to limited resolution and scalability. Even in zoom-in simulations, the mass resolution is capped at around 1,000 solar masses. To address this, we are developing the ASURA-FDPS code to leverage exascale computing for simulating individual stars and feedback in galaxies. Challenges in scalability arise from localized short-timescale events like supernovae (SNe). To overcome this, we developed a machine-learning-based surrogate model that predicts SNe feedback 100,000 years ahead, reducing the computational cost to 1% of direct resolution. This presentation discusses the fidelity and progress of high-resolution galaxy simulations using this approach.
- Talk Slides (for IAIFI members only)
Felix Yu, Graduate Student, Harvard University
- Tuesday, October 8, 2024, 1:00pm–2:00pm, IAIFI Penthouse
- Learning Representations & Super-Resolution of Neutrino Telescope Events
- Neutrino telescopes detect rare interactions of particles produced in some of the most extreme environments in the Universe. This is accomplished by instrumenting a cubic-kilometer volume of naturally occurring transparent medium with light sensors. Given their substantial size and the high frequency of background interactions, these telescopes amass an enormous quantity of large variance, high-dimensional data. These attributes create substantial challenges for analyzing and reconstructing interactions, particularly when utilizing machine learning (ML) techniques. In this talk, I will present a novel approach that employs transformer-based variational autoencoders to efficiently represent neutrino telescope events by learning compact and descriptive latent representations. I will also talk about some potential applications which can take advantage of these more flexible and efficient representations, including super-resolving neutrino telescope events to enhance reconstruction performance.
- Talk Slides (for IAIFI members only)
Kit Fraser-Taliente, Graduate Student, University of Oxford
- Tuesday, October 1, 2024, 1:00pm–2:00pm, IAIFI Penthouse
- Computation of Quark Masses in String Theory
- We present a numerical computation, based on neural network techniques, of the physical Yukawa couplings in a heterotic string theory model obtained after compactification on a Calabi-Yau threefold. I consider examples from a large class of models with precisely the MSSM low-energy spectrum, plus fields uncharged under the standard-model group. Suitable neural networks are used to compute the relevant quantities. I will discuss the general problem of learning functions on manifolds, equivariant neural networks, and generalisation to other models and constructions.
- Talk Slides (for IAIFI members only)
Rikab Gambhir, Graduate Student, MIT
- Tuesday, September 24, 2024, 1:00pm–2:00pm, IAIFI Penthouse
- Moments Of Clarity in Machine Learning for Jet Physics
- Machine learning models have shown incredible promise for science, especially for physics at the Large Hadron Collider (LHC), through their ability to extract information from huge amounts of data. However, as physicists, we often desire to have precise control of the information input and output of a model, both to improve interpretability and to guarantee properties of interest in our problems. In this talk, I go over three different examples from my work in jet physics at the LHC where targeted and goal-motivated model design and loss function choice can be used to control the extracted information in machine learning models. In particular, I discuss how task-engineered network architectures and losses can be used to extract provably prior-independent and unbiased resolutions for calibrations at the LHC, how they can be used to construct a new class of robust observables for jets, and how they can be used to streamline latent spaces using elementary functions for interpretability.
- Talk Slides (for IAIFI members only)

Spring 2024

Radha Mastandrea, Grad Student, University of California, Berkeley
- Tuesday, April 30, 2024, 11:00am–12:00pm, MIT LNS Conference Room (26-528)
- A Survey of Machine Learning Methods for Anomaly Detection
- Machine learning-based anomaly detection (AD) methods are promising tools for extending the coverage of searches for physics beyond the Standard Model (BSM). I will first talk about a class of AD methods for “resonant anomaly detection”, where the BSM is assumed to be localized in at least one known variable. There have been many methods proposed to identify such a BSM signal that make use of simulated or detected data in different ways, so I will discuss their complementarity – even if their maximum performance is the same, it may be beneficial more generally to combine approaches. I will then go over a class of AD methods for “nonresonant” detection, where the BSM may arise from off-shell effects or final states with significant missing energy. Using a semi-visible jet signature as a benchmark signal model, I will show that these methods can automatically identify anomalous events, elevating rare nonresonant signal models to the detection threshold.
- Talk Slides (for IAIFI members only)
Akshunna Dogra, Graduate Student, Imperial College London
- Tuesday, April 23, 2024, 11:00am–12:00pm, MIT LNS Conference Room (26-528)
- Many-Fold Learning
- Machine learning (ML) has been profitably leveraged across a wide variety of problems in recent years. Empirical observations show that ML models from suitable functional spaces are capable of adequately efficient learning across a wide variety of disciplines. In this work (first in a planned sequence of three), we build the foundations for a generic perspective on ML model optimization and generalization dynamics. Specifically, we prove that under variants of gradient descent, “well-initialized” models solve sufficiently well-posed problems at a priori or in situ determinable rates. Notably, these results are obtained for a wider class of problems, loss functions, and models than the standard mean squared error and large width regime that is the focus of conventional Neural Tangent Kernel (NTK) analysis. The $\nu$-Tangent Kernel ($\nu$TK), a functional analytic object reminiscent of the NTK, emerges naturally as a key object in our analysis and its properties function as the control for learning. We exemplify the power of our proposed perspective by showing that it applies to diverse practical problems solved using real ML models, such as classification tasks, data/regression fitting, differential equations, shape observable analysis, etc. We end with a small discussion of the numerical evidence, and the role $\nu$TKs may play in characterizing the search phase of optimization, which leads to the “well-initialized” models that are the crux of this work.
- Talk Slides (for IAIFI members only)
Marisa LaFleur, Project Manager, IAIFI
- Tuesday, April 2, 2024, 11:00am–12:00pm, MIT LNS Conference Room (26-528)
- Managing Time and Influencing People
- Taking a break from our regularly scheduled journal club programming, the Industry Partnership Committee has requested a crash course in project management for academics. I’ll share some time management and communication tips and tricks to elevate your project management skills and increase efficiency, leaving more time for research. We’ll leave time for questions, so come with all of your organizational concerns!
- Talk Slides (for IAIFI members only)
Katherine Fraser, Graduate Student, Harvard University
- Tuesday, March 19, 2024, 11:00am–12:00pm, MIT LNS Conference Room (26-528)
- Combining Energy Correlators with Machine Learning
- Energy correlators, which are correlation functions of the energy flow operator, are theoretically clean observables that can be used to improve various measurements. In this talk, we discuss ongoing work exploring the benefits of combining them with machine learning for precisely measuring the top-quark mass.
- Talk Slides (for IAIFI members only)
Kehang Zhu, Grad Student, Harvard
- Tuesday, March 12, 2024, 11:00am–12:00pm, MIT LNS Conference Room (26-528)
- Agent-based modeling: Harnessing Large Language Models for Automated Exploration of Emergent Behaviors in Simulated Social Systems
- Two significant impediments to the success of the social sciences in comparison to physics are the inherent difficulty in both rapidly executing multiple controlled experiments to explore a parameter space and determining what parameter space to explore. In this work, we present a computational framework and platform that simulates the entire social scientific process, leveraging Large Language Models (LLMs) to study human actors within social systems. We create controlled environments, akin to toy models in physics, that systematically explore the parameter space of variables relevant to any social system (such as attributes of human actors), allowing for the exponentially faster discovery of emergent social behaviors as compared to traditional social science experimentation. Central to our approach is the automatic generation of Structural Causal Models (SCMs) that generate statistical correlations of potential interactions within a social system and outline the requisite metrics and tools to observe and measure these nonlinear dynamics. With the flexibility to vary controlled variables across a nearly infinite parameter space, our system offers a sandbox to simulate and analyze various social scenarios – from wage bargaining and auction mechanics to nuclear weapon negotiations. Our framework and platform offer a new playground for physicists to study the nonlinear dynamics and emergent phenomena in human social systems.
- Talk Slides (for IAIFI members only)
Jonas Rigo, Postdoc, Forschungszentrum Jülich GmbH
- Tuesday, February 27, 2024, 11:00am–12:00pm, MIT LNS Conference Room (26-528)
- Is the ground state of Anderson’s impurity model a recurrent neural network?
- When the Anderson impurity model (AIM) is expressed in terms of a Wilson chain it assumes a hierarchical Renormalization group structure that translates to a ground state with features like Friedel oscillations and the Kondo screening cloud. Recurrent neural networks (RNNs) have recently gained traction in the form of Neural Quantum States (NQS) ansätze for quantum many body ground states and they are known to be able to learn such complex patterns. We explore RNNs as an ansatz to capture the AIM’s ground state for a given Wilson chain length and investigate its capability to predict the ground state on longer chains for a converged ground state energy.
- Talk Slides (for IAIFI members only)
Darius Faroughy, Postdoctoral Associate, Rutgers University
- Tuesday, February 20, 2024, 11:00am–12:00pm, MIT LNS Conference Room (26-528)
- Is flow-matching an alternative to diffusion?
- We discuss flow-matching (2210.02747), a recently proposed objective for training continuous normalizing flows inspired by diffusion models. As a generative model, flow-matching can produce state-of-the-art samples for images and other data representations. More interestingly, flow-matching can be used to go beyond generative modeling by learning to approximate the optimal transport map between two arbitrary data distributions. The JC is meant to be an interactive blackboard talk discussing the method. At the end, I’ll flash a few slides illustrating its usefulness for generating jets as particle clouds (2310.00049).
Helen Qu, Grad Student, University of Pennsylvania
- Tuesday, February 6, 2024, 11:00am–12:00pm, MIT LNS Conference Room (26-528)
- Enabling precision photometric SN Ia cosmology with machine learning
- The discovery of the accelerating expansion of the universe has led to increasing interest in probing the nature of dark energy. As very bright standardizable candles, type Ia supernovae (SNe Ia) are used to measure precise distances on cosmological scales and thus have been instrumental to this effort. Building a robust dataset of SNe Ia across a wide range of redshifts will allow for the construction of an accurate Hubble diagram, enrich our understanding of the expansion history of the universe, as well as place constraints on the dark energy equation of state. However, much of our analysis pipeline will be overwhelmed by the data deluge of the LSST era. In this talk, I will present recent improvements on two key pieces of SN Ia cosmology analysis: the purity of the photometric SNe Ia sample and the redshift identification accuracy for these SNe. To address the SNe Ia purity problem, I will present SCONE (Supernova Classification with a Convolutional Neural Network), a deep learning-based approach to early and full lightcurve photometric SN classification. On the redshift estimation front, I will present work on characterizing inaccurate redshifts due to SN host galaxy mismatch and its effect on cosmology, as well as Photo-zSNthesis, a machine learning algorithm that uses SN photometry to directly estimate redshift. As long as logistical challenges prevent the spectroscopic follow-up of most detected SNe, a reliable photometric SN classification algorithm and redshift estimation strategy will allow us to tap into the vast potential of the photometric dataset.
- Talk Slides (for IAIFI members only)
Alex Malz, LINCC Frameworks Project Scientist, Carnegie Mellon University
- Tuesday, January 16, 2024, 11:00am–12:00pm, MIT LNS Conference Room (26-528)
- Data processing challenges for real-time observational astrophysics
- Astronomical transient and variable events comprise the things that go boom in the night, or otherwise vary in brightness or color over time, and are among the most powerful phenomena of the universe, providing a window into energy scales inaccessible to any laboratory on Earth. The fundamental physics determining the time-series light curves of these astronomical objects, which include exploding stars and black hole mergers, is key to understanding the nature of the dark energy driving the accelerating expansion of the universe, the dark matter guiding the formation and clustering of massive structures, and ultimately our place in the cosmos. During its ten-year mission beginning in 2025, the Legacy Survey of Space and Time (LSST) on the Vera C. Rubin Observatory will observe hundreds of millions of such transient and variable sources, up from the mere millions known to date, by making a ten-year 3D movie of the night sky. In doing so, it will revolutionize astronomy with a deluge of data that could enable boundless discoveries, conditioned on meeting the challenges of the data’s nontrivial noise properties; the scale of the anticipated data is a direct corollary to the strategy of collecting less informative photometric data rather than high-fidelity, resource-intensive spectroscopy. In this talk, I will introduce open problems and evolving data-driven solutions for several interesting aspects of the systems for processing and interpreting the anticipated data.
- Talk Slides (for IAIFI members only)

Fall 2023

Neill Warrington, Postdoc, MIT
- Tuesday, November 28, 2023, 11:00am–12:00pm, MIT LNS Conference Room (26-528)
- Thimbology and The Sign Problem
- I will talk about thimbology, a technique for taming sign problems in lattice field theory, where the domain of integration of the path integral is deformed into complex field space. Machine learning contours proves useful for certain problems and is now a common technique. I’ll review the idea for a general audience, then share some recent results.
- Talk Slides (for IAIFI members only)
Zeviel Imani, Graduate Student, Tufts
- Tuesday, November 14, 2023, 11:00am–12:00pm, MIT LNS Conference Room (26-528)
- Score-based Diffusion Models for Generating LArTPC Images
- Modern generative modeling has demonstrated remarkable success in the realm of natural images. However, these approaches do not necessarily generalize to all image domains. In neutrino physics experiments, our Liquid Argon Time Projection Chamber (LArTPC) particle detectors produce images that are globally sparse but locally dense. We have found that some generation algorithms, such as GANs and VQ-VAE, are unable to reproduce these image characteristics. Recently, we have successfully generated high-fidelity images of track and shower particle event types using a score-based diffusion model. In this talk, I will outline the methodology underlying this type of model, explore our quality metrics for these generated images, and discuss planned extensions and applications of this work.
- Talk Slides (for IAIFI members only)
Jose Miguel Munoz Arias, Graduate Student, MIT
- Tuesday, November 7, 2023, 11:00am–12:00pm, MIT LNS Conference Room (26-528)
- lie-nn: Pioneering lie G-equivariant Neural Networks for Cross-Domain Scientific Applications
- This talk explores a novel Equivariant Neural Network architecture that respects symmetries of finite-dimensional representations of any reductive Lie Group G. These groups span several scientific domains, from high energy physics to computer vision. We extend ACE and MACE frameworks to data equivariant to a reductive Lie group action. We present lie-nn, a software library for building G-equivariant neural networks that simplifies the application to varied problems by decomposing tensor products into irreducible representations. We illustrate the adaptability and effectiveness of our approach with top quark decay tagging and shape recognition applications. We demonstrate that acknowledging these symmetries can boost prediction accuracy while using less training data. Our study represents a significant step towards generating interactive representations of geometric point clouds, offering a fresh problem-solving framework across scientific fields.
- Talk Slides (for IAIFI members only)
Thorsten Glüsenkamp, Postdoc, Uppsala University
- Tuesday, October 31, 2023, 11:00am–12:00pm, MIT LNS Conference Room (26-528)
- Conditional normalizing flows for IceCube event reconstruction
- In this seminar, I will talk about normalizing flows (NFs), in particular about the types that are useful for high-energy neutrino event reconstruction in IceCube. First, I will give an introduction that focuses on essentially two different classes of flows which have quite a citation disparity in the literature: 1) normalizing flows in high dimensions (D>~100), which typically have high citation counts, and 2) normalizing flows in low dimensions (D = 1 - 100), which are typically cited less frequently. I discuss the reasons why I think this latter class, which is often less known, is in particular useful for high-energy physicists, and then briefly review two examples of that class: specific Gaussianization flows (2003.01941), and exponential-map flows (0906.0874/2002.02428). Finally, I discuss a recent application of these particular flows as conditional NFs for neutrino event reconstruction in the IceCube detector (2309.16380).
- Talk Slides (for IAIFI members only)
Ryan Raikman, Undergraduate, Carnegie Mellon University (currently working with LIGO), MIT LIGO
- Tuesday, October 24, 2023, 11:00am–12:00pm, MIT LNS Conference Room (26-528)
- GWAK: Gravitational Wave Anomalous Knowledge with Recurrent Autoencoders
- Matched-filtering detection techniques for gravitational-wave (GW) signals in ground-based interferometers rely on having well-modeled templates of the GW emission. Such techniques have been traditionally used in searches for compact binary coalescences (CBCs), and have been employed in all known GW detections so far. However, interesting science cases aside from compact mergers do not yet have accurate enough modeling to make matched filtering possible, including core-collapse supernovae and sources where stochasticity may be involved. Therefore the development of techniques to identify sources of these types is of significant interest. In this paper, we present a method of anomaly detection based on deep recurrent autoencoders to enhance the search region to unmodeled transients. We use a semi-supervised strategy that we name Gravitational Wave Anomalous Knowledge (GWAK). While the semi-supervised nature of the problem comes with a cost in terms of accuracy as compared to supervised techniques, there is a qualitative advantage in generalizing experimental sensitivity beyond pre-computed signal templates. We construct a low-dimensional embedded space using the GWAK method, capturing the physical signatures of distinct signals on each axis of the space. By introducing signal priors that capture some of the salient features of GW signals, we allow for the recovery of sensitivity even when an unmodeled anomaly is encountered. We show that regions of the GWAK space can identify CBCs, detector glitches and also a variety of unmodeled astrophysical sources.
- Talk Slides (for IAIFI members only)
Andy Jin, Graduate Student, Harvard University
- Tuesday, October 17, 2023, 11:00am–12:00pm, MIT LNS Conference Room (26-528)
- Two Watts is All You Need: Low-Power Machine Learning on TPU for Neutrino Telescopes
- In neutrino experiments, we have seen machine learning software methods boost our capacity for physics discovery given the experimental hardware setups. Currently, we face upgrades, new telescopes, and new experimental hardware, which are expected to deliver more statistics as well as more complicated data signals. This calls for an upgrade on the software side as well, to handle the more complex data more efficiently. Specifically, we need low-power and fast software methods in order to achieve real-time signal processing, where current machine learning-based methods are too expensive to be deployed in the commonly power-restricted regions where these experiments are located. In this talk, I will present the first attempt at and a proof of concept for enabling machine learning methods to be deployed live in under water/ice neutrino telescopes via quantization and deployment on Tensor Processing Units (TPUs). We use an LSTM-based recursive neural network with residual convolution-based data encoding, combined with specifically tailored data pre-processing and quantization aware training methods for deployment on the Google Edge TPU. This algorithm can achieve state-of-the-art angular resolution in reconstruction with a real-time inference frequency of 100 Hz/Watts in a TPU accelerator at only 2 Watts of power consumption. This opens up a world of chances to integrate machine learning capacity into detectors and electronics deep into even the most power-restricted environments.
- Talk Slides (for IAIFI members only)
Manos Theodosis, Graduate Student, Harvard
- Tuesday, October 3, 2023, 11:00am–12:00pm, MIT LNS Conference Room (26-528)
- Learning Group Representations in Neural Networks
- Employing equivariance in neural networks leads to greater parameter efficiency via parameter sharing and improved generalization performance through the encoding of domain knowledge in the architecture; however, the majority of existing approaches require an a priori specification of the data symmetries. We present a neural network architecture, Group Representation Networks (GRNs), that learns symmetries on the weight space of neural networks without any supervision or knowledge of the hidden symmetries in the data. Beyond their interpretability, GRNs’ learned representations distill symmetries of the data domain and the downstream task, which are incorporated when training networks on different datasets. The key idea behind GRNs relates weights in neural networks via a cyclic action whose group representation depends on the data domain, and is learned in an unsupervised manner. Our experiments underline the ability of GRNs to correctly recover symmetries in the data, show competitive performance when GRNs are used as a drop-in replacement for conventional layers, and highlight the ability to transfer learned representations across tasks and datasets.
- Talk Slides (for IAIFI members only)
Jeffrey Lazar, Graduate Student, Harvard
- Tuesday, September 26, 2023, 11:00am–12:00pm, MIT LNS Conference Room (26-528)
- Open-Source Simulation and Machine Learning for Neutrino Telescopes
- In the last decade, the field of neutrino astronomy has made major strides, culminating in the definitive detection of galactic and extragalactic components of the astrophysical neutrino flux. We can now begin characterizing these astrophysical beams and pursuing new physics through them. Machine learning techniques have played an integral part in these recent advances, and while these current efforts have been impressive, it is clear that there is room to improve. This fact, along with the growing, global network of neutrino telescopes, drives the need for open-source tools to use all person power and avoid duplicating effort. In this talk I will present Prometheus, the first open-source, end-to-end simulation for neutrino telescopes. Furthermore, I will show a recent example of using Prometheus to develop machine learning techniques capable of running at typical neutrino telescope trigger rates.
- Talk Slides (for IAIFI members only)
Tony Menzo, Graduate Assistant, University of Cincinnati
- Tuesday, September 19, 2023, 11:00am–12:00pm, MIT LNS Conference Room (26-528)
- Towards a data-driven model of hadronization
- We will discuss recent and ongoing developments at the intersection of machine learning and simulated hadronization. Specifically, we’ll focus on some of the major challenges presented when attempting to build a data-driven model of hadronization that utilizes experimental data during training. Solutions to some of these challenges will be presented in the context of invertible neural networks or normalizing flows including the introduction of a new paradigm that allows for the training of microscopic hadronization dynamics from macroscopic event-level observables.
- Talk Slides (for IAIFI members only)

Spring 2023

Ziming Liu, Grad Student, MIT
- Tuesday, April 25, 2023, 11:00am–12:00pm, MIT LNS Conference Room (26-528)
- Physics-inspired generative models
- It might be surprising and delightful to physicists that physics has been playing a huge role in diffusion models. In fact, the evolution of our physical world can be viewed as a generation process. In this journal club, I will first review diffusion models and the more recent PFGM/PFGM++ inspired by electrostatics, and then introduce the GenPhys framework, which manages to convert even more physical processes to generative models.
- Talk Slides (for IAIFI members only)
Di Luo, Postdoctoral Fellow, IAIFI
- Tuesday, April 25, 2023, 11:00am–12:00pm, MIT LNS Conference Room (26-528)
- Multi-legged Robot Locomotion via Spin Models Duality
- Contact planning is crucial in locomoting systems. Specifically, appropriate contact planning can enable versatile behaviors (e.g., sidewinding in limbless locomotors) and facilitate speed-dependent gait transitions (e.g., walk-trot-gallop in quadrupedal locomotors). The challenges of contact planning include determining not only the sequence by which contact is made and broken between the locomotor and the environment, but also the sequence of internal shape changes (e.g., body bending and limb shoulder joint oscillation). Most state-of-the-art contact planning algorithms focus on conventional robots (e.g., biped and quadruped) and conventional tasks (e.g., forward locomotion), and there is a lack of study on general contact planning in multi-legged robots. In this talk, I will discuss how, using a geometric mechanics framework, we can obtain the globally optimal contact sequence given the sequence of internal shape changes. Therefore, we simplify the contact planning problem to a graph optimization problem to identify the internal shape changes. Taking advantage of the spatio-temporal symmetry in locomotion, we map the graph optimization problem to special cases of spin models, which allows us to obtain the global optima in polynomial time. We apply our approach to develop new forward and sidewinding behaviors in a hexapod and a 12-legged centipede. We verify our predictions using numerical and robophysical models, and obtain novel and effective locomotion behaviors.
- Talk Slides (for IAIFI members only)
Asem Wardak, Research Fellow, Harvard
- Tuesday, April 11, 2023, 11:00am–12:00pm, MIT LNS Conference Room (26-528)
- Extended Anderson Criticality in Heavy-Tailed Neural Networks
- This talk focuses on nonlinearly interacting systems with random, heavy-tailed connectivity. We show how heavy-tailed connectivity gives rise to an extended critical regime of spatially multifractal fluctuations between the quiescent and active phases. This phase differs from the edge of chaos in classical networks by the appearance of universal hallmarks of the Anderson transition in condensed matter physics over an extended region in phase space. We then investigate some consequences of the multifractal Anderson regime for performing persistent computations.
- Talk Slides (for IAIFI members only)
Joshua Villarreal, Grad Student, MIT
- Tuesday, April 4, 2023, 11:00am–12:00pm, MIT LNS Conference Room (26-528)
- Surrogate Modeling of Particle Accelerators
- The design, construction, and fine-tuning of particle accelerators has never been easy. Each is a technical challenge in and of itself, and the need to repeatedly run accurate, high-fidelity simulations of the beam traversing the device can slow development. This is especially true for many modern-day particle accelerators, whose beam dynamics tend to observe more nonlinear effects like those arising from space charge, making their simulation more computationally expensive. Thus, there is demand to develop machine learning and statistical learning models that can reproduce these beam dynamic simulations with orders-of-magnitude improvements in runtime. In this talk, I present an overview of recent efforts to build such accelerator surrogate models, which can be used for the design optimization and real-time commissioning, tuning, and running of the accelerator they aim to replicate. As an example, I also present the status of IsoDAR’s work to build a surrogate model for a Radio-Frequency Quadrupole accelerator, a vital component to IsoDAR’s groundbreaking design. I outline challenges of these and other virtual accelerators, and present future plans to make these surrogate models ubiquitous in future development of accelerator experiments of all kinds.
- Talk Slides (for IAIFI members only)
Daniel Murnane, Postdoc Researcher, Berkeley Lab
- Tuesday, March 21, 2023, 11:00am–12:00pm, MIT LNS Conference Room (26-528)
- Multi-Tasking ML for Point Clouds at the LHC
- The Large Hadron Collider is one of the world’s most data-intensive experiments. Every second, millions of collisions are processed, each one resembling a jigsaw puzzle with thousands of pieces. With the upcoming upgrade to the High Luminosity LHC, this problem will only become more complex. To make sense of this data, deep learning techniques are increasingly being used. For example, graph neural networks and transformers have proven effective at handling point cloud tasks such as track reconstruction and jet tagging. In this talk, I will review the point cloud problems in collider physics and recent deep learning solutions investigated by the Exatrkx project - an initiative to implement innovative algorithms for HEP at exascale. These architectures can accurately perform tracking and tagging with low latency, even in the high luminosity regime. Additionally, I will explore how multi-tasking and multi-modal networks can combine several of these different tasks.
- Talk Slides (for IAIFI members only)
Manuel Szewc, Postdoc, University of Cincinnati
- Tuesday, March 14, 2023, 11:00am–12:00pm, MIT LNS Conference Room (26-528)
- Modeling Hadronization with Machine Learning
- A fundamental part of event generation, hadronization is currently simulated with the help of fine-tuned empirical models. In this talk, I’ll present MLHAD, a proposed alternative for hadronization where the empirical model is replaced by a surrogate Machine Learning-based model to be ultimately data-trainable. I’ll detail the current stage of development and discuss possible ways forward.
- Talk Slides (for IAIFI members only)
Max Tegmark, Professor, MIT
- Tuesday, February 28, 2023, 11:00am–12:00pm, MIT LNS Conference Room (26-528)
- Mechanistic interpretability
- Mechanistic interpretability aims to reverse-engineer trained neural networks to distill out the algorithms they have discovered for performing various tasks. Although such ‘artificial neuroscience’ is hard and fun, it’s easier than conventional neuroscience since you have complete knowledge of what every neuron and synapse is doing.
- Talk Slides (for IAIFI members only)
Liping Liu, Assistant Professor, Tufts University
- Tuesday, February 14, 2023, 11:00am–12:00pm, MIT LNS Conference Room (26-528)
- Address combinatorial graph problems with learning methods
- There are plenty of hard combinatorial problems defined on graphs. Recently learning algorithms have been used to speed up the search for approximate solutions to these problems. This talk will start with an introduction to hard problems on graphs and traditional algorithms, then it will give an overview of learning algorithms for solving combinatorial problems on graphs. The second part of the talk will focus on two specific problems, graph matching and subgraph distance calculation, and discuss neural methods for these two problems. Finally, it will conclude with open questions: why and when can neural networks help to solve combinatorial problems?
- Talk Slides (for IAIFI members only)

Fall 2022

Anna Golubeva, IAIFI Fellow and Matt Schwartz, Professor, Harvard
- Tuesday, November 29, 2022, 11:00am–12:00pm, MIT LNS Conference Room (26-528)
- Should artificial intelligence be interpretable to humans?
- Talk Slides (for IAIFI members only)
Michael Toomey, PhD Student, Brown University
- Tuesday, November 15, 2022, 11:00am–12:00pm, MIT LNS Conference Room (26-528)
- Deep Learning the Dark Sector
- One of the most pressing questions in physics today is the microphysical origin of dark matter. While there have been numerous experimental programs aimed at detecting its interactions with the Standard Model, all efforts to date have come up empty. An alternative method to constrain dark matter is purely based on its gravitational interactions. In particular, gravitational lensing can be very sensitive to the distribution and morphology of dark matter substructure which can vary appreciably between different models. However, the complexity of data sets, systematics, and large volumes of data make the dimensionality of this problem difficult to approach from more traditional methods. Thankfully, this is a task ideally suited for machine learning. In this talk we will demonstrate how machine learning will play a critical role in distinguishing between models of dark matter and constraining model parameters in lensing data. We will additionally discuss techniques unique to ML for transferring the knowledge accumulated by models in the controlled setting of simulation to real data sets utilizing unsupervised domain adaptation.
- Talk Slides (for IAIFI members only)
Ziming Liu, PhD Student, MIT
- Tuesday, November 8, 2022, 11:00am–12:00pm, MIT LNS Conference Room (26-528)
- Toy Models of Superposition
- It would be very convenient if the individual neurons of artificial neural networks corresponded to cleanly interpretable features of the input. For example, in an “ideal” ImageNet classifier, each neuron would fire only in the presence of a specific visual feature, such as the color red, a left-facing curve, or a dog snout. But it isn’t always the case that features correspond so cleanly to neurons, especially in large language models where it actually seems rare for neurons to correspond to clean features. I will present a recent paper ‘Toy Models of Superposition’ from Anthropic, aiming to answer these questions: Why is it that neurons sometimes align with features and sometimes don’t? Why do some models and tasks have many of these clean neurons, while they’re vanishingly rare in others?
- Talk Slides (for IAIFI members only)
Sona Najafi, Researcher, IBM
- Tuesday, October 25, 2022, 11:00am–12:00pm, MIT LNS Conference Room (26-528)
- Quantum machine learning from algorithms to hardware
- The rapid progress of technology over the past few decades has led to the emergence of two powerful computational paradigms known as quantum computing and machine learning. While machine learning tries to learn the solutions from data, quantum computing harnesses the quantum laws for more powerful computation compared to classical computers. In this talk, I will discuss three domains of quantum machine learning, each harnessing a particular aspect of quantum computers and targeting specific problems. The first domain scrutinizes the power of quantum computers to work with high-dimensional data and speed up algebra, but raises the caveat of input/output due to the quantum measurement rules. The second domain circumvents this problem by using a hybrid architecture, performing optimization on a classical computer while evaluating parameterized states on a quantum circuit, chosen based on a particular issue. Finally, the third domain is inspired by brain-like computation and uses a given quantum system’s natural interaction and unitary dynamics as a source for learning.
Kim Nicoli, Grad Student, Technical University of Berlin
- Tuesday, October 18, 2022, 11:00am–12:00pm, MIT LNS Conference Room (26-528)
- Deep Learning approaches in lattice quantum field theory: recent advances and future challenges
- Normalizing flows are deep generative models that leverage the change-of-variables formula to map simple base densities to arbitrarily complex target distributions. Recent works have shown the potential of such methods in learning normalized Boltzmann densities in many fields ranging from condensed matter physics to molecular science to lattice field theory. Though sampling from a flow-based density comes with many advantages over standard MCMC sampling, it is known that these methods still suffer from several limitations. In my talk, I will start by giving an overview of how to deploy deep generative models to learn Boltzmann densities in the context of a phi^4 lattice field theory. Specifically, I’ll focus on how these methods open up the possibility to estimate thermodynamic observables, i.e., physical observables which depend on the partition function and hence are not straightforward to estimate using standard MCMC methods. In the second part of my talk, I will present two ideas that have been proposed to mitigate the well-known problem of mode-collapse which often occurs when normalizing flows are trained to learn a multimodal target density. More specifically, I’ll talk about a novel “mode-dropping estimator” and path gradients. In the last part of my talk, I’ll present a new idea which aims at using flow-based methods to mitigate the sign problem.
- Talk Slides (for IAIFI members only)
Adriana Dropulic, Grad Student, Princeton
- Tuesday, October 4, 2022, 11:00am–12:00pm, MIT LNS Conference Room (26-528)
- Machine Learning the 6th Dimension: Stellar Radial Velocities from 5D Phase-Space Correlations
- The Gaia satellite will observe the positions and velocities of over a billion Milky Way stars. In the early data releases, most observed stars do not have complete 6D phase-space information. We demonstrate the ability to infer the missing line-of-sight velocities until more spectroscopic observations become available. We utilize a novel neural network architecture that, after being trained on a subset of data with complete phase-space information, takes in a star’s 5D astrometry (angular coordinates, proper motions, and parallax) and outputs a predicted line-of-sight velocity with an associated uncertainty. Working with a mock Gaia catalog, we show that the network can successfully recover the distributions and correlations of each velocity component for stars that fall within ~5 kpc of the Sun. We also demonstrate that the network can accurately reconstruct the velocity distribution of a kinematic substructure in the stellar halo that is spatially uniform, even when it comprises a small fraction of the total star count. We apply the neural network to real Gaia data and discuss how the inferred information augments our understanding of the Milky Way’s formation history.
- Talk Slides (for IAIFI members only)
Iris Cong, Grad Student, Harvard
- Tuesday, September 27, 2022, 11:00am–12:00pm, MIT LNS Conference Room (26-528)
- Quantum Convolutional Neural Networks
- Convolutional neural networks (CNNs) have recently proven successful for many complex applications ranging from image recognition to precision medicine. In the first part of my talk, motivated by recent advances in realizing quantum information processors, I introduce and analyze a quantum circuit-based algorithm inspired by CNNs. Our quantum convolutional neural network (QCNN) uses only O(log(N)) variational parameters for input sizes of N qubits, allowing for its efficient training and implementation on realistic, near-term quantum devices. To explicitly illustrate its capabilities, I show that QCNN can accurately recognize quantum states associated with a one-dimensional symmetry-protected topological phase, with performance surpassing existing approaches. I further demonstrate that QCNN can be used to devise a quantum error correction (QEC) scheme optimized for a given, unknown error model that substantially outperforms known quantum codes of comparable complexity. The design of such error correction codes is particularly important for near-term experiments, whose error models may be different from those addressed by general-purpose QEC schemes. If time permits, I will also present our latest results on generalizing the QCNN framework to more accurately and efficiently identify two-dimensional topological phases of matter.
- Talk Slides (for IAIFI members only)
Miles Cranmer, Grad Student, Princeton
- Tuesday, September 20, 2022, 11:00am–12:00pm, MIT LNS Conference Room (26-528)
- Interpretable Machine Learning for Physics
- Would Kepler have discovered his laws if machine learning had been around in 1609? Or would he have been satisfied with the accuracy of some black box regression model, leaving Newton without the inspiration to find the law of gravitation? In this talk I will present a review of some industry-oriented machine learning algorithms, and discuss a major issue facing their use in the natural sciences: a lack of interpretability. I will then outline several approaches I have created with collaborators to help address these problems, based largely on a mix of structured deep learning and symbolic methods. This will include an introduction to the PySR software (https://astroautomata.com/PySR), a Python/Julia package for high-performance symbolic regression. I will conclude by demonstrating applications of such techniques and how we may gain new insights from such results.
- Talk Slides (for IAIFI members only)
Anindita Maiti, Grad Student, Northeastern
- Tuesday, September 13, 2022, 11:00am–12:00pm, MIT LNS Conference Room (26-528)
- A Study of Neural Network Field Theories
- I will present a systematic exploration of field theories arising in Neural Networks, using a dual framework given by Neural Network parameters. The infinite width limit of NN architectures, combined with i.i.d. parameters, leads to Gaussian Processes in Neural Networks by the Central Limit Theorem (CLT), corresponding to generalized free field theories. Small and large violations of the CLT respectively lead to weakly coupled and non-perturbative non-Lagrangian field theories in Neural Networks. Non-Gaussianity, locality (via cluster decomposition), and symmetries can be specified by corresponding field theory terms. We identify scaling laws with parameters and identify ‘critical regimes’ in parameter space that mimic transitions from trivial to nontrivial theories in physics.
- Talk Slides (for IAIFI members only)

Spring 2022

Manami Kanemura, Undergraduate Student, Northeastern University (completed co-op with Bryan Ostdiek)
- Thursday, May 26, 2022, 11:00am–12:00pm, MIT LNS Conference Room (26-528)
- Using Soft-Introspection to improve anomaly detection at LHC
- Resources: Soft-IntroVAE: Analyzing and Improving the Introspective Variational Autoencoder; Challenges for Unsupervised Anomaly Detection in Particle Physics
- Talk Slides (for IAIFI members only)
Mark Hamilton, Graduate Student, MIT
- Thursday, May 12, 2022, 11:00am–12:00pm, MIT LNS Conference Room (26-528)
- Unsupervised Semantic Segmentation by Distilling Feature Correspondences
- Resources: Website; Paper; Code
- Talk Slides (for IAIFI members only)
Dylan Hadfield, Assistant Professor, MIT
- Thursday, May 5, 2022, 11:00am–12:00pm, MIT LNS Conference Room (26-528)
- Overoptimization, Incompleteness, and Goodhart’s Law
- Resources: https://arxiv.org/abs/1611.08219; https://arxiv.org/abs/1705.09990; https://arxiv.org/abs/2102.03896
Benjamin Fuks, Professor, Sorbonne University
- Thursday, April 28, 2022, 11:00am–12:00pm, MIT LNS Conference Room (26-528)
- Precision simulations for new physics
- Resources: Precision simulations for new physics (JHEP 12 (2019) 008); How precision allows us to design new variables to look for signals (Phys. Rev. D 100, 074010 (2019)); Trying to do better with boosted decision trees on the basis of tree-level simulations (JHEP 04 (2022) 015)
Carolina Cuesta, PhD Student, Durham University & Incoming IAIFI Fellow
- Thursday, April 21, 2022, 11:00am–12:00pm, MIT LNS Conference Room (26-528)
- Equivariant normalizing flows and their application to cosmology
- Resources: https://arxiv.org/abs/2202.05282; https://arxiv.org/abs/2105.09016
- Talk Slides (for IAIFI members only)
Anatoly Dymarsky, Associate Professor, University of Kentucky
- Thursday, April 14, 2022, 11:00am–12:00pm, MIT LNS Conference Room (26-528)
- Tensor network to learn the wave function of data
- We use a tensor network-based architecture to train a network that simultaneously accomplishes two tasks: image classification and image sampling. We argue that simultaneous performance of these tasks means our network has successfully learned the whole ‘manifold of data’ (using the terminology from the literature), namely all possible images of a particular kind. We use a black and white version of MNIST, hence our network learns all possible images depicting a particular digit. We access global properties of the ‘manifold of data’ by calculating its size. Thus, we found there are 2^72 possible images of the digit 3. We explain that this number is robust and largely independent of the details of the training process.
- Resources: Tensor network to learn the wavefunction of data
- Talk Slides (for IAIFI members only)
Yin Lin, Postdoctoral Researcher, MIT
- Thursday, April 7, 2022, 11:00am–12:00pm, MIT LNS Conference Room (26-528)
- Accelerating Dirac equation solves in lattice QFT with neural-network preconditioners
- Resources: An Introduction to the Conjugate Gradient Method Without the Agonizing Pain; Iterative Methods for Sparse Linear Systems; Deep Learning of Preconditioners for Conjugate Gradient Solvers in Urban Water Related Problems; Learning to Optimize Non-Rigid Tracking
- Talk Slides (for IAIFI members only)
Denis Boyda, Postdoctoral Appointee, Argonne National Laboratory & Incoming IAIFI Fellow
- Thursday, March 17, 2022, 11:00am–12:00pm, MIT LNS Conference Room (26-528)
- Overview of some popular Machine Learning frameworks for data parallelism
- Resources: PyTorch Distributed: Experiences on Accelerating Data Parallel Training; Horovod: fast and easy distributed deep learning in TensorFlow; ZeRO: Memory Optimizations Toward Training Trillion Parameter Models
- Talk Slides (for IAIFI members only)
Jessie Micallef, PhD Student, Michigan State University & Incoming IAIFI Fellow
- Thursday, March 10, 2022, 11:00am–12:00pm, MIT LNS Conference Room (26-528)
- Adapting CNNs to Reconstruct Sparse, GeV-Scale IceCube Neutrino Events
- Resources: Reconstructing Neutrino Energy using CNNs for GeV Scale IceCube Events; Direction Reconstruction using a CNN for GeV-Scale Neutrinos in IceCube
- Talk Slides (for IAIFI members only)

Fall 2021

Murphy Niu, Research Scientist, Google Quantum AI
- Friday, December 3, 2021, 11:00am–12:00pm, MIT LNS Conference Room (26-528)
- Entangling Quantum Generative Adversarial Networks using Tensorflow Quantum
- Resources: Entangling Quantum GANs; Related research
Eric Michaud, PhD Student, MIT
- Thursday, November 18, 2021, 11:00am–12:00pm, MIT LNS Conference Room (26-528)
- Curious Properties of Neural Networks
- In this informal talk/discussion, I will highlight some facts about neural networks which I find to be particularly fun and surprising. Possible topics could include the Lottery Ticket Hypothesis, Double Descent, and ‘grokking’. There will be time for discussion and for attendees to bring up their own favorite surprising facts about deep learning.
- Resources: Lottery Ticket Hypothesis; Double Descent; Grokking
Ge Yang, Postdoctoral Fellow, IAIFI
- Thursday, October 21, 2021, 11:00am–12:00pm, MIT LNS Conference Room (26-528)
- Learning and Generalization: Revisiting Neural Representations
- Understanding how deep neural networks learn and generalize has been a central pursuit of intelligence research. This is because we want to build agents that can learn quickly from a small amount of data and that also generalize to a wider set of scenarios. In this talk, we take a systems approach by identifying key bottleneck components that limit learning and generalization. We will present two key results: overcoming the simplicity bias of neural value approximation via random Fourier features, and going beyond the training distribution via invariance through inference.
Ziming Liu, Grad Student, MIT
- Thursday, October 7, 2021, 11:00am–12:00pm, MIT LNS Conference Room (26-528)
- Dynamics in Modern Deep Learning Models
- Resources: Transient Chaos in BERT; Memory and attention in deep learning; The Brownian motion in the transformer model
- Talk Slides (for IAIFI members only)
Michael Douglas, Researcher, Harvard CMSA
- Thursday, September 23, 2021, 11:00am–12:00pm, MIT LNS Conference Room (26-528)
- Solving Combinatorial Problems using AI/ML
- Resources: Bright et al 1907.04408 (https://arxiv.org/abs/1907.04408); Heule et al 1905.10192; Halverson et al 1903.11616; McAleer et al 1805.07470; Gukov et al 2010.16263; General sources on reinforcement learning; The MathCheck SAT+CAS system
- Talk Slides (for IAIFI members only)

Fall 2020

Dan Roberts, Research Affiliate, MIT
- Wednesday, December 2, 2020, 12:00am–12:00am, MIT LNS Conference Room (26-528)
- Effective Theory of Deep Learning
- Resource: The Principles of Deep Learning Theory
Ziming Liu, Grad Student, MIT
- Wednesday, November 18, 2020, 12:00am–12:00am, MIT LNS Conference Room (26-528)
- Scaling Laws of Learning
- Resources: Scaling Laws of Learning 1; Scaling Laws of Learning 2; Scaling Laws of Learning 3
Andrew Tan, Grad Student, MIT
- Wednesday, November 4, 2020, 12:00am–12:00am, MIT LNS Conference Room (26-528)
- Estimating Mutual Information
- Resources: Estimating Mutual Information
Bhairav Mehta, Grad Student, MIT
- Tuesday, October 20, 2020, 12:00am–12:00am, MIT LNS Conference Room (26-528)
- Learning Invariances
- Resources: Learning Invariances

Spring 2021

Siddharth Mishra-Sharma, Postdoctoral Fellow, IAIFI
- Tuesday, May 11, 2021, 12:00am–12:00am, MIT LNS Conference Room (26-528)
- Simulation-Based Inference Focusing on Astrophysical Applications
- Resources: Simulation-Based Inference; Astrophysical Applications
Anna Golubeva, Postdoctoral Fellow, IAIFI
- Tuesday, April 27, 2021, 12:00am–12:00am, MIT LNS Conference Room (26-528)
- Are Wider Nets Better Given the Same Number of Parameters?
- Resources: Are Wider Nets Better Given the Same Number of Parameters?
Di Luo, Postdoctoral Fellow, IAIFI
- Tuesday, April 6, 2021, 12:00am–12:00am, MIT LNS Conference Room (26-528)
- Simulating Quantum Many-Body Physics with Neural Network Representation
- Resources: Simulating Quantum Many-Body Physics; Related research 1; Related research 2
Jacob Zavatone-Veth, Grad Student, Harvard
- Tuesday, March 2, 2021, 12:00am–12:00am, MIT LNS Conference Room (26-528)
- Non-Gaussian Processes and Neural Networks at Finite Widths
- Resources: Non-Gaussian Processes and Neural Networks at Finite Widths
Anindita Maiti, Grad Student, Northeastern
- Wednesday, February 17, 2021, 12:00am–12:00am, MIT LNS Conference Room (26-528)
- Neural Networks and Quantum Field Theory
- Resources: Neural Networks and Quantum Field Theory