Stay tuned for updates about our 2025 IAIFI Summer Workshop. Join our mailing list to receive updates.

The IAIFI Summer Workshop brings together researchers from across Physics and AI for plenary talks, poster sessions, and networking to promote research at the intersection of Physics and AI.

Many of the videos from the 2024 IAIFI Summer Workshop are now posted on the IAIFI YouTube channel.

Many of the speakers’ slides from the 2024 IAIFI Summer Workshop are now available online.

  • The 2024 Summer Workshop was held August 12–16, 2024
  • Location: Bartos Theater, MIT List Visual Arts Center, Lower Level (20 Ames Street, Cambridge)
  • Registration deadline: July 31, 2024


Videos of the plenary talks from the 2023 IAIFI Summer Workshop are now available on YouTube.


About

The Institute for Artificial Intelligence and Fundamental Interactions (IAIFI) is enabling physics discoveries and advancing foundational AI through the development of novel AI approaches that incorporate first principles, best practices, and domain knowledge from fundamental physics. The goal of the Workshop is to serve as a meeting place to facilitate advances and connections across this growing interdisciplinary field.

View recommendations for meals and activities around MIT

Agenda

Monday, August 12, 2024

9:15–9:30 am ET

Welcome

9:30–10:15 am ET

10,000 Einsteins: AI and the future of theoretical physics

Matt Schwartz, Harvard/IAIFI

Abstract AI has already proved revolutionary in many areas of physics, particularly those focused on data analysis. However, machines are also advancing rapidly in symbolic tasks. As much of what is done in theoretical physics is symbolic, there is tremendous potential for machines to transition from data analysis to formal theoretical work. This talk will discuss some initial progress in this direction and a vision for how machines and humans might collaborate in the future to solve some of the most challenging problems in fundamental physics.

10:15–11:00 am ET

Dynamic Models from Data

Nathan Kutz, University of Washington

Abstract Physics based models and governing equations dominate science and engineering practice. The advent of scientific computing has transformed every discipline as complex, high-dimensional and nonlinear systems could be easily simulated using numerical integration schemes whose accuracy and stability could be controlled. With the advent of machine learning, a new paradigm has emerged in computing whereby we can build models directly from data. In this work, integration strategies for leveraging the advantages of both traditional scientific computing and emerging machine learning techniques are discussed. Using domain knowledge and physics-informed principles, new paradigms are available to aid in engineering understanding, design and control.

11:00–11:30 am ET

Break

11:30 am–12:15 pm ET

Accurate, efficient, and reliable learning of deep neural operators for multiphysics and multiscale problems

Lu Lu, Yale University

Abstract It is widely known that neural networks (NNs) are universal approximators of functions. However, a less known but powerful result is that a NN can accurately approximate any nonlinear operator. This universal approximation theorem of operators is suggestive of the potential of deep neural networks (DNNs) in learning operators of complex systems. In this talk, I will present the deep operator network (DeepONet) to learn various operators that represent deterministic and stochastic differential equations. I will also present several extensions of DeepONet, such as DeepM&Mnet for multiphysics problems, DeepONet with proper orthogonal decomposition or Fourier decoder layers, MIONet for multiple-input operators, and multifidelity DeepONet. I will demonstrate the effectiveness of DeepONet and its extensions to diverse multiphysics and multiscale problems, such as bubble growth dynamics, high-speed boundary layers, electroconvection, hypersonics, geological carbon sequestration, and full waveform inversion. Deep learning models are usually limited to interpolation scenarios, and I will quantify the extrapolation complexity and develop a complete workflow to address the challenge of extrapolation for deep neural operators.

12:15–1:30 pm ET

Lunch

1:30–3:00 pm ET

Contributed Talks Session A - Representation/Manifold Learning

Bartos Theater

Symmetries and neural tangent kernels: using physical principles to understand deep learning, Jan Gerken (Chalmers University of Technology) Despite its extraordinary success in applications, a thorough theoretical understanding of deep learning is still lacking, making progress depend largely on costly trial-and-error procedures. At the same time, theoretical physics has a long history of developing deep mathematical understanding of complex systems. In this talk, I will present some recent work on how techniques from theoretical physics can be used to deepen our understanding of deep learning and lead to practically relevant insights. In particular, symmetries, which are an established cornerstone of theoretical physics, have reached widespread popularity as a guiding principle in deep learning as well. In machine learning, symmetries feature most importantly in the form of data augmentation and equivariant neural networks. At the same time, neural tangent kernels, which are closely related to statistical field theory, have emerged as a powerful tool to understand neural networks both at initialization and during training. Combining these paradigms leads to practically relevant statements in deep learning. Furthermore, it opens the door towards further deepening the connection between theoretical physics and our understanding of neural networks.
Approximately-symmetric neural networks for quantum spin liquids, Dominik Kufel (Harvard University) We propose and analyze a family of approximately-symmetric neural networks for quantum spin liquid problems. These tailored architectures are parameter-efficient, scalable, and significantly out-perform existing symmetry-unaware neural network architectures. Utilizing the mixed-field toric code model, we demonstrate that our approach is competitive with the state-of-the-art tensor network and quantum Monte Carlo methods. Moreover, at the largest system sizes (N=480), our method allows us to explore Hamiltonians with sign problems beyond the reach of both quantum Monte Carlo and finite-size matrix-product states. The network comprises an exactly symmetric block following a non-symmetric block, which we argue learns a transformation of the ground state analogous to quasiadiabatic continuation. Our work paves the way toward investigating quantum spin liquid problems within interpretable neural network architectures.
Title to come, Simonetta Liuti (The University of Virginia) Abstract to come
A Neural Net Model for Distillation with Weights Explained, Berfin Simsek (NYU/Flatiron Institute) It is important to understand how large models represent knowledge to make them efficient and safe. We study a toy model of neural nets that exhibits non-linear dynamics and a phase transition. Although the model is complex, it allows finding a family of so-called "copy-average" critical points of the loss. The gradient flow initialized with random weights consistently converges to one such critical point for networks up to a certain width, which we proved to be optimal among all copy-average points. Moreover, we can explain every neuron of a trained neural network of any width. As the width grows, the network changes the compression strategy and exhibits a phase transition. We close by listing open questions calling for further mathematical analysis and extensions of the model considered here.

Physics-Motivated Optimization

Beyond Closure Models: Estimating Long-term Statistics of Chaotic-Systems via Physics-Informed Neural Operators, Chuwei Wang (Caltech) Accurately predicting the long-term behavior of chaotic systems is important in many applications. This requires iterative computations on a dense spatiotemporal grid to account for the unstable nature of chaotic systems, which is expensive and impractical in many real-world scenarios. The alternative approach to such a fully resolved simulation is using a coarse grid and then correcting its errors through a 'closure model', which approximates the overall information from fine scales not captured in the coarse-grid simulation. Recently, ML approaches have been used for closure modeling, but they typically require a large number of training samples from expensive fully-resolved simulations (FRS). In this work, through the lens of Liouville flow in function spaces, we prove an even more fundamental limitation, viz., the standard approach to learning closure models suffers from a large approximation error for generic problems, no matter how large the model is, and it stems from the non-uniqueness of the mapping. We propose an alternative end-to-end learning approach using a physics-informed neural operator (PINO) that overcomes this limitation by not using a closure model or a coarse-grid solver. We first train the PINO model on data from a coarse-grid solver and then fine-tune it with (a small amount of) FRS and physics-based losses on a fine grid. The discretization-free nature of neural operators means that they do not suffer from the restriction of a coarse grid that closure models face, and they can provably approximate the long-term statistics of chaotic systems. In our experiments on fluid dynamics, our PINO model achieves a 120x speedup compared to FRS with a relative error of ~5%. In contrast, the closure model coupled with a coarse-grid solver is 58x slower than PINO while having a much higher error of 205% when the closure model is trained on the same FRS dataset.
Determining Heterogeneous Elastic Properties of Soft Materials using Physics-Informed Neural Networks, Wensi Wu (Children's Hospital of Philadelphia) The heterogeneous mechanical properties found in biological materials have profound implications for both engineering and medical applications. Within the engineering community, these properties are frequently studied to guide the design of mechanical devices such as artificial organs and soft robots. Concurrently, in the medical field, the mechanical properties of tissues play a crucial role in providing diagnostic information about various diseases and conditions. The significance of material mechanical properties across these diverse domains has driven a need to better understand the underlying mechanisms governing the microscopic properties of biological tissues and their associated functions, whether for improving material designs or disease diagnosis. In traditional engineering, identifying unknown material parameters requires iterative inverse finite element analyses and optimization of the constitutive parameters until the finite element model achieves an acceptable level of mechanical response, aligning with experimental data. While this method is efficient with homogeneous materials, optimizing the elasticity map of heterogeneous materials is challenging. In this work, we propose using physics-informed neural networks (PINNs) to identify the full-field elastic properties of highly nonlinear, hyperelastic materials. We applied our improved PINNs to six structurally complex materials and three constitutive material models (Neo-Hookean, Mooney-Rivlin, and Gent) to evaluate the accuracy of full-field elasticity maps estimated by PINNs. Our PINN model consistently produced highly accurate estimates of the full-field elastic properties, even when there was up to 10% noise present in the training data.

Contributed Talks Session B - Generative Models

MIT Media Lab, Room 633

Machine learning phase transitions: A probabilistic perspective, Julian Arnold (University of Basel) The identification of phase transitions and the classification of different phases of matter from data are among the most popular applications of machine learning in physics. Neural network (NN)-based approaches have proven to be particularly powerful due to the ability of NNs to learn arbitrary functions. Many such approaches work by computing indicators of phase transitions from the output of NNs trained to solve specific classification problems. In this talk, I will derive the optimal solutions to these classification problems given by Bayes classifiers that take into account the probability distributions underlying the physical system under consideration [1]. This probabilistic viewpoint allows us to gain a deeper understanding of previous NN-based studies, highlighting the strengths and weaknesses of individual methods [1], enables us to root the methods in information theory [2], yields more efficient numerical routines based on the incorporation of readily available generative models [3], and widens the application domain of these methods to systems outside physics (such as diffusion models or transformers) [4,5]. [1] J. Arnold and F. Schäfer, PRX 12, 031044 (2022) [2] J. Arnold et al., arXiv:2311.10710 (2023) [3] J. Arnold et al., PRL 132, 207301 (2024) [4] J. Arnold et al., arXiv:2311.09128 (2023) [5] J. Arnold et al., arXiv:2405.17088 (2024)
Accelerating Molecular Discovery with Machine Learning, Yuanqi Du (Cornell University) Recent advancements in machine learning have paved the way for groundbreaking opportunities in the realm of molecular discovery. At the forefront of this evolution are improved computational tools with proper inductive biases and efficient optimization. In this talk, I will delve into our efforts around these themes from a geometry, sampling and optimization perspective. I will first introduce how to encode symmetries in the design of neural networks and the balance of expressiveness and computational efficiency. Next, I will discuss how generative models enable a wide range of design and optimization tasks in molecular discovery. In the third part, I will talk about how the advancements in stochastic optimal control, sampling and optimal transport can be applied to find transition states in chemical reactions.
Understanding Diffusion Models by Feynman's Path Integral, Yuji Hirono (Osaka University) Score-based diffusion models have proven effective in image generation and have gained widespread usage. We introduce a novel formulation of diffusion models using Feynman's path integral [1]. We find that this formulation provides comprehensive descriptions of score-based generative models, and demonstrate the derivation of backward stochastic differential equations and loss functions. The formulation accommodates an interpolating parameter connecting stochastic and deterministic sampling schemes, and we identify this parameter as a counterpart of Planck's constant in quantum physics. This analogy enables us to apply the Wentzel-Kramers-Brillouin (WKB) expansion, a well-established technique in quantum physics, for evaluating the negative log-likelihood to assess the performance disparity between stochastic and deterministic sampling schemes. Reference: [1] Yuji Hirono, Akinori Tanaka, Kenji Fukushima, accepted in ICML2024 [arXiv:2403.11262].
Neural Entropy, Akhil Premkumar (University of Chicago) What is the smallest neural network that can do a particular task? To answer this question we need to understand the capacity of neural networks to encode and store information. In the context of generative diffusion models, we show that it is possible to identify the entropy of the network, which characterizes precisely its storage capacity.
Predicting Missing Regions in Charged Particle Tracks Using a Sparse 3D Convolutional Neural Network, Hilary Utaegbulam (University of Rochester) The 2x2 Demonstrator is a prototype of ND-LAr, the liquid argon time-projection chamber of the Deep Underground Neutrino Experiment’s Near Detector complex. Both the 2x2 Demonstrator and ND-LAr are modular detectors that will have pixelated charge readouts and inactive regions wherein there is no sensitivity to charge deposition and light signals that arise from charged particle interactions with liquid argon. In the 2x2, these inactive regions are located in between the active detector modules, which introduces the challenge of inferring what charge signals ought to look like in these regions. This study explores the use of a Sparse 3D Convolutional Neural Network (ConvNet) to infer missing regions in charged particle tracks. Hits corresponding to energy depositions are voxelized into a three-dimensional grid for each track. Voxels that fall into predefined inactive regions are removed to simulate the lack of detector output. The model is trained to infer the topology of the missing track voxels, with the ultimate goal of inferring the missing charge or energy values in these voxels as well. Results indicate that this approach shows promise in prediction of missing track regions with some accuracy.

3:00–3:30 pm ET

Break

3:30–4:15 pm ET

What Do Language Models Have To Say About Fundamental Physics?

Mariel Pettee, LBNL/Flatiron

Abstract The launch of ChatGPT in November 2022 ignited an ongoing worldwide conversation about the possible impacts of Large Language Models (LLMs) on the way we work. As scientists, however, the changes in our workflows since the advent of this technology have been relatively minor. Will this still be the case in 10 years? Could an analogous paradigm shift arise from a foundation model trained on a large amount of scientific data, transforming the way we conduct our research? If so, what can we learn from the development of other foundation models, particularly LLMs, in their evolution from specialists to (quasi-)generalists? In this talk, I will present some recent work exploring how language models could help form a foundation model of fundamental physics. I'll also share my perspective on how we should strive to shape such models to reflect our highest priorities as scientists.

4:15–5:00 pm ET

Solving the nuclear many-body problem with neural quantum state

Alessandro Lovato, Argonne National Laboratory

Abstract Artificial neural networks can be employed to accurately and compactly represent quantum many-body states relevant to many applications, including nuclear physics, quantum chemistry, and condensed matter problems. I will argue that a variational Monte Carlo algorithm based on neural-network quantum states provides a systematically improvable solution to the nuclear Schrödinger equation with a polynomial cost in the number of nucleons. After presenting recent progress in describing atomic nuclei, neutron-star matter, and hypernuclei, I will illustrate an application to condensed-matter systems, specifically ultra-cold Fermi gases near the unitary limit. Detailed benchmarks with continuum Quantum Monte Carlo methods will be presented.

5:00–7:00 pm ET

Poster Session

MIT Media Lab, 6th Floor

  • Data Compression and Inference in Cosmology with Self-Supervised Machine Learning, Aizhan Akhmetzhanova (Harvard University)
  • CNN and Transformer architecture for jets events classification, Juvenal Bassa (University of Puerto Rico - Mayaguez)
  • Data-Driven Discovery of X-ray Transients with Machine Learning, Steven Dillmann (University of Cambridge)
  • Sampling Transition Dynamics with Machine Learning Approaches, Yuanqi Du (Cornell University)
  • Multi-Modal Generalized Class Discovery for Scalable Autonomous All-Sky Surveys, Sriram Elango (Harvard University)
  • Inverse Design of Complex Fluids with Fully-Differentiable Lagrangian Particle Dynamics, Kaylie Hausknecht (Harvard University and MIT)
  • Perfect Jet Classification Through Equivariant Regression, Timothy Hoffman (University of Chicago)
  • Flow-Based Generative Emulation of Grids of Stellar Evolutionary Models, Marc Hon (MIT Kavli Institute for Astrophysics and Space Research)
  • Enhancing Cosmological Simulations with Efficient and Interpretable Machine Learning in the Wavelet Basis, Cooper Jacobus (UC Berkeley: Dept. Astrophysics, Lawrence Berkeley National Lab: Computational Cosmology Center)
  • Training neural operators to preserve invariant measures of chaotic attractors, Ruoxi Jiang (University of Chicago)
  • Hidden Giants: Redefining QSO Classification and Outlier Detection with Redshift Invariant Autoencoders, Thaddaeus Kiker (Columbia University)
  • KAN: Kolmogorov-Arnold Networks, Ziming Liu (MIT, IAIFI)
  • Phase Transitions in the Output Distribution of Large Language Models, Niels Loerch (University of Basel)
  • Tackling reasoning problems with AI, Rishabh Mallik (Forschungszentrum Jülich)
  • Recurrent Features of Amplitudes in Planar N = 4 Super Yang-Mills Theory, Garrett Merz (University of Wisconsin-Madison)
  • Ultrafast Jet Classification using Geometric Learning, Patrick Odagiu (ETH Zurich)
  • Deep Stochastic Mechanics, Elena Orlova (The University of Chicago)
  • Differentiable and Distributional Cosmological Stasis, Sneh Pandya (Northeastern / IAIFI)
  • Exploring Astronomical Catalog Crossmatching with Machine Learning, Victor Samuel Perez Diaz (Center for Astrophysics | Harvard & Smithsonian, IAIFI)
  • Towards an AI-enabled astronomy system: natural language processing of Chandra data archive, Shivam Raval (Harvard University)
  • Auto-decoding Poisson Processes for Unsupervised X-ray Sources Learning, Yanke Song (Harvard University, Department of Statistics)
  • Development of photothermal techniques for the detection of cancer biomarkers, Ilhem Soyah (Higher school of sciences and technology of Hammam Sousse)
  • Multi-Modal Contrastive Training for Robust VQA, Mitra Tajrobehkar (Vertical Oceans)
  • Zero-Shot Classification of Astronomical Images with Large Multimodal Models, Dimitrios Tanoglidis (University of Pennsylvania)
  • Vertex finding and jet class classification using Wasserstein Neural Network, Diego F. Vasquez Plaza (University of Puerto Rico - Mayagüez)
  • Learning Group Invariant CY Metrics by Fundamental Domain Projections, Moritz Walden (Uppsala University)
  • Accelerating Energy Computation in Many-electron Systems with Forward Laplacian, Chuwei Wang (Caltech)
  • Emulating the Effects of Pile-Up on X-ray Spectra, Justina Yang (Harvard University)
  • A Variational Continuation Method for Periodic Orbits Using Autograd and Hessian Eigendecompositions, Leo Yao (MIT)
  • HyperTagging: Reconstruction of Full Decays using Transformers and Hyperbolic Embedding, Boyang Yu (LMU Munich, Germany)
  • Neural scaling laws from large-N field theory, Zhengkang Zhang (University of Utah)
  • Revealing the 3D Cosmic Web with Physics Constrained Neural Fields, Brandon Zhao (Caltech)

6:00–8:00 pm ET

Welcome Reception

MIT Media Lab, 6th Floor

Tuesday, August 13, 2024

9:30–10:15 am ET

Trends in AI for particle accelerators

Verena Kain, CERN

Abstract AI is without doubt radically transforming science with many successful applications in molecular biology, astrophysics, nuclear physics and particle physics. It has enabled significant technological advances for robotics that can particularly enhance a system’s perception, navigational and manipulation abilities and interaction. For control, it enables novel and faster learning/teaching of tasks, replacing or augmenting classical control techniques for hard problems such as real-time control of the non-linear dynamics of the plasma in a tokamak of a fusion reactor, or navigating drones with super-human performance. Given the success and types of use cases that can be solved with AI algorithms, accelerator physics and associated technologies have also picked up on AI in the last 5 to 10 years with the number of ML applications steadily rising - and subsequently the number of ML related papers at the big particle accelerator conferences. This contribution will give a brief overview of the typical use cases for AI for particle accelerators, show recent trends and describe the potential and vision of AI for particle accelerators with the emphasis on control and optimisation of particle accelerators.

10:15–11:00 am ET

An introduction to neural ODEs in scientific machine learning

Patrick Kidger, Cradle.bio

Abstract This is an introduction to neural ODEs for scientific applications. The goal is to (a) provide a modelling tool that enhances the expressivity of existing theory-driven approaches, (b) demonstrate that neural ODEs are easy to use via modern autodifferentiable software, and (c) give enough of the tips-and-tricks needed to make neural ODEs work in practice!

11:00–11:30 am ET

Break

11:30 am–12:15 pm ET

Automatic Symmetry Discovery from Data

Rose Yu, UCSD

Abstract Despite the success of equivariant neural networks in scientific applications, they require knowing the symmetry group a priori. However, it may be difficult to know which symmetry to use as an inductive bias in practice. Enforcing the wrong symmetry could even hurt performance. In this talk, I will discuss our effort in developing a deep learning framework that can automatically discover symmetry from data. Our framework, LieGAN, represents symmetry as an interpretable Lie algebra basis and uses a paradigm akin to generative adversarial training. We further generalized it to LaLieGAN to discover non-linear symmetries from high-dimensional data. Empirically, the learned symmetry can also be readily used in existing equivariant neural networks to improve accuracy and generalization in prediction. It can also improve equation discovery and long-term forecasting for various dynamical systems.

12:15–1:30 pm ET

Lunch

1:30–3:00 pm ET

Contributed Talks Session A - Foundational ML

Bartos Theater

Diversity with Similarity as a Measure of Dataset Quality, Josiah Couch (Beth Israel Deaconess Medical Center) Dataset size and class balance are important measures in deep learning. Maximizing them is seen as a way to ensure that datasets contain diverse images, which models are thought to need in order to generalize well. Yet neither size nor class balance measures image diversity directly, raising the possibility that better measures of dataset quality might exist. To test this hypothesis, we turned to a comprehensive framework of diversity measures that generalizes familiar quantities like Shannon entropy by accounting for the similarities and differences among images. (Size and class balance emerge from this framework as special cases.) We created several thousand diverse datasets by subsampling a variety of large medical-image datasets representing a range of imaging modalities, trained classifiers on these subsets, and calculated the correlation between subset diversity and model accuracy using diversity measures from the framework.
RG flow of the NTK dynamics at finite-width from Feynman diagrams, Max Guillen (Chalmers University of Technology) Deep Learning is nowadays a well-established method for different applications in science and technology. However, it has been unclear for a long time how the "learning process" actually occurs in different architectures, and how this knowledge could be used to optimize performance and efficiency. Recently, high-energy-physics-based ideas have been applied to the modelling of Deep Learning, thus translating the learning problem to an RG flow analysis in Quantum Field Theory (QFT). In this talk, I will explain how the quite complicated formulae describing such RG flows for different observables in neural networks at initialization can be easily obtained from a few rules resembling Feynman rules in QFT. I will also comment on some work in progress which implements such rules for computing higher-order corrections to the frozen (infinite-width) NTK for particular activation functions, and how they evolve after a few steps of SGD.
Supervised learning of infinitely-overparameterized DNNs through the lens of Wilsonian RG, Anindita Maiti (Perimeter Institute) The key to the performance of ML algorithms is an ability to segregate relevant features in input datasets from the irrelevant ones. In a setup where data features play the role of an energy scale, we develop a Wilsonian RG framework to integrate out unlearnable modes associated with the Neural Network Gaussian Process (NNGP) kernel, in the regression context. Such a framework in the case of Gaussian features leads to a universal flow of the ridge parameter, whereas, non-Gaussianities in data features result in rich input-dependent RG flows. This framework goes beyond the usual analogies between RG flows and learning dynamics, and offers potential improvements to our understanding of feature learning and universality classes of models.
Input Space Mode Connectivity in Deep Neural Networks, Jakub Vrabel (CEITEC, Brno University of Technology) We extend the concept of loss landscape mode connectivity to the input space of deep neural networks. Mode connectivity was originally studied within parameter space, where it describes the existence of low-loss paths between different solutions (loss minimizers) obtained through gradient descent. We present theoretical and empirical evidence of its presence in the input space of deep networks, thereby highlighting the broader nature of the phenomenon. We observe that different input images with similar predictions are generally connected, and for trained models, the path tends to be simple, with only a small deviation from being a linear path. Our methodology utilizes real, interpolated, and synthetic inputs created using the input optimization technique for feature visualization. To prove the existence of general mode connectivity in high-dimensional input spaces, we employ percolation theory. We argue that the approximate linear mode connectivity post-training is a manifestation of some implicit bias. We exploit mode connectivity to obtain new insights about adversarial examples and demonstrate its potential for adversarial detection. Additionally, we discuss applications for the interpretability of deep networks.
Neural scaling laws from large-N field theory, Zhengkang Zhang (University of Utah) Many machine learning models based on neural networks exhibit scaling laws: their performance scales as power laws with respect to the sizes of the model and training data set. We use large-N field theory methods to solve a model recently proposed by Maloney, Roberts and Sully which provides a simplified setting to study neural scaling laws. Our solution extends the result in this latter paper to general nonzero values of the ridge parameter, which are essential to regularize the behavior of the model. In addition to obtaining new and more precise scaling laws, we also uncover a duality transformation at the diagrams level which explains the symmetry between model and training data set sizes. The same duality underlies recent efforts to design neural networks to simulate quantum field theories.
Fourier-enhanced deep operator network for geophysics with improved accuracy, efficiency, and generalizability, Min Zhu (Yale University) Full waveform inversion (FWI) and geologic carbon sequestration (GCS) are two significant topics in geophysics. FWI infers subsurface structure information from seismic waveform data by solving a non-convex optimization problem. On the other hand, solving multiphase flow in porous media is essential for CO2 migration and pressure fields in the subsurface associated with GCS. However, numerical simulations for both FWI and GCS are computationally challenging and expensive due to the highly nonlinear governing partial differential equations (PDEs). Here, we develop a Fourier-enhanced deep operator network (Fourier-DeepONet) to address this issue. For FWI, compared with existing data-driven FWI methods, Fourier-DeepONet achieves more accurate predictions of subsurface structures across a wide range of source parameters. Additionally, Fourier-DeepONet demonstrates superior robustness when handling data with Gaussian noise or missing traces. For GCS, compared to the state-of-the-art Fourier neural operator (FNO), Fourier-DeepONet offers superior computational efficiency, with 90% fewer unknown parameters, significantly reduced training time (approximately 3.5 times faster), and much lower GPU memory requirements (less than 35%). Furthermore, Fourier-DeepONet maintains good accuracy when predicting out-of-distribution (OOD) data. This excellent generalizability is enabled by its adherence to the physical principle that the solution to a PDE is continuous over time.

Contributed Talks: Session B - Physics-Motivated Optimization

MIT Media Lab, Room 633

Search for new physics using Event-based anomaly detection at the ATLAS detector of CERN and development of ADFilter tool, Wasikul Islam (University of Wisconsin-Madison) Searches for new resonances in two-body invariant mass distributions are performed using an unsupervised anomaly detection technique in events produced in proton-proton collisions at a center of mass energy of 13 TeV recorded by the ATLAS detector at the LHC. Studies are conducted in data containing at least one isolated lepton. An autoencoder network is trained with 1% randomly selected collision events and anomalous regions are then defined which contain events with high reconstruction losses from the decoder. Nine invariant mass distributions are inspected which contain pairs of one light jet (or one b-jet) and one lepton, photon, or a second light jet (b-jet). The 95% confidence level upper limits on contributions from generic Gaussian signals are reported for the studied invariant mass distributions. The obtained model-independent limits show strong potential to exclude generic heavy states with complex decays.
Marginalize, Don't Subtract: Spectral Component Separation for Faint Objects in DESI, Ana Sofia Uzsoy (Harvard University) Component separation is a critical step in disentangling multiple signals and in extracting useful information from spectra. In this talk, I present MADGICS (Marginalized Analytic Dataspace Gaussian Inference for Component Separation), a data-driven Bayesian component separation technique that can separate a spectrum into any number of Gaussian-distributed components. I then discuss the application of this technique for automatically determining redshifts for Lyman Alpha Emitter (LAE) galaxies observed with DESI while marginalizing over sky residuals to separate sky from target emission lines. We create a covariance matrix from visually inspected DESI LAE targets to provide physically motivated priors, and determine redshift by jointly inferring sky, LAE, and residual components for each individual spectrum. This component separation technique will allow us to create a high-quality catalog of LAE spectra and redshifts from DESI data and is also broadly generalizable to other spectral features of interest.
A Variational Continuation Method for Periodic Orbits Using Autograd and Hessian Eigendecompositions, Leo Yao (MIT) We present a Hessian-based approach to numerically continue periodic orbits. Our method offers precise initializations of oscillations around unstable fixed points, an integrator-free variational continuation method, and efficient detection of orbit family intersections and subharmonic bifurcations. Leveraging autograd for computations, we present full continuations of periodic double pendulum oscillations from fixed points and examples of detected bifurcations along these orbit families.
Revealing the 3D Cosmic Web with Physics Constrained Neural Fields, Brandon Zhao (Caltech) Weak gravitational lensing is the slight distortion of galaxy shapes by the gravitational effect of the large-scale structure. In our work, we seek to invert the weak lensing signal found in 2D telescope images to obtain a 3D reconstruction of the universe’s dark matter field. While typically this inversion is done in 2D to obtain a projection of the dark matter field, accurate 3D maps of the dark matter distribution are particularly useful as they allow us to detect and localize structures of interest such as galaxy clusters, as well as disambiguate them from intervening matter along the line of sight. This inversion is ill-posed for several reasons. First, images are only observed from a single viewing angle, which must be inverted into a 3D mass distribution. Second, the exact locations and shapes of unlensed galaxies are in general unknown, and can only be estimated with a degree of uncertainty. This introduces a large amount of noise to our measurement of the lensing signal. We propose a novel methodology using a physics-constrained, coordinate-based neural field to model the underlying continuous matter distribution. We take an analysis-by-synthesis approach, optimizing the weights of the neural network through a fully differentiable physical forward model to reproduce the lensing signal present in image measurements. We showcase reconstruction results on simulated measurements of dark matter distributions from a low resolution N-Body particle simulation, and compare our approach with earlier 3D inversion methods.

3:00–3:30 pm ET

Break

3:30–4:15 pm ET

KAN: Kolmogorov-Arnold Networks

Ziming Liu, MIT/IAIFI

Abstract Inspired by the Kolmogorov-Arnold representation theorem, we propose Kolmogorov-Arnold Networks (KANs) as promising alternatives to Multi-Layer Perceptrons (MLPs). While MLPs have fixed activation functions on nodes ("neurons"), KANs have learnable activation functions on edges ("weights"). KANs have no linear weights at all -- every weight parameter is replaced by a univariate function parametrized as a spline. We show that this seemingly simple change makes KANs outperform MLPs in terms of accuracy and interpretability. For accuracy, much smaller KANs can achieve comparable or better accuracy than much larger MLPs in data fitting and PDE solving. Theoretically and empirically, KANs possess faster neural scaling laws than MLPs. For interpretability, KANs can be intuitively visualized and can easily interact with human users. Through two examples in mathematics and physics, KANs are shown to be useful collaborators helping scientists (re)discover mathematical and physical laws. In summary, KANs are promising alternatives for MLPs, opening opportunities for further improving today's deep learning models which rely heavily on MLPs.
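The contrast drawn in the abstract (learnable univariate functions on edges, summed at nodes, with no linear weights) can be illustrated with a minimal sketch. It is not the speaker's implementation: the class names `EdgeFunction` and `KANLayer` are hypothetical, the splines are simplified to degree-1 (piecewise-linear) interpolants on a fixed grid, and no training loop is included.

```python
import random


class EdgeFunction:
    """Learnable univariate function on one edge (hypothetical name).

    Simplification: a piecewise-linear interpolant (degree-1 spline) on a
    fixed, sorted grid. The learnable parameters are the values at the
    grid points.
    """

    def __init__(self, grid, values):
        self.grid = list(grid)
        self.values = list(values)

    def __call__(self, x):
        g, v = self.grid, self.values
        # Hold the boundary value constant outside the grid.
        if x <= g[0]:
            return v[0]
        if x >= g[-1]:
            return v[-1]
        # Linear interpolation inside the interval that contains x.
        for k in range(len(g) - 1):
            if g[k] <= x <= g[k + 1]:
                t = (x - g[k]) / (g[k + 1] - g[k])
                return (1.0 - t) * v[k] + t * v[k + 1]


class KANLayer:
    """One layer: output_j = sum_i phi_{j,i}(x_i), with no linear weights."""

    def __init__(self, n_in, n_out, grid, rng):
        self.edges = [
            [EdgeFunction(grid, [rng.uniform(-1.0, 1.0) for _ in grid])
             for _ in range(n_in)]
            for _ in range(n_out)
        ]

    def __call__(self, x):
        # Each output node simply sums the edge functions applied to inputs.
        return [sum(phi(xi) for phi, xi in zip(row, x)) for row in self.edges]


layer = KANLayer(n_in=2, n_out=3, grid=[0.0, 1.0, 2.0], rng=random.Random(0))
outputs = layer([0.5, 1.5])
```

In an MLP the same layer would compute a weighted sum followed by a fixed nonlinearity; here the nonlinearity itself is the parameter, which is what makes each edge directly plottable.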

4:15–5:00 pm ET

A Pathway to Robotic Intelligence

Pulkit Agrawal, MIT/IAIFI

Abstract Details to come

Wednesday, August 14, 2024

9:30–10:15 am ET

Navigating Complex Models: Neural Networks for High-Dimensional Statistical Inference

Christoph Weniger, University of Amsterdam

Abstract Details to come

10:15–11:00 am ET

Data-Driven High-Dimensional Inverse Problems: A Journey Through Strong Lensing Data Analysis

Laurence Levasseur, University of Montreal

Abstract Details to come

11:00–11:30 am ET

Break

11:30 am–12:15 pm ET

Machine Learning and Physics: The Alliance of the Titans

Ayan Paul, Northeastern

Abstract Leaps in our understanding of Physics have been concomitant with the adoption of new and increasingly powerful mathematical structures that shift our perspective of how we probe the dynamics of the universe and allow us to unravel complex concepts that were hitherto inaccessible to us. In the realm of data-driven science, where physics is firmly planted, machine learning is proving to be a long-awaited and much-needed mathematical structure that has showcased its worth in aiding landmark discoveries, understanding the underlying symmetries of theories that we propose, and connecting signals to kinematics interpretably, to mention a few. In this parable on the charm of machine learning in physics, we will discuss the nuances of some of these achievements and lay out what we can expect from the future.

12:15–1:30 pm ET

Lunch

1:30–2:15 pm ET

Geometric Machine Learning

Melanie Weber, Harvard University

Abstract A recent surge of interest in exploiting geometric structure in data and models in machine learning has motivated the design of a range of geometric algorithms and architectures. This lecture will give an overview of this emerging research area and its mathematical foundation. We will cover topics at the intersection of Geometry and Machine Learning, including relevant tools from differential geometry and group theory, geometric representation learning, graph machine learning, and geometric deep learning.

2:15–3:00 pm ET

Machine Learning for LHC Theory

Tilman Plehn, Heidelberg

Abstract Details to come

3:00–3:30 pm ET

Break

3:30–4:15 pm ET

Asteroseismic probes of far-ranging astrophysics with big data and machine learning

Earl Bellinger, Yale University

Abstract Space telescopes like the NASA Kepler and TESS missions as well as the forthcoming PLATO mission are driving a data revolution in stellar astrophysics. The ultra-precise observations provided by these missions are challenging our best models of how stars evolve, and are in turn granting insights into the formation and evolution of planetary systems and the Galaxy as a whole. They furthermore present novel opportunities to probe far-ranging physics, such as dark matter and theories of gravity beyond general relativity. In this talk, I will give an overview of the data, models, challenges, and opportunities in asteroseismology, and highlight the role that machine learning is playing in advancing our knowledge across astrophysics.

4:15–5:00 pm ET

Big data cosmology meets AI

Carol Cuesta-Lazaro, IAIFI Fellow

Abstract The upcoming era of cosmological surveys promises an unprecedented wealth of observational data that will transform our understanding of the universe. Surveys such as DESI, Euclid, and the Vera C. Rubin Observatory will provide extremely detailed maps of billions of galaxies out to high redshifts. Analyzing these massive datasets poses exciting challenges that machine learning is uniquely poised to help overcome. In this talk, I will highlight recent examples from my work on probabilistic machine learning for cosmology. First, I will explain how a point cloud diffusion model can be used both as a generative model for 3D maps of galaxy clustering and as a likelihood model for such datasets. Moreover, I will present a generative model developed to reconstruct the initial conditions of the Universe from spectroscopic survey observations. When combined with the wealth of data from upcoming surveys, these machine learning techniques have the potential to provide new insights into fundamental questions about the nature of the universe.

5:30–6:30 pm ET

Panel on Industry–Academia Collaboration

  • Moderator: Carol Cuesta-Lazaro, IAIFI Fellow

  • Bill Freeman, Professor of EECS, MIT

  • Marin Soljacic, Professor of Physics, MIT

  • Partha Saha, Distinguished Engineer, Data and AI Platform, Visa

  • Nima Dehmamy, Research Assistant Professor, IBM Research MIT-IBM Lab

Thursday, August 15, 2024

9:30–10:15 am ET

Uncertainty Quantification from Neural Network Correlation Functions

Yonatan Kahn, University of Illinois Urbana-Champaign

Abstract Details to come

10:15–11:00 am ET

Transformers to transform Scattering Amplitudes Calculation

Tianji Cai, SLAC

Abstract AI for fundamental physics is now a burgeoning field, with numerous efforts pushing the boundaries of experimental and theoretical physics. In this talk, I will introduce a recent innovative application of Natural Language Processing to state-of-the-art calculations for scattering amplitudes. Specifically, we use Transformers to predict the symbols at high loop orders of the three-gluon form factors in planar N=4 Super Yang-Mills theory. Our results demonstrate the great promise of Transformers for amplitude calculations, opening the door for an exciting new scientific paradigm where discoveries and human insights are inspired and aided by AI.

11:00–11:30 am ET

Break

11:30 am–12:15 pm ET

Neural ansatze for physics and physics of neural networks

Nima Dehmamy, IBM Research MIT-IBM Lab

Abstract I will discuss some of our recent works on using ML to solve physics problems and using physics to understand ML. For the former, I will talk about using a "neural ansatz" for physics simulations and our work on gauge equivariant networks. For the latter, I will discuss our work on parameter space symmetries and conservation laws, as well as some work in progress on transformers.

12:15–1:30 pm ET

Lunch

1:30–3:00 pm ET

Contributed Talks: Session A - Uncertainty Quantification/Robust AI

Bartos Theater

Jolideco: A Hybrid ML-Statistical Approach for Robust Image Deconvolution in Sparse Poisson Regimes, Axel Donath (Center for Astrophysics | Harvard & Smithsonian) Machine learning for sparse image data reconstruction remains challenging, particularly in Astronomy where ground truth is often unavailable. While simulations and transfer learning offer partial solutions, high-dimensional parameter spaces can render these approaches computationally expensive or infeasible. Moreover, in low-count Poisson domains, quantifying uncertainties is crucial. We present Jolideco, a novel hybrid method for joint likelihood image deconvolution that synergizes machine learning with classical statistical modeling. This approach leverages a hand-crafted forward model for the imaging process, incorporating prior information such as telescope characteristics and noise distributions. Simultaneously, it employs a high-dimensional, patch-based image prior trained via ML on astronomical images from other wavelengths to regularize image structure. Jolideco demonstrates significantly improved reconstruction quality across diverse source scenarios and signal-to-noise regimes. Its closed statistical framework facilitates multi-telescope data integration and robust uncertainty quantification. We showcase Jolideco's effectiveness using example data from the Chandra X-ray Observatory and the Fermi-LAT Gamma-ray Space Telescope, illustrating its potential to advance astronomical image analysis in the Poisson regime.
Towards Quantitatively Trustworthy AI, Nicholas Kersting (Visa, Inc.) Safe and effective application of AI to Science and Industry can only proceed by measuring trustworthiness quantitatively, so that we can track and report progress. Traditional statistical metrics such as Precision, Recall, and AUC are no longer sufficient on their own and are supplemented with measures of reliability such as Explainable AI (XAI) and, most recently, Large Language Model Groundedness and Hallucination. We report especially on progress on the latter in recent research and applications at Visa.
Evidence-based Inverse Problem Solvers for QCD: Demystifying Uncertainty in Inverse Problem Solutions of Parton Distribution Functions, Brandon Kriesten (Argonne National Laboratory) Representing parton distribution functions (PDFs) of hadrons through robust, high-fidelity parameterizations has been a long-standing goal of particle physics phenomenology. Additionally, quantitatively connecting the underlying theory assumptions and chosen fitted datasets to the properties of the PDF’s flavor and x-dependence is a long-standing challenge. We use a variational autoencoder-based inverse mapper to find solutions to the inverse problem of decoding PDFs from experimental measurements / lattice QCD data while simultaneously dissecting patterns of learned correlations between the encoded data and reconstructed PDFs. Finally, using evidence-based techniques, we seek to quantify the uncertainty of these models and separate data (aleatoric) and knowledge (epistemic) uncertainty while identifying out-of-distribution samples. I will show progress towards implementing these evidence-based inverse problem solvers for PDFs in an implementation that mirrors a phenomenological fit.
Simulation Based Inference for FCC-ee, Lingfeng Li (Brown University) We apply machine-learning techniques to the effective-field-theory analysis of the e⁺e⁻→W⁺W⁻ processes at future lepton colliders, and demonstrate their advantages in comparison with conventional methods, such as optimal observables. In particular, we show that machine-learning methods are more robust to detector effects and backgrounds, and could in principle produce unbiased results with sufficient Monte Carlo simulation samples that accurately describe experiments. This is crucial for the analyses at future lepton colliders given the outstanding precision of the e⁺e⁻→W⁺W⁻ measurement (∼O(10⁻⁴) in terms of anomalous triple gauge couplings or even better) that can be reached. Our framework can be generalized to other effective-field-theory analyses, such as the one of e⁺e⁻→tt̄ or similar processes at muon colliders.
Embed and Emulate: Contrastive representations for simulation-based inference, Peter Lu (University of Chicago) Scientific modeling and engineering applications rely heavily on parameter estimation methods to fit physical models and calibrate numerical simulations using real-world data. In the absence of an analytic statistical model, modern simulation-based inference (SBI) approaches first use a numerical simulator to generate a dataset consisting of parameters and corresponding model outputs, such as trajectories from a dynamical system. Then, given real experimental data, the system parameters can be inferred using a variety of SBI methods, some of which use machine learning emulators to accelerate data generation and inference. However, parameter estimation for dynamical systems, such as weather and climate, is still often difficult due to the high-dimensional nature of the data as well as the complexity of the physical models and simulations. We introduce Embed and Emulate (E&E): a new likelihood-free inference method for estimating arbitrary parameter posteriors based on contrastive learning. E&E learns a low-dimensional embedding for the data (i.e. a summary statistic) and a corresponding fast emulator in the embedding space, bypassing the need for running an expensive simulation or a high-dimensional emulator during inference. We validate our theoretical results on a synthetic toy experiment, which illustrates properties of the learned embedding as a contrastive representation, and then benchmark E&E on a realistic multimodal parameter estimation task using the high-dimensional, chaotic Lorenz 96 system.
Going beyond the jet tagging frontier using knowledge distillation, Yuanchen Zhou (Brown University) Classifying jets for proton-proton collisions is a challenging problem, and several Artificial Intelligence / Machine Learning classifiers have been introduced to help handle the task. Different classifiers have tradeoffs in terms of their accuracy, model dependency, processing time, etc. We study these tradeoffs for different model architectures, and explore techniques to improve their overall performance. In particular, we study the technique of Knowledge Distillation, which distills knowledge from a complex model with high accuracy to a simpler model with faster processing time and potentially less model-dependence to see if it is possible to increase the accuracy of the simpler model while maintaining its other advantages.
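As a rough illustration of the knowledge distillation idea mentioned in the last abstract above, the sketch below computes a commonly used distillation objective: a hard-label cross-entropy term plus a temperature-softened KL term between teacher and student outputs. This is an assumed, generic formulation, not necessarily the one used in the talk; the function names and the `temperature` and `alpha` parameters are illustrative.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax of logits divided by a temperature."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(student_logits, teacher_logits, label,
                      temperature=2.0, alpha=0.5):
    """Weighted sum of hard-label cross-entropy and soft-target KL divergence.

    Assumed generic form: alpha * CE(student, label)
    + (1 - alpha) * T^2 * KL(teacher_T || student_T).
    """
    p_teacher = softmax(teacher_logits, temperature)
    p_student = softmax(student_logits, temperature)
    # KL divergence between the softened teacher and student distributions.
    kl = sum(pt * math.log(pt / ps)
             for pt, ps in zip(p_teacher, p_student) if pt > 0.0)
    # Cross-entropy of the unsoftened student prediction with the true label.
    hard_ce = -math.log(softmax(student_logits)[label])
    return alpha * hard_ce + (1.0 - alpha) * temperature ** 2 * kl


# Toy two-class example (e.g. signal jet vs. background jet), made-up logits.
loss = distillation_loss(student_logits=[1.0, 0.2],
                         teacher_logits=[3.0, -1.0],
                         label=0)
```

The soft-target term vanishes when the student reproduces the teacher's softened distribution, so the smaller model is rewarded for matching the teacher's full output rather than only the hard labels.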

Contributed Talks: Session B - Representation/Manifold Learning

MIT Media Lab, Room 633

Multi-modal generalized class discovery for scalable autonomous all-sky surveys, Laura Domine (Center for Astrophysics, Harvard University) The Galileo Project is a systematic scientific research program focused on understanding the origins and nature of Unidentified Aerial Phenomena (UAP). To date there is very little data on UAP whose properties and kinematics purportedly reside outside the performance envelope of known phenomena. We are in the process of designing, building and commissioning a multi-modal, multi-spectral detector to continuously monitor the sky and collect UAP data through a rigorous aerial census of natural and human-made phenomena. This open-world setting is a major challenge for artificial intelligence (AI) techniques which need to both (i) accurately detect and classify objects from known classes and (ii) cluster unknown, out-of-distribution objects. Using a commissioning dataset, which includes several months of videos from an all-sky array of eight long-wave infrared cameras and audible recordings, I will discuss our work developing a multi-modal generalized class discovery method to automatically identify new classes of objects in unlabeled data in addition to known classes. It opens the door to an autonomous aerial census where categorization relies less on our prior expectations.
SPECTER: Efficient Evaluation of the Spectral EMD, Rikab Gambhir (MIT) The Energy Mover’s Distance (EMD) has seen use in collider physics as a metric between events and as a geometric method of defining infrared and collinear safe observables. Recently, the spectral Energy Mover’s Distance (SEMD) has been proposed as a more analytically tractable alternative to the EMD. In this work, we obtain a closed-form expression for the Riemannian-like p = 2 SEMD metric between events, eliminating the need to numerically solve an optimal transport problem. Additionally, we show how the SEMD can be used to define event and jet shape observables by minimizing the metric between event and parameterized energy flows (similar to the EMD), and we obtain closed-form expressions for several of these observables. We also present the SPECTER framework, an efficient and highly parallelized implementation of the SEMD metric and SEMD-derived shape observables. We demonstrate that the SEMD and SPECTER provide nearly thousand-fold compute time improvements over evaluation of the EMD.
Hybrid Physics-AI for efficient bias-aware state estimation, Stiven Briand God Massala Moussounda (NTU Singapore, ENS Paris-Saclay) We consider the problem of optimal recovery of an element $u$ of a Hilbert space $\mathcal{H}$ from noisy measurements $\ell_i(u)$. Specifically, $u$ is the solution of a biased parametric partial differential equation $\mathcal{P}(u, \mu)$, and the measurements $\ell_i(u)$ are linear functionals on $\mathcal{H}$. We propose a bias-aware Hybrid-AI approach to the optimal recovery problem that combines the Parameterized Background Data-Weak (PBDW) formulation with the deep neural operator (DeepONet). PBDW combines the model $\mathcal{P}$ and the measurements in a weak form and estimates the state and the model's bias as a combination of anticipated (knowledge) and unanticipated (ignorance) uncertainty. The anticipated uncertainty belongs to a background space $\mathcal{Z}_N$ built from a reduced model of a best-knowledge manifold $\mathcal{M}^{\mathrm{bk}} = \{u(\mu) : \mu \in \mathcal{D}\}$, while the unanticipated uncertainty, modeled by a DeepONet, belongs to $\mathcal{Z}_N^{\perp}$. Because the DeepONet is integrated into the PBDW state estimate, it lies inside the kernel of the anticipated physics and thus strictly accommodates the deficient physics by locally learning the model bias. The local information comes from an optimal sensor selection strategy. To showcase its potential for solving complex physical systems, we apply this method to a 2D Helmholtz equation defined on the physical domain $\Omega$, with model bias arising from the source, the boundary conditions, or both.
Parameter Symmetry and Formation of Latent Representations, Liu Ziyin (MIT, NTT Research) Symmetries exist abundantly in the loss function of neural networks. We characterize the learning dynamics of stochastic gradient descent (SGD) when exponential symmetries, a broad subclass of continuous symmetries, exist in the loss function. We establish that when gradient noises do not balance, SGD has the tendency to move the model parameters toward a point where noises from different directions are balanced. Here, a special type of fixed point in the constant directions of the loss function emerges as a candidate for solutions for SGD. As the main theoretical result, we prove that every parameter connects without loss function barrier to a unique noise-balanced fixed point. Lastly, we discuss how the theory can be leveraged to understand common phenomena in deep learning, such as progressive sharpening and flattening and the formation of latent representations.

3:00–3:30 pm ET

Break

3:30–4:15 pm ET

Applications of Neural Networks to Mitigate Unique Challenges in Neutrino Experiments

Jessie Micallef, IAIFI Fellow

Abstract Details to come

4:15–5:00 pm ET

Equivariant Convolutional Networks & Group Steerable Kernels

Maurice Weiler, MIT

Abstract Equivariance imposes symmetry constraints on the connectivity of neural networks. This talk investigates the case of equivariant networks for feature vector fields or point clouds, which generally requires 1) spatial (convolutional) weight sharing, and 2) G-steerability constraints on the shared weights themselves. It gives an intuition for steerable convolution kernels, discusses how they can be implemented directly via harmonic bases or implicitly via equivariant MLPs, and clarifies the relation to typical message passing operations in equivariant MPNNs. A gauge theoretic formulation of equivariant CNNs and MPNNs shows that these models are not only equivariant under global transformations, but under more general local gauge transformations as well.

5:00–5:30 pm ET

Break

5:30–7:30 pm ET

Workshop Dinner, MIT Schwarzman College of Computing (51 Vassar St, Cambridge), 8th Floor

Friday, August 16, 2024

9:30–10:15 am ET

Neural Networks and Conformal Field Theory

Jim Halverson, Northeastern/IAIFI

Abstract I'll present an essential result in ML theory, explain how it motivates a new approach to field theory, and present some key findings. Next, I'll discuss new work, explaining a result of Dirac on the relationship between Lorentz invariance and conformal invariance, and how this can be applied in neural networks for constructing new conformal field theories.

10:15–11:00 am ET

How good is your model — Goodness-of-fit by Neyman-Pearson testing

Gaia Grosso, IAIFI Fellow

Abstract The Neyman-Pearson strategy for hypothesis testing can be employed for goodness-of-fit if the alternative hypothesis is selected from data by exploring a rich parametrised family of models. The New Physics Learning Machine (NPLM) methodology has been developed as a concrete implementation of this idea, to target the detection of new physical effects in multidimensional and unbinned collider data. The applications of the Neyman-Pearson test as a goodness-of-fit method extend beyond new physics discovery, to problems of data quality monitoring and, crucially, generative model validation. In this talk I will discuss the main challenges behind the practical use of the Neyman-Pearson strategy in real setups, such as model selection, uncertainty quantification and scalability, and I will present recent solutions and future prospects to tackle them.

11:00–11:30 am ET

Break

11:30 am–12:15 pm ET

Generative AI and the natural sciences: Governance strategies and historical perspectives

David Kaiser, MIT

Abstract Generative AI techniques offer many exciting opportunities for researchers across the natural sciences and beyond. Like any new technologies, however, these tools can also lead to unanticipated problems. Therefore it is imperative to identify — and work to avoid or ameliorate — potential harms. Doing so requires coordination among the research community as well as with individuals and groups who are not themselves scientists. Recent history provides several examples of how once-new technologies have been managed by wide-ranging constituencies to advance the greater good. This talk will conclude by describing guidance for protecting scientific integrity in an age of generative AI, which was recently developed by a working group of the US National Academy of Sciences.

12:15–1:30 pm ET

Lunch

1:30–2:15 pm ET

Compiling Learning onto Physical Systems

Dirk Englund, MIT

Abstract The hardware limitations of conventional electronics in deep learning applications have spurred exploration into physical architectures fundamentally different from today’s computers. This talk covers the scalability and performance metrics—such as throughput, energy consumption, and latency—of emerging optical and opto-electronic architectures, with a focus on recently developed hardware error correction techniques, in-situ training methods and initial field trials, as well as methods leveraging quantum information science to perform learning and inference in ways not currently possible.

2:15–3:00 pm ET

ML-based modeling and control to enable new capabilities in beam customization and control at particle accelerator scientific user facilities

Auralee Edelen, SLAC

Abstract Particle accelerators are incredibly complicated machines that are used for numerous applications in science, industry, and medicine. At scientific user facilities driven by particle accelerators, it is often the case that custom particle beams must be generated on demand. Simultaneously, increasingly tight tolerances and difficult-to-achieve beam characteristics are needed to meet the needs of future applications of accelerators and unlock new experimental capabilities. This is a highly complicated, nonlinear control problem that involves precise shaping of the beam in 6D position-momentum phase space. In this talk I will discuss how ML-based modeling and control is beginning to transform how beam control is conducted at accelerator facilities that require highly flexible beam customization. This includes the development of digital twins for accelerator systems, improving accelerator system models using differentiable simulations and other hybrid ML and physics approaches, physics-informed Bayesian optimization, reinforcement learning, and ML-enhanced beam diagnostics. The talk will focus on examples from LCLS, LCLS-II, FACET-II, and MeV-UED at SLAC, and the APS and AWA at Argonne National Lab, all major scientific user facilities.

3:00–3:30 pm ET

Closing

Speakers

Pulkit Agrawal
Assistant Professor, EECS, MIT
Earl Bellinger
Assistant Professor, Department of Astronomy, Yale University
Tianji Cai
Research Associate, SLAC National Accelerator Laboratory
Carolina Cuesta-Lazaro
IAIFI Fellow, IAIFI
Nima Dehmamy
Research Staff Member, IBM Research
Auralee Edelen
Associate Scientist, SLAC National Accelerator Laboratory
Dirk Englund
Associate Professor, MIT
Gaia Grosso
IAIFI Fellow
Jim Halverson
Associate Professor, Physics, Northeastern
Yonatan Kahn
Assistant Professor, Theoretical Physicist, UIUC
Verena Kain
Scientist, CERN
David Kaiser
Professor, History of Science/Physics, MIT
Patrick Kidger
Mathematician and Machine Learning Researcher, Cradle.bio
J. Nathan Kutz
Professor, University of Washington
Laurence Levasseur
Assistant Professor, University of Montreal
Ziming Liu
Grad Student, MIT
Alessandro Lovato
Physicist, Argonne National Laboratory
Lu Lu
Assistant Professor, Yale University
Jessie Micallef
IAIFI Fellow, IAIFI
Ayan Paul
Research Scientist, The Institute for Experiential AI - Northeastern University
Mariel Pettee
Chamberlain Postdoctoral Research Fellow, Lawrence Berkeley National Lab
Tilman Plehn
Professor, ITP - Heidelberg University
Matt Schwartz
Professor, Harvard
Melanie Weber
Assistant Professor of Applied Mathematics and of Computer Science, Harvard
Maurice Weiler
Deep Learning Researcher, University of Amsterdam
Christoph Weniger
Associate Professor, University of Amsterdam
Rose Yu
Assistant Professor, UC San Diego Department of Computer Science and Engineering


Accommodations

We have secured discounted rates at the following hotels:

  • Royal Sonesta Boston, 40 Edwin H Land Blvd, Cambridge, MA 02142.

    $224 nightly rate (1-2 people per room)

    Deadline to book: July 29

    Book now

Workshop attendees are also welcome to book dorms for a discounted rate at Boston University:

  • 10 Buick Street, Boston

    $97.50 nightly rate (1 person per room, shared bathroom with 1 other person)

    Book now

FAQ

  • Who can attend the Summer Workshop? Any researcher working at or interested in the intersection of physics and AI is encouraged to attend the Summer Workshop.
  • What is the cost to attend the Summer Workshop? The registration fee for the Summer Workshop is 200 USD and includes a welcome dinner, as well as coffee breaks and snacks.
  • If I come to the Summer School, can I also attend the Workshop? Yes! We encourage you to stay for the Workshop and you can stay in the dorms for both events if you choose (at your expense).
  • Will the recordings of the talks be available? We plan to share the talks on our YouTube channel.

Submit a question or comment

2024 Organizing Committee

  • Fabian Ruehle, Chair (Northeastern University)
  • Demba Ba (Harvard)
  • Alex Gagliano (IAIFI Fellow)
  • Di Luo (IAIFI Fellow)
  • Polina Abratenko (Tufts)
  • Owen Dugan (MIT)
  • Sneh Pandya (Northeastern)
  • Yidi Qi (Northeastern)
  • Manos Theodosis (Harvard)
  • Sokratis Trifinopoulos (MIT)