Stay tuned for updates about our 2025 IAIFI Summer Workshop. Join our mailing list to receive updates.
The IAIFI Summer Workshop brings together researchers from across Physics and AI for plenary talks, poster sessions, and networking to promote research at the intersection of Physics and AI.
Many of the videos from the 2024 IAIFI Summer Workshop are now posted on the IAIFI YouTube channel.
Many of the speakers’ slides from the 2024 IAIFI Summer Workshop are now available online.
 The 2024 Summer Workshop was held August 12–16, 2024
 Location: Bartos Theater, MIT List Visual Arts Center, Lower Level (20 Ames Street, Cambridge)
 Registration deadline: July 31, 2024
Here’s what attendees at previous IAIFI Summer Workshops had to say about the experience:
Videos of the plenary talks from the 2023 IAIFI Summer Workshop are now available on YouTube.
About
The Institute for Artificial Intelligence and Fundamental Interactions (IAIFI) is enabling physics discoveries and advancing foundational AI through the development of novel AI approaches that incorporate first principles, best practices, and domain knowledge from fundamental physics. The goal of the Workshop is to serve as a meeting place to facilitate advances and connections across this growing interdisciplinary field.
View recommendations for meals and activities around MIT
Agenda
Monday, August 12, 2024
9:15–9:30 am ET
Welcome
9:30–10:15 am ET
10,000 Einsteins: AI and the future of theoretical physics
Matt Schwartz, Harvard/IAIFI
Abstract
AI has already proved revolutionary in many areas of physics, particularly those focused on data analysis. However, machines are also advancing rapidly in symbolic tasks. As much of what is done in theoretical physics is symbolic, there is tremendous potential for machines to transition from data analysis to formal theoretical work. This talk will discuss some initial progress in this direction and a vision for how machines and humans might collaborate in the future to solve some of the most challenging problems in fundamental physics.
10:15–11:00 am ET
Dynamic Models from Data
Nathan Kutz, University of Washington
Abstract
Physics-based models and governing equations dominate science and engineering practice. The advent of scientific computing has transformed every discipline as complex, high-dimensional and nonlinear systems could be easily simulated using numerical integration schemes whose accuracy and stability could be controlled. With the advent of machine learning, a new paradigm has emerged in computing whereby we can build models directly from data. In this work, integration strategies for leveraging the advantages of both traditional scientific computing and emerging machine learning techniques are discussed. Using domain knowledge and physics-informed principles, new paradigms are available to aid in engineering understanding, design and control.
11:00–11:30 am ET
Break
11:30 am–12:15 pm ET
Accurate, efficient, and reliable learning of deep neural operators for multiphysics and multiscale problems
Lu Lu, Yale University
Abstract
It is widely known that neural networks (NNs) are universal approximators of functions. However, a less known but powerful result is that a NN can accurately approximate any nonlinear operator. This universal approximation theorem of operators is suggestive of the potential of deep neural networks (DNNs) in learning operators of complex systems. In this talk, I will present the deep operator network (DeepONet) to learn various operators that represent deterministic and stochastic differential equations. I will also present several extensions of DeepONet, such as DeepM&Mnet for multiphysics problems, DeepONet with proper orthogonal decomposition or Fourier decoder layers, MIONet for multiple-input operators, and multifidelity DeepONet. I will demonstrate the effectiveness of DeepONet and its extensions to diverse multiphysics and multiscale problems, such as bubble growth dynamics, high-speed boundary layers, electroconvection, hypersonics, geological carbon sequestration, and full waveform inversion. Deep learning models are usually limited to interpolation scenarios, and I will quantify the extrapolation complexity and develop a complete workflow to address the challenge of extrapolation for deep neural operators.
12:15–1:30 pm ET
Lunch
1:30–3:00 pm ET
Contributed Talks Session A - Representation/Manifold Learning
Bartos Theater
Symmetries and neural tangent kernels: using physical principles to understand deep learning, Jan Gerken (Chalmers University of Technology)
Despite its extraordinary success in applications, a thorough theoretical understanding of deep learning is still lacking, making progress depend largely on costly trial-and-error procedures. At the same time, theoretical physics has a long history of developing deep mathematical understanding of complex systems. In this talk, I will present some recent work on how techniques from theoretical physics can be used to deepen our understanding of deep learning and lead to practically relevant insights. In particular, symmetries, which are an established cornerstone of theoretical physics, have reached widespread popularity as a guiding principle in deep learning as well. In machine learning, symmetries feature most importantly in the form of data augmentation and equivariant neural networks. At the same time, neural tangent kernels, which are closely related to statistical field theory, have emerged as a powerful tool to understand neural networks both at initialization and during training. Combining these paradigms leads to practically relevant statements in deep learning. Furthermore, it opens the door towards further deepening the connection between theoretical physics and our understanding of neural networks.
Approximately-symmetric neural networks for quantum spin liquids, Dominik Kufel (Harvard University)
We propose and analyze a family of approximately-symmetric neural networks for quantum spin liquid problems. These tailored architectures are parameter-efficient, scalable, and significantly outperform existing symmetry-unaware neural network architectures. Utilizing the mixed-field toric code model, we demonstrate that our approach is competitive with the state-of-the-art tensor network and quantum Monte Carlo methods. Moreover, at the largest system sizes (N=480), our method allows us to explore Hamiltonians with sign problems beyond the reach of both quantum Monte Carlo and finite-size matrix-product states. The network comprises an exactly symmetric block following a non-symmetric block, which we argue learns a transformation of the ground state analogous to quasiadiabatic continuation. Our work paves the way toward investigating quantum spin liquid problems within interpretable neural network architectures.
Title to come, Simonetta Liuti (The University of Virginia)
Abstract to come
A Neural Net Model for Distillation with Weights Explained, Berfin Simsek (NYU/Flatiron Institute)
It is important to understand how large models represent knowledge to make them efficient and safe. We study a toy model of neural nets that exhibits nonlinear dynamics and phase transition. Although the model is complex, it allows finding a family of the so-called "copy-average" critical points of the loss. The gradient flow initialized with random weights consistently converges to one such critical point for networks up to a certain width, which we proved to be optimal among all copy-average points. Moreover, we can explain every neuron of a trained neural network of any width. As the width grows, the network changes the compression strategy and exhibits a phase transition. We close by listing open questions calling for further mathematical analysis and extensions of the model considered here.
Physics-Motivated Optimization
Beyond Closure Models: Estimating Long-term Statistics of Chaotic Systems via Physics-Informed Neural Operators, Chuwei Wang (Caltech)
Accurately predicting the long-term behavior of chaotic systems is important in many applications. This requires iterative computations on a dense spatiotemporal grid to account for the unstable nature of chaotic systems, which is expensive and impractical in many real-world scenarios. The alternative approach to such a fully-resolved simulation is using a coarse grid and then correcting its errors through a 'closure model', which approximates the overall information from fine scales not captured in the coarse-grid simulation. Recently, ML approaches have been used for closure modeling, but they typically require a large number of training samples from expensive fully-resolved simulations (FRS). In this work, through the lens of Liouville flow in function spaces, we prove an even more fundamental limitation, viz., the standard approach to learning closure models suffers from a large approximation error for generic problems, no matter how large the model is, and it stems from the non-uniqueness of the mapping. We propose an alternative end-to-end learning approach using a physics-informed neural operator (PINO) that overcomes this limitation by not using a closure model or a coarse-grid solver. We first train the PINO model on data from a coarse-grid solver and then fine-tune it with (a small amount of) FRS and physics-based losses on a fine grid. The discretization-free nature of neural operators means that they do not suffer from the restriction of a coarse grid that closure models face, and they can provably approximate the long-term statistics of chaotic systems. In our experiments on fluid dynamics, our PINO model achieves a 120x speedup compared to FRS with a relative error ~5%. In contrast, the closure model coupled with a coarse-grid solver is 58x slower than PINO while having a much higher error of 205% when the closure model is trained on the same FRS dataset.
Determining Heterogeneous Elastic Properties of Soft Materials using Physics-Informed Neural Networks, Wensi Wu (Children's Hospital of Philadelphia)
The heterogeneous mechanical properties found in biological materials have profound implications for both engineering and medical applications. Within the engineering community, these properties are frequently studied to guide the design of mechanical devices such as artificial organs and soft robots. Concurrently, in the medical field, the mechanical properties of tissues play a crucial role in providing diagnostic information about various diseases and conditions. The significance of material mechanical properties across these diverse domains has driven a need to better understand the underlying mechanisms governing the microscopic properties of biological tissues and their associated functions, whether for improving material designs or disease diagnosis. In traditional engineering, identifying unknown material parameters requires iterative inverse finite element analyses and optimization of the constitutive parameters until the finite element model achieves an acceptable level of mechanical response, aligning with experimental data. While this method is efficient with homogeneous materials, optimizing the elasticity map of heterogeneous materials is challenging. In this work, we propose using physics-informed neural networks (PINNs) to identify the full-field elastic properties of highly nonlinear, hyperelastic materials. We applied our improved PINNs to six structurally complex materials and three constitutive material models (Neo-Hookean, Mooney-Rivlin, and Gent) to evaluate the accuracy of full-field elasticity maps estimated by PINNs. Our PINN model consistently produced highly accurate estimates of the full-field elastic properties, even when there was up to 10% noise present in the training data.
Contributed Talks Session B - Generative Models
MIT Media Lab, Room 633
Machine learning phase transitions: A probabilistic perspective, Julian Arnold (University of Basel)
The identification of phase transitions and the classification of different phases of matter from data are among the most popular applications of machine learning in physics. Neural network (NN)-based approaches have proven to be particularly powerful due to the ability of NNs to learn arbitrary functions. Many such approaches work by computing indicators of phase transitions from the output of NNs trained to solve specific classification problems. In this talk, I will derive the optimal solutions to these classification problems given by Bayes classifiers that take into account the probability distributions underlying the physical system under consideration [1]. This probabilistic viewpoint allows us to gain a deeper understanding of previous NN-based studies, highlighting the strengths and weaknesses of individual methods [1], enables us to root the methods in information theory [2], yields more efficient numerical routines based on the incorporation of readily available generative models [3], and widens the application domain of these methods to systems outside physics (such as diffusion models or transformers) [4,5]. [1] J. Arnold and F. Schäfer, PRX 12, 031044 (2022) [2] J. Arnold et al., arXiv:2311.10710 (2023) [3] J. Arnold et al., PRL 132, 207301 (2024) [4] J. Arnold et al., arXiv:2311.09128 (2023) [5] J. Arnold et al., arXiv:2405.17088 (2024)
Accelerating Molecular Discovery with Machine Learning, Yuanqi Du (Cornell University)
Recent advancements in machine learning have paved the way for groundbreaking opportunities in the realm of molecular discovery. At the forefront of this evolution are improved computational tools with proper inductive biases and efficient optimization. In this talk, I will delve into our efforts around these themes from a geometry, sampling and optimization perspective. I will first introduce how to encode symmetries in the design of neural networks and the balance of expressiveness and computational efficiency. Next, I will discuss how generative models enable a wide range of design and optimization tasks in molecular discovery. In the third part, I will talk about how the advancements in stochastic optimal control, sampling and optimal transport can be applied to find transition states in chemical reactions.
Understanding Diffusion Models by Feynman's Path Integral, Yuji Hirono (Osaka University)
Score-based diffusion models have proven effective in image generation and have gained widespread usage. We introduce a novel formulation of diffusion models using Feynman's path integral [1]. We find that this formulation provides comprehensive descriptions of score-based generative models, and demonstrate the derivation of backward stochastic differential equations and loss functions. The formulation accommodates an interpolating parameter connecting stochastic and deterministic sampling schemes, and we identify this parameter as a counterpart of Planck's constant in quantum physics. This analogy enables us to apply the Wentzel-Kramers-Brillouin (WKB) expansion, a well-established technique in quantum physics, for evaluating the negative log-likelihood to assess the performance disparity between stochastic and deterministic sampling schemes. Reference: [1] Yuji Hirono, Akinori Tanaka, Kenji Fukushima, accepted in ICML2024 [arXiv:2403.11262].
Neural Entropy, Akhil Premkumar (University of Chicago)
What is the smallest neural network that can do a particular task? To answer this question we need to understand the capacity of neural networks to encode and store information. In the context of generative diffusion models, we show that it is possible to identify the entropy of the network, which characterizes precisely its storage capacity.
Predicting Missing Regions in Charged Particle Tracks Using a Sparse 3D Convolutional Neural Network, Hilary Utaegbulam (University of Rochester)
The 2x2 Demonstrator is a prototype of ND-LAr, the liquid argon time-projection chamber of the Deep Underground Neutrino Experiment’s Near Detector complex. Both the 2x2 Demonstrator and ND-LAr are modular detectors that will have pixelated charge readouts and inactive regions wherein there is no sensitivity to charge deposition and light signals that arise from charged particle interactions with liquid argon. In the 2x2, these inactive regions are located in between the active detector modules, which introduces the challenge of inferring what charge signals ought to look like in these regions. This study explores the use of a Sparse 3D Convolutional Neural Network (ConvNet) to infer missing regions in charged particle tracks. Hits corresponding to energy depositions are voxelized into a three-dimensional grid for each track. Voxels that fall into predefined inactive regions are removed to simulate the lack of detector output. The model is trained to infer the topology of the missing track voxels, with the ultimate goal of inferring the missing charge or energy values in these voxels as well. Results indicate that this approach shows promise in prediction of missing track regions with some accuracy.
3:00–3:30 pm ET
Break
3:30–4:15 pm ET
What Do Language Models Have To Say About Fundamental Physics?
Mariel Pettee, LBNL/Flatiron
Abstract
The launch of ChatGPT in November 2022 ignited an ongoing worldwide conversation about the possible impacts of Large Language Models (LLMs) on the way we work. As scientists, however, the changes in our workflows since the advent of this technology have been relatively minor. Will this still be the case in 10 years? Could an analogous paradigm shift arise from a foundation model trained on a large amount of scientific data, transforming the way we conduct our research? If so, what can we learn from the development of other foundation models, particularly LLMs, in their evolution from specialists to (quasi-)generalists? In this talk, I will present some recent work exploring how language models could help form a foundation model of fundamental physics. I'll also share my perspective on how we should strive to shape such models to reflect our highest priorities as scientists.
4:15–5:00 pm ET
Solving the nuclear many-body problem with neural quantum state
Alessandro Lovato, Argonne National Laboratory
Abstract
Artificial neural networks can be employed to accurately and compactly represent quantum many-body states relevant to many applications, including nuclear physics, quantum chemistry, and condensed matter problems. I will argue that a variational Monte Carlo algorithm based on neural-network quantum states provides a systematically improvable solution to the nuclear Schrödinger equation with a polynomial cost in the number of nucleons. After presenting recent progress in describing atomic nuclei, neutron-star matter, and hypernuclei, I will illustrate an application to condensed-matter systems, specifically ultracold Fermi gases near the unitary limit. Detailed benchmarks with continuum Quantum Monte Carlo methods will be presented.
5:00–7:00 pm ET
Poster Session
MIT Media Lab, 6th Floor
 Data Compression and Inference in Cosmology with Self-Supervised Machine Learning, Aizhan Akhmetzhanova (Harvard University)
 CNN and Transformer architecture for jets events classification, Juvenal Bassa (University of Puerto Rico - Mayaguez)
 Data-Driven Discovery of X-ray Transients with Machine Learning, Steven Dillmann (University of Cambridge)
 Sampling Transition Dynamics with Machine Learning Approaches, Yuanqi Du (Cornell University)
 Multi-Modal Generalized Class Discovery for Scalable Autonomous All-Sky Surveys, Sriram Elango (Harvard University)
 Inverse Design of Complex Fluids with Fully-Differentiable Lagrangian Particle Dynamics, Kaylie Hausknecht (Harvard University and MIT)
 Perfect Jet Classification Through Equivariant Regression, Timothy Hoffman (University of Chicago)
 Flow-Based Generative Emulation of Grids of Stellar Evolutionary Models, Marc Hon (MIT Kavli Institute for Astrophysics and Space Research)
 Enhancing Cosmological Simulations with Efficient and Interpretable Machine Learning in the Wavelet Basis, Cooper Jacobus (UC Berkeley: Dept. Astrophysics, Lawrence Berkeley National Lab: Computational Cosmology Center)
 Training neural operators to preserve invariant measures of chaotic attractors, Ruoxi Jiang (University of Chicago)
 Hidden Giants: Redefining QSO Classification and Outlier Detection with Redshift Invariant Autoencoders, Thaddaeus Kiker (Columbia University)
 KAN: Kolmogorov-Arnold Networks, Ziming Liu (MIT, IAIFI)
 Phase Transitions in the Output Distribution of Large Language Models, Niels Loerch (University of Basel)
 Tackling reasoning problems with AI, Rishabh Mallik (Forschungszentrum Jülich)
 Recurrent Features of Amplitudes in Planar N = 4 Super Yang-Mills Theory, Garrett Merz (University of Wisconsin-Madison)
 Ultrafast Jet Classification using Geometric Learning, Patrick Odagiu (ETH Zurich)
 Deep Stochastic Mechanics, Elena Orlova (The University of Chicago)
 Differentiable and Distributional Cosmological Stasis, Sneh Pandya (Northeastern / IAIFI)
 Exploring Astronomical Catalog Crossmatching with Machine Learning, Victor Samuel Perez Diaz (Center for Astrophysics | Harvard & Smithsonian, IAIFI)
 Towards an AI-enabled astronomy system: natural language processing of Chandra data archive, Shivam Raval (Harvard University)
 Autodecoding Poisson Processes for Unsupervised X-ray Sources Learning, Yanke Song (Harvard University, Department of Statistics)
 Development of photothermal techniques for the detection of cancer biomarkers, Ilhem Soyah (Higher school of sciences and technology of Hammam Sousse)
 Multi-Modal Contrastive Training for Robust VQA, Mitra Tajrobehkar (Vertical Oceans)
 Zero-Shot Classification of Astronomical Images with Large Multimodal Models, Dimitrios Tanoglidis (University of Pennsylvania)
 Vertex finding and jet class classification using Wasserstein Neural Network, Diego F. Vasquez Plaza (University of Puerto Rico Mayagüez)
 Learning Group Invariant CY Metrics by Fundamental Domain Projections, Moritz Walden (Uppsala University)
 Accelerating Energy Computation in Many-electron Systems with Forward Laplacian, Chuwei Wang (Caltech)
 Emulating the Effects of Pile-Up on X-ray Spectra, Justina Yang (Harvard University)
 A Variational Continuation Method for Periodic Orbits Using Autograd and Hessian Eigendecompositions, Leo Yao (MIT)
 HyperTagging: Reconstruction of Full Decays using Transformers and Hyperbolic Embedding, Boyang Yu (LMU Munich, Germany)
 Neural scaling laws from large-N field theory, Zhengkang Zhang (University of Utah)
 Revealing the 3D Cosmic Web with Physics Constrained Neural Fields, Brandon Zhao (Caltech)
6:00–8:00 pm ET
Welcome Reception
MIT Media Lab, 6th Floor
Tuesday, August 13, 2024
9:30–10:15 am ET
Trends in AI for particle accelerators
Verena Kain, CERN
Abstract
AI is without doubt radically transforming science with many successful applications in molecular biology, astrophysics, nuclear physics and particle physics. It has enabled significant technological advances for robotics that can particularly enhance a system’s perception, navigational and manipulation abilities and interaction. For control, it enables novel and faster learning/teaching of tasks, replacing or augmenting classical control techniques for hard problems such as real-time control of the nonlinear dynamics of the plasma in a tokamak of a fusion reactor, or navigating drones with superhuman performance. Given the success and types of use cases that can be solved with AI algorithms, accelerator physics and associated technologies have also picked up on AI in the last 5 to 10 years, with the number of ML applications steadily rising and, subsequently, the number of ML-related papers at the big particle accelerator conferences. This contribution will give a brief overview of the typical use cases for AI for particle accelerators, show recent trends and describe the potential and vision of AI for particle accelerators with the emphasis on control and optimisation of particle accelerators.
10:15–11:00 am ET
An introduction to neural ODEs in scientific machine learning
Patrick Kidger, Cradle.bio
Abstract
This is an introduction to neural ODEs for scientific applications. The goal is to (a) provide a modelling tool that enhances the expressivity of existing theory-driven approaches, (b) demonstrate that neural ODEs are easy to use via modern autodifferentiable software, and (c) give enough of the tips-and-tricks needed to make neural ODEs work in practice!
11:00–11:30 am ET
Break
11:30 am–12:15 pm ET
Automatic Symmetry Discovery from Data
Rose Yu, UCSD
Abstract
Despite the success of equivariant neural networks in scientific applications, they require knowing the symmetry group a priori. However, it may be difficult to know which symmetry to use as an inductive bias in practice. Enforcing the wrong symmetry could even hurt the performance. In this talk, I will discuss our effort in developing a deep learning framework that can automatically discover symmetry from data. Our framework, LieGAN, represents symmetry as an interpretable Lie algebra basis and uses a paradigm akin to generative adversarial training. We further generalized it to LaLieGAN to discover nonlinear symmetries from high-dimensional data. Empirically, the learned symmetry can also be readily used in existing equivariant neural networks to improve accuracy and generalization in prediction. It can also improve equation discovery and long-term forecasting for various dynamical systems.
12:15–1:30 pm ET
Lunch
1:30–3:00 pm ET
Contributed Talks Session A - Foundational ML
Bartos Theater
Diversity with Similarity as a Measure of Dataset Quality, Josiah Couch (Beth Israel Deaconess Medical Center)
Dataset size and class balance are important measures in deep learning. Maximizing them is seen as a way to ensure that datasets contain diverse images, which models are thought to need in order to generalize well. Yet neither size nor class balance measures image diversity directly, raising the possibility that better measures of dataset quality might exist. To test this hypothesis, we turned to a comprehensive framework of diversity measures that generalizes familiar quantities like Shannon entropy by accounting for the similarities and differences among images. (Size and class balance emerge from this framework as special cases.) We created several thousand diverse datasets by subsampling a variety of large medical-image datasets representing a range of imaging modalities, trained classifiers on these subsets, and calculated the correlation between subset diversity and model accuracy using diversity measures from the framework.
RG flow of the NTK dynamics at finite-width from Feynman diagrams, Max Guillen (Chalmers University of Technology)
Deep Learning is nowadays a well-established method for different applications in science and technology. However, it has been unclear for a long time how the "learning process" actually occurs in different architectures, and how this knowledge could be used to optimize performance and efficiency. Recently, high-energy-physics-based ideas have been applied to the modelling of Deep Learning, thus translating the learning problem to an RG flow analysis in Quantum Field Theory (QFT). In this talk, I will explain how the quite complicated formulae describing such RG flows for different observables in neural networks at initialization can be easily obtained from a few rules resembling Feynman rules in QFT. I will also comment on some work in progress which implements such rules for computing higher-order corrections to the frozen (infinite-width) NTK for particular activation functions, and how they evolve after a few steps of SGD.
Supervised learning of infinitely-overparameterized DNNs through the lens of Wilsonian RG, Anindita Maiti (Perimeter Institute)
The key to the performance of ML algorithms is an ability to segregate relevant features in input datasets from the irrelevant ones. In a setup where data features play the role of an energy scale, we develop a Wilsonian RG framework to integrate out unlearnable modes associated with the Neural Network Gaussian Process (NNGP) kernel, in the regression context. Such a framework in the case of Gaussian features leads to a universal flow of the ridge parameter, whereas non-Gaussianities in data features result in rich input-dependent RG flows. This framework goes beyond the usual analogies between RG flows and learning dynamics, and offers potential improvements to our understanding of feature learning and universality classes of models.
Input Space Mode Connectivity in Deep Neural Networks, Jakub Vrabel (CEITEC, Brno University of Technology)
We extend the concept of loss landscape mode connectivity to the input space of deep neural networks. Mode connectivity was originally studied within parameter space, where it describes the existence of low-loss paths between different solutions (loss minimizers) obtained through gradient descent. We present theoretical and empirical evidence of its presence in the input space of deep networks, thereby highlighting the broader nature of the phenomenon. We observe that different input images with similar predictions are generally connected, and for trained models, the path tends to be simple, with only a small deviation from being a linear path. Our methodology utilizes real, interpolated, and synthetic inputs created using the input optimization technique for feature visualization. To prove the existence of general mode connectivity in high-dimensional input spaces, we employ percolation theory. We argue that the approximate linear mode connectivity post-training is a manifestation of some implicit bias. We exploit mode connectivity to obtain new insights about adversarial examples and demonstrate its potential for adversarial detection. Additionally, we discuss applications for the interpretability of deep networks.
Neural scaling laws from large-N field theory, Zhengkang Zhang (University of Utah)
Many machine learning models based on neural networks exhibit scaling laws: their performance scales as power laws with respect to the sizes of the model and training data set. We use large-N field theory methods to solve a model recently proposed by Maloney, Roberts and Sully which provides a simplified setting to study neural scaling laws. Our solution extends the result in this latter paper to general nonzero values of the ridge parameter, which are essential to regularize the behavior of the model. In addition to obtaining new and more precise scaling laws, we also uncover a duality transformation at the diagrams level which explains the symmetry between model and training data set sizes. The same duality underlies recent efforts to design neural networks to simulate quantum field theories.
Fourier-enhanced deep operator network for geophysics with improved accuracy, efficiency, and generalizability, Min Zhu (Yale University)
Full waveform inversion (FWI) and geologic carbon sequestration (GCS) are two significant topics in geophysics. FWI infers subsurface structure information from seismic waveform data by solving a nonconvex optimization problem. On the other hand, solving multiphase flow in porous media is essential for CO2 migration and pressure fields in the subsurface associated with GCS. However, numerical simulations for both FWI and GCS are computationally challenging and expensive due to the highly nonlinear governing partial differential equations (PDEs). Here, we develop a Fourier-enhanced deep operator network (Fourier-DeepONet) to address this issue. For FWI, compared with existing data-driven FWI methods, Fourier-DeepONet achieves more accurate predictions of subsurface structures across a wide range of source parameters. Additionally, Fourier-DeepONet demonstrates superior robustness when handling data with Gaussian noise or missing traces. For GCS, compared to the state-of-the-art Fourier neural operator (FNO), Fourier-DeepONet offers superior computational efficiency, with 90% fewer unknown parameters, significantly reduced training time (approximately 3.5 times faster), and much lower GPU memory requirements (less than 35%). Furthermore, Fourier-DeepONet maintains good accuracy when predicting out-of-distribution (OOD) data. This excellent generalizability is enabled by its adherence to the physical principle that the solution to a PDE is continuous over time.
Contributed Talks Session B - Physics-Motivated Optimization
MIT Media Lab, Room 633
Search for new physics using Event-based anomaly detection at the ATLAS detector of CERN and development of ADFilter tool, Wasikul Islam (University of Wisconsin-Madison)
Searches for new resonances in two-body invariant mass distributions are performed using an unsupervised anomaly detection technique in events produced in proton-proton collisions at a center-of-mass energy of 13 TeV recorded by the ATLAS detector at the LHC. Studies are conducted in data containing at least one isolated lepton. An autoencoder network is trained with 1% randomly selected collision events and anomalous regions are then defined which contain events with high reconstruction losses from the decoder. Nine invariant mass distributions are inspected which contain pairs of one light jet (or one b-jet) and one lepton, photon, or a second light jet (b-jet). The 95% confidence level upper limits on contributions from generic Gaussian signals are reported for the studied invariant mass distributions. The obtained model-independent limits show strong potential to exclude generic heavy states with complex decays.
Marginalize, Don't Subtract: Spectral Component Separation for Faint Objects in DESI, Ana Sofia Uzsoy (Harvard University)
Component separation is a critical step in disentangling multiple signals and in extracting useful information from spectra. In this talk, I present MADGICS (Marginalized Analytic Dataspace Gaussian Inference for Component Separation), a data-driven Bayesian component separation technique that can separate a spectrum into any number of Gaussian-distributed components. I then discuss the application of this technique for automatically determining redshifts for Lyman Alpha Emitter (LAE) galaxies observed with DESI while marginalizing over sky residuals to separate sky from target emission lines. We create a covariance matrix from visually inspected DESI LAE targets to provide physically motivated priors, and determine redshift by jointly inferring sky, LAE, and residual components for each individual spectrum. This component separation technique will allow us to create a high-quality catalog of LAE spectra and redshifts from DESI data and is also broadly generalizable to other spectral features of interest.
A Variational Continuation Method for Periodic Orbits Using Autograd and Hessian Eigendecompositions, Leo Yao (MIT)
We present a Hessian-based approach to numerically continue periodic orbits. Our method offers precise initializations of oscillations around unstable fixed points, an integrator-free variational continuation method, and efficient detection of orbit family intersections and subharmonic bifurcations. Leveraging autograd for computations, we present full continuations of periodic double pendulum oscillations from fixed points and examples of detected bifurcations along these orbit families.
Revealing the 3D Cosmic Web with Physics Constrained Neural Fields, Brandon Zhao (Caltech)
Weak gravitational lensing is the slight distortion of galaxy shapes by the gravitational effect of the large-scale structure. In our work, we seek to invert the weak lensing signal found in 2D telescope images to obtain a 3D reconstruction of the universe’s dark matter field. While typically this inversion is done in 2D to obtain a projection of the dark matter field, accurate 3D maps of the dark matter distribution are particularly useful as they allow us to detect and localize structures of interest such as galaxy clusters, as well as disambiguate them from intervening matter along the line of sight. This inversion is ill-posed for several reasons. First, images are only observed from a single viewing angle, which must be inverted into a 3D mass distribution. Second, the exact locations and shapes of unlensed galaxies are in general unknown, and can only be estimated with a degree of uncertainty. This introduces a large amount of noise to our measurement of the lensing signal. We propose a novel methodology using a physics-constrained, coordinate-based neural field to model the underlying continuous matter distribution. We take an analysis-by-synthesis approach, optimizing the weights of the neural network through a fully differentiable physical forward model to reproduce the lensing signal present in image measurements. We showcase reconstruction results on simulated measurements of dark matter distributions from a low resolution N-body particle simulation, and compare our approach with earlier 3D inversion methods.
3:00–3:30 pm ET
Break
3:30–4:15 pm ET
KAN: Kolmogorov-Arnold Networks
Ziming Liu, MIT/IAIFI
Abstract
Inspired by the Kolmogorov-Arnold representation theorem, we propose Kolmogorov-Arnold Networks (KANs) as promising alternatives to Multi-Layer Perceptrons (MLPs). While MLPs have fixed activation functions on nodes ("neurons"), KANs have learnable activation functions on edges ("weights"). KANs have no linear weights at all: every weight parameter is replaced by a univariate function parametrized as a spline. We show that this seemingly simple change makes KANs outperform MLPs in terms of accuracy and interpretability. For accuracy, much smaller KANs can achieve comparable or better accuracy than much larger MLPs in data fitting and PDE solving. Theoretically and empirically, KANs possess faster neural scaling laws than MLPs. For interpretability, KANs can be intuitively visualized and can easily interact with human users. Through two examples in mathematics and physics, KANs are shown to be useful collaborators helping scientists (re)discover mathematical and physical laws. In summary, KANs are promising alternatives for MLPs, opening opportunities for further improving today's deep learning models which rely heavily on MLPs.
4:15–5:00 pm ET
A Pathway to Robotic Intelligence
Pulkit Agrawal, MIT/IAIFI
Abstract
Details to come
Wednesday, August 14, 2024
9:30–10:15 am ET
Navigating Complex Models: Neural Networks for High-Dimensional Statistical Inference
Christoph Weniger, University of Amsterdam
Abstract
Details to come
10:15–11:00 am ET
Data-Driven High-Dimensional Inverse Problems: A Journey Through Strong Lensing Data Analysis
Laurence Levasseur, University of Montreal
Abstract
Details to come
11:00–11:30 am ET
Break
11:30 am–12:15 pm ET
Machine Learning and Physics: The Alliance of the Titans
Ayan Paul, Northeastern
Abstract
Leaps in our understanding of Physics have been concomitant with the adoption of new and increasingly powerful mathematical structures that shift our perspective of how we probe the dynamics of the universe and allow us to unravel complex concepts that were hitherto inaccessible to us. In the realm of data-driven science, where physics is firmly planted, machine learning is proving to be a long-awaited and much-needed mathematical structure that has showcased its worth in aiding landmark discoveries, understanding the underlying symmetries of theories that we propose, and connecting signals to kinematics interpretably, to mention a few. In this parable on the charm of machine learning in physics, we will discuss the nuances of some of these achievements and lay out what we can expect from the future.
12:15–1:30 pm ET
Lunch
1:30–2:15 pm ET
Geometric Machine Learning
Melanie Weber, Harvard University
Abstract
A recent surge of interest in exploiting geometric structure in data and models in machine learning has motivated the design of a range of geometric algorithms and architectures. This lecture will give an overview of this emerging research area and its mathematical foundation. We will cover topics at the intersection of Geometry and Machine Learning, including relevant tools from differential geometry and group theory, geometric representation learning, graph machine learning, and geometric deep learning.
2:15–3:00 pm ET
Machine Learning for LHC Theory
Tilman Plehn, Heidelberg
Abstract
Details to come
3:00–3:30 pm ET
Break
3:30–4:15 pm ET
Asteroseismic probes of far-ranging astrophysics with big data and machine learning
Earl Bellinger, Yale University
Abstract
Space telescopes like the NASA Kepler and TESS missions as well as the forthcoming PLATO mission are driving a data revolution in stellar astrophysics. The ultra-precise observations provided by these missions are challenging our best models of how stars evolve, and are in turn granting insights into the formation and evolution of planetary systems and the Galaxy as a whole. They furthermore present novel opportunities to probe far-ranging physics, such as dark matter and theories of gravity beyond general relativity. In this talk, I will give an overview of the data, models, challenges, and opportunities in asteroseismology, and highlight the role that machine learning is playing in advancing our knowledge across astrophysics.
4:15–5:00 pm ET
Big data cosmology meets AI
Carol Cuesta-Lazaro, IAIFI Fellow
Abstract
The upcoming era of cosmological surveys promises an unprecedented wealth of observational data that will transform our understanding of the universe. Surveys such as DESI, Euclid, and the Vera C. Rubin Observatory will provide extremely detailed maps of billions of galaxies out to high redshifts. Analyzing these massive datasets poses exciting challenges that machine learning is uniquely poised to help overcome. In this talk, I will highlight recent examples from my work on probabilistic machine learning for cosmology. First, I will explain how a point cloud diffusion model can be used both as a generative model for 3D maps of galaxy clustering and as a likelihood model for such datasets. Moreover, I will present a generative model developed to reconstruct the initial conditions of the Universe from spectroscopic survey observations. When combined with the wealth of data from upcoming surveys, these machine learning techniques have the potential to provide new insights into fundamental questions about the nature of the universe.
5:30–6:30 pm ET
Panel on Industry–Academia Collaboration

Moderator: Carol Cuesta-Lazaro, IAIFI Fellow

Bill Freeman, Professor of EECS, MIT

Marin Soljacic, Professor of Physics, MIT

Partha Saha, Distinguished Engineer, Data and AI Platform, Visa

Nima Dehmamy, Research Assistant Professor, IBM Research MIT-IBM Lab
Thursday, August 15, 2024
9:30–10:15 am ET
Uncertainty Quantification from Neural Network Correlation Functions
Yonatan Kahn, University of Illinois Urbana-Champaign
Abstract
Details to come
10:15–11:00 am ET
Transformers to transform Scattering Amplitudes Calculation
Tianji Cai, SLAC
Abstract
AI for fundamental physics is now a burgeoning field, with numerous efforts pushing the boundaries of experimental and theoretical physics. In this talk, I will introduce a recent innovative application of Natural Language Processing to state-of-the-art calculations for scattering amplitudes. Specifically, we use Transformers to predict the symbols at high loop orders of the three-gluon form factors in planar N=4 Super Yang-Mills theory. Our results have demonstrated the great promise of Transformers for amplitude calculations, opening the door for an exciting new scientific paradigm where discoveries and human insights are inspired and aided by AI.
11:00–11:30 am ET
Break
11:30 am–12:15 pm ET
Neural ansatze for physics and physics of neural networks
Nima Dehmamy, IBM Research MIT-IBM Lab
Abstract
I will discuss some of our recent works on using ML to solve physics problems and using physics to understand ML. For the former, I will talk about using a "neural ansatz" for physics simulations and our work on gauge equivariant networks. For the latter, I will discuss our work on parameter space symmetries and conservation laws, as well as some work in progress on transformers.
12:15–1:30 pm ET
Lunch
1:30–3:00 pm ET
Contributed Talks: Session A - Uncertainty Quantification/Robust AI
Bartos Theater
Jolideco: A Hybrid ML-Statistical Approach for Robust Image Deconvolution in Sparse Poisson Regimes, Axel Donath (Center for Astrophysics | Harvard & Smithsonian)
Machine learning for sparse image data reconstruction remains challenging, particularly in Astronomy where ground truth is often unavailable. While simulations and transfer learning offer partial solutions, high-dimensional parameter spaces can render these approaches computationally expensive or infeasible. Moreover, in low-count Poisson domains, quantifying uncertainties is crucial. We present Jolideco, a novel hybrid method for joint likelihood image deconvolution that synergizes machine learning with classical statistical modeling. This approach leverages a handcrafted forward model for the imaging process, incorporating prior information such as telescope characteristics and noise distributions. Simultaneously, it employs a high-dimensional, patch-based image prior trained via ML on astronomical images from other wavelengths to regularize image structure. Jolideco demonstrates significantly improved reconstruction quality across diverse source scenarios and signal-to-noise regimes. Its closed statistical framework facilitates multi-telescope data integration and robust uncertainty quantification. We showcase Jolideco's effectiveness using example data from the Chandra X-ray Observatory and the Fermi-LAT Gamma-ray Space Telescope, illustrating its potential to advance astronomical image analysis in the Poisson regime.
Towards Quantitatively Trustworthy AI, Nicholas Kersting (Visa, Inc.)
Safe and effective application of AI to Science and Industry can only proceed through measuring trustworthiness quantitatively such that we may track and report progress. Traditional statistical metrics such as Precision, Recall, AUC, etc., no longer sufficient on their own, are supplemented with measures of reliability such as Explainable AI (XAI), most recently in Large Language Model Groundedness and Hallucination; we report especially on progress in the latter in recent research and applications at Visa.
Evidence-based Inverse Problem Solvers for QCD: Demystifying Uncertainty in Inverse Problem Solutions of Parton Distribution Functions, Brandon Kriesten (Argonne National Laboratory)
Representing parton distribution functions (PDFs) of hadrons through robust, high-fidelity parameterizations has been a longstanding goal of particle physics phenomenology. Additionally, quantitatively connecting the underlying theory assumptions and chosen fitted datasets to the properties of the PDF’s flavor and x-dependence is a longstanding challenge. We use a variational autoencoder-based inverse mapper to find solutions to the inverse problem of decoding PDFs from experimental measurements / lattice QCD data while simultaneously dissecting patterns of learned correlations between the encoded data and reconstructed PDFs. Finally, using evidence-based techniques, we seek to quantify the uncertainty of these models and separate data (aleatoric) and knowledge (epistemic) uncertainty while identifying out-of-distribution samples. I will show progress towards implementing these evidence-based inverse problem solvers for PDFs in an implementation that mirrors a phenomenological fit.
Simulation Based Inference for FCC-ee, Lingfeng Li (Brown University)
We apply machine-learning techniques to the effective-field-theory analysis of the $e^+e^- \to W^+W^-$ processes at future lepton colliders, and demonstrate their advantages in comparison with conventional methods, such as optimal observables. In particular, we show that machine-learning methods are more robust to detector effects and backgrounds, and could in principle produce unbiased results with sufficient Monte Carlo simulation samples that accurately describe experiments. This is crucial for the analyses at future lepton colliders given the outstanding precision of the $e^+e^- \to W^+W^-$ measurement ($\sim \mathcal{O}(10^{-4})$ in terms of anomalous triple gauge couplings or even better) that can be reached. Our framework can be generalized to other effective-field-theory analyses, such as the one of $e^+e^- \to t\bar{t}$ or similar processes at muon colliders.
Embed and Emulate: Contrastive representations for simulation-based inference, Peter Lu (University of Chicago)
Scientific modeling and engineering applications rely heavily on parameter estimation methods to fit physical models and calibrate numerical simulations using real-world data. In the absence of an analytic statistical model, modern simulation-based inference (SBI) approaches first use a numerical simulator to generate a dataset consisting of parameters and corresponding model outputs, such as trajectories from a dynamical system. Then, given real experimental data, the system parameters can be inferred using a variety of SBI methods, some of which use machine learning emulators to accelerate data generation and inference. However, parameter estimation for dynamical systems, such as weather and climate, is still often difficult due to the high-dimensional nature of the data as well as the complexity of the physical models and simulations. We introduce Embed and Emulate (E&E): a new likelihood-free inference method for estimating arbitrary parameter posteriors based on contrastive learning. E&E learns a low-dimensional embedding for the data (i.e. a summary statistic) and a corresponding fast emulator in the embedding space, bypassing the need for running an expensive simulation or a high-dimensional emulator during inference. We validate our theoretical results on a synthetic toy experiment, which illustrates properties of the learned embedding as a contrastive representation, and then benchmark E&E on a realistic multimodal parameter estimation task using the high-dimensional, chaotic Lorenz 96 system.
Going beyond the jet tagging frontier using knowledge distillation, Yuanchen Zhou (Brown University)
Classifying jets for proton-proton collisions is a challenging problem, and several Artificial Intelligence / Machine Learning classifiers have been introduced to help handle the task. Different classifiers have tradeoffs in terms of their accuracy, model dependency, processing time, etc. We study these tradeoffs for different model architectures, and explore techniques to improve their overall performance. In particular, we study the technique of Knowledge Distillation, which distills knowledge from a complex model with high accuracy to a simpler model with faster processing time and potentially less model dependence to see if it is possible to increase the accuracy of the simpler model while maintaining its other advantages.
Contributed Talks: Session B - Representation/Manifold Learning
MIT Media Lab, Room 633
Multimodal generalized class discovery for scalable autonomous all-sky surveys, Laura Domine (Center for Astrophysics, Harvard University)
The Galileo Project is a systematic scientific research program focused on understanding the origins and nature of Unidentified Aerial Phenomena (UAP). To date there is very little data on UAP whose properties and kinematics purportedly reside outside the performance envelope of known phenomena. We are in the process of designing, building and commissioning a multimodal, multispectral detector to continuously monitor the sky and collect UAP data through a rigorous aerial census of natural and human-made phenomena. This open-world setting is a major challenge for artificial intelligence (AI) techniques which need to both (i) accurately detect and classify objects from known classes and (ii) cluster unknown, out-of-distribution objects. Using a commissioning dataset, which includes several months of videos from an all-sky array of eight long-wave infrared cameras and audible recordings, I will discuss our work developing a multimodal generalized class discovery method to automatically identify new classes of objects in unlabeled data in addition to known classes. It opens the door to an autonomous aerial census where categorization relies less on our prior expectations.
SPECTER: Efficient Evaluation of the Spectral EMD, Rikab Gambhir (MIT)
The Energy Mover’s Distance (EMD) has seen use in collider physics as a metric between events and as a geometric method of defining infrared and collinear safe observables. Recently, the spectral Energy Mover’s Distance (SEMD) has been proposed as a more analytically tractable alternative to the EMD. In this work, we obtain a closed-form expression for the Riemannian-like p = 2 SEMD metric between events, eliminating the need to numerically solve an optimal transport problem. Additionally, we show how the SEMD can be used to define event and jet shape observables by minimizing the metric between event and parameterized energy flows (similar to the EMD), and we obtain closed-form expressions for several of these observables. We also present the SPECTER framework, an efficient and highly parallelized implementation of the SEMD metric and SEMD-derived shape observables. We demonstrate that the SEMD and SPECTER provide nearly thousandfold compute time improvements over evaluation of the EMD.
Hybrid Physics-AI for efficient bias-aware state estimation, Stiven Briand God Massala Moussounda (NTU Singapore, ENS Paris-Saclay)
We consider the problem of optimal recovery of an element $u$ of a Hilbert space $\mathcal{H}$ from noisy measurements $\ell_i(u)$. Specifically, $u$ is the solution of a biased parametric partial differential equation $\mathcal{P}(u, \mu)$, and the measurements $\ell_i(u)$ are linear functionals on $\mathcal{H}$. We propose a bias-aware Hybrid-AI approach to solve the optimal recovery problem by combining the Parameterized-Background Data-Weak (PBDW) formulation with the deep neural operator (DeepONet) \cite{lulu}. PBDW combines the model $\mathcal{P}$ and the measurements in a weak form and estimates the state and the model's bias as a combination of anticipated (knowledge) and unanticipated (ignorance) uncertainty. The anticipated uncertainty belongs to a background space $\mathcal{Z}_N$ built from a reduced model of a best-knowledge manifold $\mathcal{M}^{\mathrm{bk}} = \{u(\mu) : \mu \in \mathcal{D}\}$, while the unanticipated uncertainty, modeled by a DeepONet, belongs to $\mathcal{Z}_{N}^{\perp}$. By integrating DeepONet in the PBDW state estimate, DeepONet lies inside the kernel of the anticipated physics and thus strictly accommodates the deficient physics by locally learning the model bias. The local information comes from an optimal sensor selection strategy. To showcase its potential for solving complex physical systems, we apply this method to a 2D Helmholtz equation defined on the physical domain $\Omega$ with various model biases arising from the source, the boundary conditions, or both.
Parameter Symmetry and Formation of Latent Representations, Liu Ziyin (MIT, NTT Research)
Symmetries exist abundantly in the loss function of neural networks. We characterize the learning dynamics of stochastic gradient descent (SGD) when exponential symmetries, a broad subclass of continuous symmetries, exist in the loss function. We establish that when gradient noises do not balance, SGD has the tendency to move the model parameters toward a point where noises from different directions are balanced. Here, a special type of fixed point in the constant directions of the loss function emerges as a candidate for solutions for SGD. As the main theoretical result, we prove that every parameter connects without loss function barrier to a unique noise-balanced fixed point. Lastly, we discuss how the theory can be leveraged to understand common phenomena in deep learning, such as progressive sharpening and flattening and the formation of latent representations.
3:00–3:30 pm ET
Break
3:30–4:15 pm ET
Applications of Neural Networks to Mitigate Unique Challenges in Neutrino Experiments
Jessie Micallef, IAIFI Fellow
Abstract
Details to come
4:15–5:00 pm ET
Equivariant Convolutional Networks & Group Steerable Kernels
Maurice Weiler, MIT
Abstract
Equivariance imposes symmetry constraints on the connectivity of neural networks. This talk investigates the case of equivariant networks for feature vector fields or point clouds, which generally requires 1) spatial (convolutional) weight sharing, and 2) G-steerability constraints on the shared weights themselves. It gives an intuition for steerable convolution kernels, discusses how they can be implemented directly via harmonic bases or implicitly via equivariant MLPs, and clarifies the relation to typical message passing operations in equivariant MPNNs. A gauge-theoretic formulation of equivariant CNNs and MPNNs shows that these models are not only equivariant under global transformations, but under more general local gauge transformations as well.
5:00–5:30 pm ET
Break
5:30–7:30 pm ET
Workshop Dinner, MIT Schwarzman College of Computing (51 Vassar St, Cambridge), 8th Floor
Friday, August 16, 2024
9:30–10:15 am ET
Neural Networks and Conformal Field Theory
Jim Halverson, Northeastern/IAIFI
Abstract
I'll present an essential result in ML theory, explain how it motivates a new approach to field theory, and present some key findings. Next, I'll discuss new work, explaining a result of Dirac on the relationship between Lorentz invariance and conformal invariance, and how this can be applied in neural networks for constructing new conformal field theories.
10:15–11:00 am ET
How good is your model — Goodness-of-fit by Neyman-Pearson testing
Gaia Grosso, IAIFI Fellow
Abstract
The Neyman-Pearson strategy for hypothesis testing can be employed for goodness-of-fit if the alternative hypothesis is selected from data by exploring a rich parametrised family of models. The New Physics Learning Machine (NPLM) methodology has been developed as a concrete implementation of this idea, to target the detection of new physical effects in multidimensional and unbinned collider data. The applications of the Neyman-Pearson test as a goodness-of-fit method extend beyond new physics discovery, to problems of data quality monitoring and, crucially, generative models validation. In this talk I will discuss the main challenges behind the practical use of the Neyman-Pearson strategy in real setups, such as model selection, uncertainty quantification and scalability, and I will present recent solutions and future prospects to tackle them.
11:00–11:30 am ET
Break
11:30 am–12:15 pm ET
Generative AI and the natural sciences: Governance strategies and historical perspectives
David Kaiser, MIT
Abstract
Generative AI techniques offer many exciting opportunities for researchers across the natural sciences and beyond. Like any new technologies, however, these tools can also lead to unanticipated problems. Therefore it is imperative to identify — and work to avoid or ameliorate — potential harms. Doing so requires coordination among the research community as well as with individuals and groups who are not themselves scientists. Recent history provides several examples of how once-new technologies have been managed by wide-ranging constituencies to advance the greater good. This talk will conclude by describing guidance for protecting scientific integrity in an age of generative AI, which was recently developed by a working group of the US National Academy of Sciences.
12:15–1:30 pm ET
Lunch
1:30–2:15 pm ET
Compiling Learning onto Physical Systems
Dirk Englund, MIT
Abstract
The hardware limitations of conventional electronics in deep learning applications have spurred exploration into physical architectures fundamentally different from today’s computers. This talk covers the scalability and performance metrics—such as throughput, energy consumption, and latency—of emerging optical and optoelectronic architectures, with a focus on recently developed hardware error correction techniques, in-situ training methods and initial field trials, as well as methods leveraging quantum information science to perform learning and inference in ways not currently possible.
2:15–3:00 pm ET
ML-based modeling and control to enable new capabilities in beam customization and control at particle accelerator scientific user facilities
Auralee Edelen, SLAC
Abstract
Particle accelerators are incredibly complicated machines that are used for numerous applications in science, industry, and medicine. At scientific user facilities driven by particle accelerators, it is often the case that custom particle beams must be generated on demand. Simultaneously, increasingly tight tolerances and difficult-to-achieve beam characteristics are needed to meet the needs of future applications of accelerators and unlock new experimental capabilities. This is a highly complicated, nonlinear control problem that involves precise shaping of the beam in 6D position-momentum phase space. In this talk I will discuss how ML-based modeling and control is beginning to transform how beam control is conducted at accelerator facilities that require highly flexible beam customization. This includes the development of digital twins for accelerator systems, improving accelerator system models using differentiable simulations and other hybrid ML and physics approaches, physics-informed Bayesian optimization, reinforcement learning, and ML-enhanced beam diagnostics. The talk will focus on examples from LCLS, LCLS-II, FACET-II, and MeV-UED at SLAC, and the APS and AWA at Argonne National Lab, all major scientific user facilities.
3:00–3:30 pm ET
Closing
Speakers
Accommodations
We have secured discounted rates at the following hotels:

Royal Sonesta Boston, 40 Edwin H Land Blvd, Cambridge, MA 02142.
$224 nightly rate (1–2 people per room)
Deadline to book: July 29
Workshop attendees are also welcome to book dorms for a discounted rate at Boston University:

10 Buick Street, Boston
$97.50 nightly rate (1 person per room, shared bathroom with 1 other person)
FAQ
 Who can attend the Summer Workshop? Any researcher working at or interested in the intersection of physics and AI is encouraged to attend the Summer Workshop.
 What is the cost to attend the Summer Workshop? The registration fee for the Summer Workshop is 200 USD and includes a welcome dinner, as well as coffee breaks and snacks.
 If I come to the Summer School, can I also attend the Workshop? Yes! We encourage you to stay for the Workshop and you can stay in the dorms for both events if you choose (at your expense).
 Will the recordings of the talks be available? We plan to share the talks on our YouTube channel.
2024 Organizing Committee
 Fabian Ruehle, Chair (Northeastern University)
 Demba Ba (Harvard)
 Alex Gagliano (IAIFI Fellow)
 Di Luo (IAIFI Fellow)
 Polina Abratenko (Tufts)
 Owen Dugan (MIT)
 Sneh Pandya (Northeastern)
 Yidi Qi (Northeastern)
 Manos Theodosis (Harvard)
 Sokratis Trifinopoulos (MIT)