All sessions will take place at the Joyce Cummings Center, Tufts University. All times listed are ET.
Monday, August 8, 2022
Breakfast is served
Welcome and Introduction from Jesse Thaler, IAIFI Director
Sébastien Racanière, Staff Research Engineer, DeepMind
Generative models with symmetries for physics
Recently, there have been some very impressive advances in generative models for sound, text, and images. In this talk, I will look at applications of generative models to physics, in particular atomic solids and lattice QCD. The models I will consider are flows: families of diffeomorphisms transforming simple base distributions into complicated target distributions. I will show some advantages of these models over other approaches and explain how known symmetries of these problems can be incorporated into the flows.
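As a minimal illustration of the flow idea described in the abstract (a toy sketch, not the models from the talk), a single invertible affine map already shows the change-of-variables mechanics: a simple base distribution is pushed through a diffeomorphism, and the target density picks up a Jacobian term. The parameters `a` and `b` here are invented for illustration.

```python
import numpy as np

# A flow is an invertible map f; this toy affine flow x = a*z + b
# transforms a standard-normal base z into a Gaussian target x.
a, b = 2.0, 1.0  # illustrative parameters

def forward(z):
    return a * z + b

def log_prob(x):
    # Change of variables: log p_X(x) = log p_Z(f^{-1}(x)) - log|det df/dz|
    z = (x - b) / a
    log_base = -0.5 * (z**2 + np.log(2 * np.pi))
    return log_base - np.log(abs(a))

rng = np.random.default_rng(0)
x = forward(rng.standard_normal(100_000))
print(x.mean(), x.std())  # close to b = 1 and a = 2
```

In a real physics application the affine map is replaced by a deep invertible network, but the log-probability bookkeeping is exactly this formula composed over layers.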
Sébastien Racanière originally studied pure mathematics; his PhD was in differential geometry, studying the cohomology of moduli spaces that appear in Yang-Mills theory. For the last seven years, he has been working as an ML researcher at DeepMind, mostly on generative models: applying them (in reinforcement learning and the natural sciences), developing new ones, and understanding their optimisation.
Claudius Krause, Postdoctoral Associate, Rutgers University
Normalizing Flows at the LHC
Normalizing Flows — invertible neural networks — provide a versatile class of ML models that have seen many applications to high-energy physics in recent years, covering the full analysis pipeline of the LHC from data to physics and back. In my talk, I will highlight how they can be used for better phase space integration, faster detector simulation, and more efficient bump-hunt searches. As we prepare for the vast amount of LHC data arriving in the coming years, these techniques will be crucial to fully understanding all of it from first principles.
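The phase-space-integration use case can be caricatured in one dimension: a proposal density that matches the integrand's shape (hand-picked here; in the talk it would be a learned flow) gives a lower-variance Monte Carlo estimate than uniform sampling. The integrand and proposal below are invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
f = lambda x: 3 * x**2          # toy "matrix element"; true integral on [0,1] is 1
n = 50_000

# Uniform sampling: estimate = mean of f over uniform draws.
x_u = rng.uniform(size=n)
est_uniform = f(x_u).mean()

# Importance sampling with proposal p(x) = 2x (sampled via inverse CDF x = sqrt(u)),
# which matches the shape of f better than the uniform density does.
x_p = np.sqrt(rng.uniform(size=n))
weights = f(x_p) / (2 * x_p)
est_importance = weights.mean()

print(est_uniform, est_importance)  # both near 1, the second with smaller spread
```

A flow plays the role of `p(x)` in high dimensions: it is trained so that its density tracks the integrand, shrinking the weight variance.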
Claudius Krause received his PhD from LMU Munich in 2016 with a thesis on Higgs effective field theories. As a postdoc at Fermilab in 2019, he started working on normalizing flows and their application to particle physics. He is currently a postdoc at Rutgers University and has applied normalizing flows to a variety of problems in high-energy physics.
Phil Harris, Assistant Professor of Physics, MIT
Learning Physics in the Latent Space
With no new physics discovered since the Higgs boson, there is an effort to rethink how we search for and understand physics data at the Large Hadron Collider (LHC). We present a variety of new deep learning ideas that go a step further in how we understand and process LHC data. Building on ideas from self-supervised deep learning, we show how physics datasets can be self-assembled into physically meaningful spaces that allow us to approach collider data analyses from a new angle. We demonstrate this approach with Higgs bosons and new-physics particles that decay into quarks, in the context of improving supervised learning, anomaly detection, and characterization of the physics we are probing. Finally, we comment on how these approaches can enable a new generation of high-quality physics measurements with future LHC data.
Greg Yang, Senior Researcher, Microsoft Research
The unreasonable effectiveness of mathematics in large scale deep learning
Recently, the theory of infinite-width neural networks led to muTransfer, the first technique for tuning enormous neural networks that are too expensive to train more than once. For example, this allowed us to tune the 6.7-billion-parameter version of GPT-3 using only 7% of its pretraining compute budget and, with some asterisks, obtain performance comparable to the original GPT-3 model with twice the parameter count. In this talk, I will explain the core insight behind this theory. It is an instance of what I call the Optimal Scaling Thesis, which connects infinite-size limits for general notions of “size” to the optimal design of large models in practice, illustrating a way for theory to reliably guide the future of AI. I’ll end with several concrete mathematical research questions whose resolution would have an incredible impact on how practitioners scale up their NNs.
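One ingredient of the infinite-width picture can be checked numerically in a few lines: with the standard 1/sqrt(fan_in) initialization, the scale of a layer's preactivations stays O(1) as width grows, which is what makes width-independent limits (and, ultimately, width-transferable hyperparameters) conceivable. This is a toy sketch of the setup, not of muTransfer itself; the widths and layer are invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

def next_preacts(width, n_samples=2000):
    # One hidden layer with standard 1/sqrt(fan_in) weight initialization.
    x = rng.standard_normal((n_samples, width))
    W = rng.standard_normal((width, width)) / np.sqrt(width)
    return np.tanh(x) @ W  # the next layer's preactivations

stds = [next_preacts(w).std() for w in (64, 256, 1024)]
print(stds)  # roughly constant across widths
```

The fact that these statistics converge as width grows is the starting point of the infinite-width theory the talk builds on.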
Greg Yang is a researcher at Microsoft Research in Redmond, Washington. He joined MSR after obtaining a Bachelor’s degree in Mathematics and a Master’s degree in Computer Science from Harvard University, advised by Shing-Tung Yau and Alexander Rush, respectively. He won the Hoopes Prize at Harvard for best undergraduate thesis, as well as an Honorable Mention for the AMS-MAA-SIAM Morgan Prize, the highest honor for an undergraduate in mathematics. He gave an invited talk at the International Congress of Chinese Mathematicians in 2019.
Kazuhiro Terao, Staff Scientist, SLAC National Accelerator Laboratory
Details to come
Cora Dvorkin, Associate Professor, Harvard University
Mining Cosmological Data: Looking for Physics Beyond the Standard Model
Dr. Cora Dvorkin is an Associate Professor in the Department of Physics at Harvard. She is a theoretical cosmologist. Her areas of research are: the nature of dark matter, neutrinos and other light relics, and the physics of the early universe. She uses observables such as the Cosmic Microwave Background (CMB), the large-scale structure of the universe, and strong gravitational lensing to shed light on these questions.
Virtual networking using Remotely Green
Tuesday, August 9, 2022
Breakfast is served
Day 2 Welcome from Jesse Thaler, IAIFI Director
Fabian Ruehle, Assistant Professor, Northeastern University
Machine Learning for formal theory
Modern machine learning techniques are extremely powerful, but they are stochastic (and hence error-prone) and often black-box. In formal theory, we work with mathematical data, which is exact, and we are interested in exact (ideally analytic or explicit) expressions or relations among these data. I will provide an overview of techniques that have been used to obtain exact, provable results from ML. One approach is to use whitebox models (such as decision trees). A second is to apply attribution techniques or analytic regression to blackbox models (often neural networks) to obtain conjectures that can then be proved rigorously. Third, reinforcement learning can generate action sequences from which verifiable truth certificates for properties of objects in the RL state space can be obtained.
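The "conjecture from analytic regression" route can be sketched in miniature: fit exact mathematical data numerically, round the coefficients into an exact conjecture, and then verify (and, in real work, prove) the resulting relation. The relation and features below are invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Invented example: suppose an integer invariant y of some mathematical object
# is suspected to depend linearly on two integer features a and b.
a = rng.integers(1, 50, size=30)
b = rng.integers(1, 50, size=30)
y = 3 * a - 2 * b + 5               # hidden exact relation

# Numerical fit (the "blackbox" stage)...
X = np.column_stack([a, b, np.ones_like(a)])
coeffs, *_ = np.linalg.lstsq(X, y.astype(float), rcond=None)

# ...rounded into an exact conjecture, which can then be checked rigorously.
conjecture = np.round(coeffs).astype(int)
print(conjecture)  # -> [ 3 -2  5]
```

Because the data are exact, the fit residual is only floating-point noise, and rounding recovers the exact relation; the same pipeline with neural networks or symbolic regression generates nontrivial conjectures in practice.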
Fabian Ruehle did his undergraduate studies in Physics and Computer Science at the University of Heidelberg. After receiving his PhD from Bonn University, he went on to work at DESY Hamburg, the University of Oxford, and CERN. In September 2021, he joined the Physics and Mathematics departments at Northeastern University, and IAIFI. His research interests lie at the intersection of theoretical high-energy physics, mathematics, and machine learning.
Jennifer Ngadiuba, Wilson Fellow, Fermilab
Boosting sensitivity to new physics at the LHC with anomaly detection
Anomaly detection techniques have been proposed as a way to mitigate the impact of model-specific assumptions when searching for new physics at the LHC. In this talk, I will discuss how these techniques, built on modern AI developments, could be deployed at different stages of the data-processing workflow, from online to offline analysis, and the impact they could have in revolutionizing the current paradigms of the search for new physics.
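The core unsupervised idea can be sketched with a linear autoencoder (i.e., PCA) standing in for the deep models discussed in the talk: model the "background" events, then score each event by how badly the model reconstructs it. The feature dimensions and data here are synthetic.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic "background" events living near a 2-D subspace of a 10-D feature space.
basis = rng.standard_normal((2, 10))
background = rng.standard_normal((500, 2)) @ basis + 0.05 * rng.standard_normal((500, 10))

# Linear autoencoder = projection onto the top-2 principal components of the background.
_, _, Vt = np.linalg.svd(background - background.mean(0), full_matrices=False)
P = Vt[:2]

def anomaly_score(events):
    # Reconstruction error: distance from the learned background subspace.
    centered = events - background.mean(0)
    recon = centered @ P.T @ P
    return np.linalg.norm(centered - recon, axis=-1)

signal = rng.standard_normal((50, 10))  # off-subspace stand-in for "new physics"
print(anomaly_score(background).mean(), anomaly_score(signal).mean())
```

Background events score low because they are well reconstructed; events that do not fit the background model score high, without any signal-specific assumption.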
Jennifer is an Associate Scientist at Fermilab on a Wilson Fellowship, working on applications of AI to high-energy physics problems as a member of the CMS collaboration. She received her PhD from the University of Zurich, working on CMS data analyses and the inner silicon pixel subsystem. In the following years, as a postdoctoral fellow at CERN and then at Caltech, she contributed to the CMS trigger system, for which she developed fast AI methods. She is currently one of the L2 coordinators of the CMS ML group.
Siamak Ravanbakhsh, Assistant Professor, School of Computer Science, McGill University
Learning with Unknown and Nonlinear Symmetry Transformations
Learning representations that are “aware” of data transformations has proved to be a helpful bias, leading to better generalization and sample efficiency. In recent years, the focus of learning such equivariant and invariant representations has been on linear transformations of the input. These attempts have been very successful in dealing with permutation groups, which appear as the symmetries of discrete structures such as sets and graphs, and with continuous Lie groups transforming the data through known linear actions such as translations and rotations. However, in many interesting applications, the data transformations are highly nonlinear and a priori unknown. In this talk, I will present our recent work on equivariant representation learning in this direction and show applications in reinforcement learning.
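For the familiar linear/permutation case the abstract contrasts with, invariance can be built in architecturally: a "deep sets"-style network with sum pooling is permutation-invariant by construction. A minimal sketch with invented layer sizes and random (untrained) weights:

```python
import numpy as np

rng = np.random.default_rng(0)
W1 = rng.standard_normal((4, 8))   # per-element encoder (applied to each set member)
W2 = rng.standard_normal((8, 3))   # head applied after pooling

def set_network(points):
    # points: (n_elements, 4). Sum pooling erases element order,
    # so the output is invariant under any permutation of the rows.
    h = np.tanh(points @ W1)
    return np.tanh(h.sum(axis=0) @ W2)

x = rng.standard_normal((5, 4))
perm = rng.permutation(5)
print(np.allclose(set_network(x), set_network(x[perm])))  # True
```

The talk's subject is the harder setting where the group action is nonlinear and unknown, so such hand-designed layers are no longer available.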
Siamak’s research area is machine learning. He is broadly interested in representation learning and inference in structured, complex, and combinatorial domains. In addition to its potential role in artificial general intelligence, the ability to draw inferences from structured data is essential to a data-driven approach to science. His recent work explores symmetry transformations in deep learning, succinctly motivated by Hermann Weyl’s guiding principle: “Whenever you have to do with a structure-endowed entity, try to determine […] those transformations which leave all structural relations undisturbed.”
Yi-Zhuang You, Assistant Professor, University of California, San Diego
Machine Learning Renormalization Group and Its Applications
In this talk, I will introduce the machine learning renormalization group method, a hierarchical flow-based generative model motivated by the idea of the renormalization group in physics. Given the action of a field theory, the algorithm learns the optimal renormalization group transformation and maps the field configuration from the holographic boundary to the bulk, which enables efficient sampling and error correction. Beyond physics, I will also demonstrate applications of this method in the image- and language-processing domains.
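For orientation, the classic hand-written RG transformation that such a method learns and generalizes is block coarse-graining. A sketch of one block-averaging step on a toy field configuration (the lattice size and field are invented; the talk's method replaces this fixed rule with a learned, invertible one):

```python
import numpy as np

def block_coarse_grain(field):
    # One RG step: replace each 2x2 block of an even-sized square
    # field configuration with its average.
    n = field.shape[0] // 2
    return field.reshape(n, 2, n, 2).mean(axis=(1, 3))

rng = np.random.default_rng(0)
phi = rng.standard_normal((8, 8))
coarse = block_coarse_grain(phi)
print(coarse.shape)  # (4, 4)
```

Iterating such steps builds the boundary-to-bulk hierarchy; in the learned version, the transformation at each level is optimized for the given action rather than fixed in advance.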
Yi-Zhuang You is an assistant professor at the University of California, San Diego. His research interests focus on condensed matter theory, quantum information, and machine learning in physics.
Anna Golubeva, IAIFI Fellow
Understanding and Improving Sparse Neural Network Training
Sparsity and neural-network pruning have become indispensable tools in applied machine learning to alleviate the computational demands of ever larger models. While the number of empirical works in this field has exploded in recent years, producing a variety of pruning techniques, finding sparse solutions at initialization remains a challenge. Moreover, a theoretical understanding of the very existence of sparse solutions in neural networks is lacking. In this talk, I will discuss the most interesting open questions in this field and present some of our recent work combining theoretical and experimental approaches to tackle them.
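As a reference point for the pruning techniques mentioned, the standard one-shot global magnitude-pruning baseline (a generic sketch, not the speaker's method) is only a few lines: keep the largest-magnitude weights and zero out the rest.

```python
import numpy as np

def magnitude_prune(weights, sparsity):
    # Zero out the smallest-magnitude fraction `sparsity` of the weights.
    threshold = np.quantile(np.abs(weights), sparsity)
    mask = np.abs(weights) >= threshold
    return weights * mask, mask

rng = np.random.default_rng(0)
W = rng.standard_normal((100, 100))
W_sparse, mask = magnitude_prune(W, 0.9)
print(mask.mean())  # fraction of weights kept, ~0.1
```

In practice this is applied after (or during) training; the open question the abstract highlights is how to find such sparse solutions already at initialization.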
Anna is currently a postdoctoral fellow at IAIFI, working on developing a theoretical foundation of deep learning with methods from statistical physics. She obtained her PhD in 2021 at the Perimeter Institute for Theoretical Physics and the University of Waterloo, where she was advised by Roger Melko. During her PhD, she was also a graduate affiliate at the Vector Institute for AI in Toronto. Previously, she completed the Perimeter Scholars International master’s program (2017), an MSc in Theoretical Physics with a focus on computational approaches to quantum many-body systems (2016), and a BSc in Biophysics (2014) at Goethe University in Frankfurt, Germany.
Shuchin Aeron, Associate Professor, Tufts University
Towards learning generative models for high energy physics
In this talk, we will focus on learning generative models for LArTPC images. We will present two approaches we have recently applied to this problem: (a) the Vector Quantized Variational Auto-Encoder (VQ-VAE) and (b) score-based diffusion models. We will present the architecture and methodology behind these two methods and comment on the results obtained, both qualitatively and quantitatively (using SSNet). Finally, we will comment on the potential utility of these generative models for neutrino physics.
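The "VQ" step that gives the VQ-VAE its name — snapping each encoder output to its nearest codebook vector — is easy to isolate. A schematic with invented codebook and latent sizes, detached from any actual encoder/decoder:

```python
import numpy as np

rng = np.random.default_rng(0)
codebook = rng.standard_normal((16, 4))   # 16 learnable code vectors of dimension 4

def quantize(z):
    # Nearest-neighbour lookup: each latent row is replaced by its closest code.
    d = np.linalg.norm(z[:, None, :] - codebook[None, :, :], axis=-1)
    idx = d.argmin(axis=1)
    return codebook[idx], idx

z = rng.standard_normal((5, 4))           # stand-in for encoder outputs
zq, idx = quantize(z)
print(idx)
```

In the full model, the decoder sees only `zq`, so the continuous image is compressed through a discrete bottleneck; the codebook itself is learned jointly with the encoder and decoder.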
Shuchin Aeron is an associate professor in the Department of Electrical and Computer Engineering at Tufts School of Engineering. He received his PhD from Boston University in 2009 and was then a postdoctoral research fellow at Schlumberger-Doll Research (SDR), where he worked on signal-processing products for borehole acoustics. In 2016, he received an NSF CAREER award, and in 2019 he was a visiting faculty member at Mitsubishi Electric Research Labs (MERL). Aeron is a senior member of the Institute of Electrical and Electronics Engineers (IEEE). His research interests are in statistical signal processing, information theory, tensor data analytics, and optimal transport.
Closing remarks from Jesse Thaler, IAIFI Director