Upcoming Journal Clubs
The IAIFI Journal Club is only open to IAIFI members and affiliates.
Our Journal Club will continue in Spring 2023.
Fall 2022 Journal Clubs
 Anna Golubeva, IAIFI Fellow and Matt Schwartz, Professor, Harvard
 November 29, 2022, 11:00am–12:00pm
 Should artificial intelligence be interpretable to humans?
 Resource:
 Michael Toomey, PhD Student, Brown University
 November 15, 2022, 11:00am–12:00pm
 Deep Learning the Dark Sector
 Abstract: One of the most pressing questions in physics today is the microphysical origin of dark matter. While there have been numerous experimental programs aimed at detecting its interactions with the Standard Model, all efforts to date have come up empty. An alternative method to constrain dark matter is purely based on its gravitational interactions. In particular, gravitational lensing can be very sensitive to the distribution and morphology of dark matter substructure which can vary appreciably between different models. However, the complexity of data sets, systematics, and large volumes of data make the dimensionality of this problem difficult to approach from more traditional methods. Thankfully, this is a task ideally suited for machine learning. In this talk we will demonstrate how machine learning will play a critical role in distinguishing between models of dark matter and constraining model parameters in lensing data. We will additionally discuss techniques unique to ML for transferring the knowledge accumulated by models in the controlled setting of simulation to real data sets utilizing unsupervised domain adaptation.
 Slides (For IAIFI members only)
 Ziming Liu, PhD Student, MIT
 November 8, 2022, 11:00am–12:00pm
 Toy Models of Superposition
 Abstract: It would be very convenient if the individual neurons of artificial neural networks corresponded to cleanly interpretable features of the input. For example, in an “ideal” ImageNet classifier, each neuron would fire only in the presence of a specific visual feature, such as the color red, a left-facing curve, or a dog snout. But it isn’t always the case that features correspond so cleanly to neurons, especially in large language models where it actually seems rare for neurons to correspond to clean features. I will present a recent paper “Toy Models of Superposition” from Anthropic, aiming to answer these questions: Why is it that neurons sometimes align with features and sometimes don’t? Why do some models and tasks have many of these clean neurons, while they’re vanishingly rare in others?
 Slides (For IAIFI members only)
 Sona Najafi, Researcher, IBM
 October 25, 2022, 11:00am–12:00pm
 Quantum machine learning from algorithms to hardware
 Abstract: The rapid progress of technology over the past few decades has led to the emergence of two powerful computational paradigms known as quantum computing and machine learning. While machine learning tries to learn the solutions from data, quantum computing harnesses the quantum laws for more powerful computation compared to classical computers. In this talk, I will discuss three domains of quantum machine learning, each harnessing a particular aspect of quantum computers and targeting specific problems. The first domain scrutinizes the power of quantum computers to work with high-dimensional data and speed up algebra, but raises the caveat of input/output due to the quantum measurement rules. The second domain circumvents this problem by using a hybrid architecture, performing optimization on a classical computer while evaluating parameterized states on a quantum circuit, chosen based on a particular issue. Finally, the third domain is inspired by brain-like computation and uses a given quantum system’s natural interaction and unitary dynamic as a source for learning.
 Kim Nicoli, Grad Student, Technical University of Berlin
 October 18, 2022, 11:00am–12:00pm
 Deep Learning approaches in lattice quantum field theory: recent advances and future challenges
 Abstract: Normalizing flows are deep generative models that leverage the change of variable formula to map simple base densities to arbitrary complex target distributions. Recent works have shown the potential of such methods in learning normalized Boltzmann densities in many fields ranging from condensed matter physics to molecular science to lattice field theory. Though sampling from a flow-based density comes with many advantages over standard MCMC sampling, it is known that these methods still suffer from several limitations. In my talk, I will start by giving an overview of how to deploy deep generative models to learn Boltzmann densities in the context of a phi^4 lattice field theory. Specifically, I’ll focus on how these methods open up the possibility to estimate thermodynamic observables, i.e., physical observables which depend on the partition function and hence are not straightforward to estimate using standard MCMC methods. In the second part of my talk, I will present two ideas that have been proposed to mitigate the well-known problem of mode collapse which often occurs when normalizing flows are trained to learn a multimodal target density. More specifically I’ll talk about a novel “mode-dropping estimator” and path gradients. In the last part of my talk, I’ll present a new idea which aims at using flow-based methods to mitigate the sign problem.
 Slides (For IAIFI members only)
 Adriana Dropulic, Grad Student, Princeton
 October 4, 2022, 11:00am–12:00pm
 Machine Learning the 6th Dimension: Stellar Radial Velocities from 5D Phase-Space Correlations
 Abstract: The Gaia satellite will observe the positions and velocities of over a billion Milky Way stars. In the early data releases, most observed stars do not have complete 6D phase-space information. We demonstrate the ability to infer the missing line-of-sight velocities until more spectroscopic observations become available. We utilize a novel neural network architecture that, after being trained on a subset of data with complete phase-space information, takes in a star’s 5D astrometry (angular coordinates, proper motions, and parallax) and outputs a predicted line-of-sight velocity with an associated uncertainty. Working with a mock Gaia catalog, we show that the network can successfully recover the distributions and correlations of each velocity component for stars that fall within ~5 kpc of the Sun. We also demonstrate that the network can accurately reconstruct the velocity distribution of a kinematic substructure in the stellar halo that is spatially uniform, even when it comprises a small fraction of the total star count. We apply the neural network to real Gaia data and discuss how the inferred information augments our understanding of the Milky Way’s formation history.
 Slides (For IAIFI members only)
 Iris Cong, Grad Student, Harvard
 September 27, 2022, 11:00am–12:00pm
 Quantum Convolutional Neural Networks
 Abstract: Convolutional neural networks (CNNs) have recently proven successful for many complex applications ranging from image recognition to precision medicine. In the first part of my talk, motivated by recent advances in realizing quantum information processors, I introduce and analyze a quantum circuit-based algorithm inspired by CNNs. Our quantum convolutional neural network (QCNN) uses only O(log(N)) variational parameters for input sizes of N qubits, allowing for its efficient training and implementation on realistic, near-term quantum devices. To explicitly illustrate its capabilities, I show that QCNN can accurately recognize quantum states associated with a one-dimensional symmetry-protected topological phase, with performance surpassing existing approaches. I further demonstrate that QCNN can be used to devise a quantum error correction (QEC) scheme optimized for a given, unknown error model that substantially outperforms known quantum codes of comparable complexity. The design of such error correction codes is particularly important for near-term experiments, whose error models may be different from those addressed by general-purpose QEC schemes. If time permits, I will also present our latest results on generalizing the QCNN framework to more accurately and efficiently identify two-dimensional topological phases of matter.
 Slides (For IAIFI members only)
 Miles Cranmer, Grad Student, Princeton
 September 20, 2022, 11:00am–12:00pm
 Interpretable Machine Learning for Physics
 Abstract: Would Kepler have discovered his laws if machine learning had been around in 1609? Or would he have been satisfied with the accuracy of some black box regression model, leaving Newton without the inspiration to find the law of gravitation? In this talk I will present a review of some industry-oriented machine learning algorithms, and discuss a major issue facing their use in the natural sciences: a lack of interpretability. I will then outline several approaches I have created with collaborators to help address these problems, based largely on a mix of structured deep learning and symbolic methods. This will include an introduction to the PySR software (https://astroautomata.com/PySR), a Python/Julia package for high-performance symbolic regression. I will conclude by demonstrating applications of such techniques and how we may gain new insights from such results.
 Resources: https://arxiv.org/abs/2207.12409; https://arxiv.org/abs/2202.02306; https://arxiv.org/abs/2006.11287
 Slides (For IAIFI members only)
 Anindita Maiti, Grad Student, Northeastern
 September 13, 2022, 11:00am–12:00pm
 A Study of Neural Network Field Theories
 Abstract: I will present a systematic exploration of field theories arising in Neural Networks, using a dual framework given by Neural Network parameters. The infinite width limit of NN architectures, combined with i.i.d. parameters, leads to Gaussian Processes in Neural Networks by the Central Limit Theorem (CLT), corresponding to generalized free field theories. Small and large violations of the CLT respectively lead to weakly coupled and non-perturbative non-Lagrangian field theories in Neural Networks. Non-Gaussianity, locality (via cluster decomposition), and symmetries of Neural Network field theories are examined via NN parameter space, without necessitating the knowledge of field theoretic actions. Thus, Neural Network field theories, in conjunction with this duality via parameters, may have potential implications for both Physics and Machine Learning.
 Resources: https://arxiv.org/abs/2106.00694
 Slides (For IAIFI members only)
Spring 2022 Journal Clubs
 Jessie Micallef, PhD Student, Michigan State University & Incoming IAIFI Fellow
 March 10, 2022, 11:00am–12:00pm
 “Adapting CNNs to Reconstruct Sparse, GeV-Scale IceCube Neutrino Events”
 Resources:
 Slides (For IAIFI members only)
 Denis Boyda, Postdoctoral Appointee, Argonne National Laboratory & Incoming IAIFI Fellow
 RESCHEDULED: March 17, 2022, 11:00am–12:00pm
 “Overview of some popular Machine Learning frameworks for data parallelism”
 Resources:
 S. Li et al. PyTorch Distributed: Experiences on Accelerating Data Parallel Training. 2020. arXiv:2006.15704
 A. Sergeev and M. Del Balso. Horovod: fast and easy distributed deep learning in TensorFlow. 2018. arXiv:1802.05799
 S. Rajbhandari et al. ZeRO: Memory Optimizations Toward Training Trillion Parameter Models. 2020. arXiv:1910.02054
 Slides (For IAIFI members only)
 Yin Lin, Postdoctoral Researcher, MIT
 April 7, 2022, 11:00am–12:00pm
 “Accelerating Dirac equation solves in lattice QFT with neural-network preconditioners”
 Resources:
 Slides (For IAIFI members only)
 Anatoly Dymarsky, Associate Professor, University of Kentucky
 April 14, 2022, 11:00am–12:00pm
 Tensor network to learn the wave function of data
 Abstract: We use a tensor network-based architecture to train a network which simultaneously accomplishes two tasks: image classification and image sampling. We argue that simultaneous performance of these tasks means our network has successfully learned the whole “manifold of data” (using the terminology from the literature), namely all possible images of a particular kind. We use a black and white version of MNIST, hence our network learns all possible images depicting a particular digit. We access global properties of the “manifold of data” by calculating its size. Thus, we found there are 2^72 possible images of digit 3. We explain that this number is robust and largely independent of the details of the training process.
 Resources:
 Slides (For IAIFI members only)
 Carolina Cuesta, PhD Student, Durham University & Incoming IAIFI Fellow
 April 21, 2022, 11:00am–12:00pm
 Equivariant normalizing flows and their application to cosmology
 Resources:
 Slides (For IAIFI members only)
 Benjamin Fuks, Professor, Sorbonne University
 April 28, 2022, 11:00am–12:00pm
 Precision simulations for new physics
 Resources:
 https://arxiv.org/abs/1907.04898
 https://arxiv.org/abs/1901.09937 (How precision allows us to design new variables to look for signals, Phys. Rev. D 100, 074010 (2019))
 https://arxiv.org/abs/2109.11815
 Dylan Hadfield, Assistant Professor, MIT
 May 5, 2022, 11:00am–12:00pm
 Overoptimization, Incompleteness, and Goodhart’s Law
 Resources:
 Mark Hamilton, Graduate Student, MIT
 Manami Kanemura, Undergraduate Student, Northeastern University (completed co-op with Bryan Ostdiek)
 May 26, 2022, 11:00am–12:00pm
 Using Soft-Introspection to improve anomaly detection at LHC
 Resources:
 Slides (For IAIFI members only)
Fall 2021 Journal Clubs
 Michael Douglas
 Thursday, September 23, 2021, 11:00am–12:00pm
 “Solving Combinatorial Problems using AI/ML”
 Abstract/Resources: Bright et al 1907.04408; Heule et al 1905.10192; Halverson et al 1903.11616; McAleer et al 1805.07470; Gukov et al 2010.16263; General sources on reinforcement learning: Sutton and Barto; The MathCheck SAT+CAS system
 Slides (For IAIFI members only)
 Ziming Liu
 Thursday, October 7, 2021, 11:00am–12:00pm
 “Dynamics in Modern Deep Learning Models”
 Abstract/Resources: Transient Chaos in BERT; Memory and attention in deep learning; The Brownian motion in the transformer model
 Slides (For IAIFI members only)
 Ge Yang
 Thursday, October 21, 2021, 11:00am–12:00pm
 “Learning and Generalization: Revisiting Neural Representations”
 Abstract/Resources: Understanding how deep neural networks learn and generalize has been a central pursuit of intelligence research. This is because we want to build agents that can learn quickly from a small amount of data and that also generalize to a wider set of scenarios. In this talk, we take a systems approach by identifying key bottleneck components that limit learning and generalization. We will present two key results: overcoming the simplicity bias of neural value approximation via random Fourier features, and going beyond the training distribution via invariance through inference.
 Eric Michaud, PhD Student, MIT
 Thursday, November 18, 2021, 11:00am–12:00pm
 “Curious Properties of Neural Networks”
 Abstract/Resources: In this informal talk/discussion, I will highlight some facts about neural networks which I find to be particularly fun and surprising. Possible topics could include the Lottery Ticket Hypothesis (https://arxiv.org/abs/1803.03635), Double Descent (https://arxiv.org/abs/1912.02292), and “grokking” (https://mathai-iclr.github.io/papers/papers/MATHAI_29_paper.pdf). There will be time for discussion and for attendees to bring up their own favorite surprising facts about deep learning.
 Murphy Niu, Google Quantum AI
 Thursday, December 3, 11:00am–12:00pm
 “Entangling Quantum Generative Adversarial Networks using Tensorflow Quantum”
 Abstract/Resources: https://arxiv.org/pdf/2105.00080.pdf; https://arxiv.org/pdf/2003.02989.pdf
Spring 2021 Journal Clubs
 Anindita Maiti
 Wednesday, February 17
 “Neural Networks and Quantum Field Theory”
 Abstract/Resources: https://arxiv.org/abs/2008.08601
 Jacob Zavatone-Veth
 Tuesday, March 2
 “Non-Gaussian Processes and Neural Networks at Finite Widths”
 Abstract/Resources: https://arxiv.org/abs/1910.00019
 Di Luo
 Tuesday, April 6
 “Simulating Quantum Many-Body Physics with Neural Network Representation”
 Abstract/Resources: https://arxiv.org/abs/1807.10770; https://arxiv.org/pdf/1912.11052.pdf; https://arxiv.org/abs/2012.05232
 Anna Golubeva
 Tuesday, April 27
 “Are Wider Nets Better Given the Same Number of Parameters?”
 Abstract/Resources: https://arxiv.org/abs/2010.14495
 Siddharth Mishra-Sharma
 Tuesday, May 11
 Simulation-Based Inference Focusing on Astrophysical Applications
 Abstract/Resources: https://arxiv.org/abs/1911.01429; https://arxiv.org/abs/1909.02005
Fall 2020 Journal Clubs
 Bhairav Mehta
 Tuesday, October 20
 “Learning Invariances”
 Abstract/Resources: https://arxiv.org/abs/2009.00329
 Andrew Tan
 Wednesday, November 4
 “Estimating Mutual Information”
 Abstract/Resources: https://arxiv.org/abs/1905.06922
 Ziming Liu
 Wednesday, November 18
 “Scaling Laws of Learning”
 Abstract/Resources: https://arxiv.org/abs/2010.14701; https://arxiv.org/abs/2004.10802; https://arxiv.org/abs/2001.08361
 Dan Roberts
 Wednesday, December 2
 “Effective Theory of Deep Learning”