IAIFI Theoretical Physics Papers

View high energy physics IAIFI papers on INSPIRE

Theoretical Physics

Not So Flat Metrics
Cristofero S. Fraser-Taliente, Thomas R. Harvey, Manki Kim
[ arXiv:2411.00962 ]

Abstract In order to be in control of the α′ derivative expansion, geometric string compactifications are understood in the context of a large volume approximation. In this letter, we consider the reduction of these higher derivative terms, and propose an improved estimate on the large volume approximation using numerical Calabi-Yau metrics obtained via machine learning methods. Further to this, we consider the α′³ corrections to numerical Calabi-Yau metrics in the context of IIB string theory. This correction represents one of several important contributions for realistic string compactifications -- alongside, for example, the backreaction of fluxes and local sources -- all of which have important consequences for string phenomenology. As a simple application of the corrected metric, we compute the change to the spectrum of the scalar Laplacian.

Fermion Masses and Mixing in String-Inspired Models
Andrei Constantin, Cristofero S. Fraser-Taliente, Thomas R. Harvey, Lucas T. Y. Leung, Andre Lukas
[ arXiv:2410.17704 ]

Abstract We study a class of supersymmetric Froggatt-Nielsen (FN) models with multiple U(1) symmetries and Standard Model (SM) singlets inspired by heterotic string compactifications on Calabi-Yau threefolds. The string-theoretic origin imposes a particular charge pattern on the SM fields and FN singlets, dividing the latter into perturbative and non-perturbative types. Employing systematic and heuristic search strategies, such as genetic algorithms, we identify charge assignments and singlet VEVs that replicate the observed mass and mixing hierarchies in the quark sector, and subsequently refine the Yukawa matrix coefficients to accurately match the observed values for the Higgs VEV, the quark and charged lepton masses and the CKM matrix. This bottom-up approach complements top-down string constructions and our results demonstrate that string FN models possess a sufficiently rich structure to account for flavour physics. On the other hand, the limited number of distinct viable charge patterns identified here indicates that flavour physics imposes tight constraints on string theory models, adding new constraints on particle spectra that are essential for achieving a realistic phenomenology.

SPECTER: Efficient Evaluation of the Spectral EMD
Rikab Gambhir, Andrew J. Larkoski, Jesse Thaler
[ arXiv:2410.05379 | code ]

Abstract The Energy Mover's Distance (EMD) has seen use in collider physics as a metric between events and as a geometric method of defining infrared and collinear safe observables. Recently, the Spectral Energy Mover's Distance (SEMD) has been proposed as a more analytically tractable alternative to the EMD. In this work, we obtain a closed-form expression for the Riemannian-like p = 2 SEMD metric between events, eliminating the need to numerically solve an optimal transport problem. Additionally, we show how the SEMD can be used to define event and jet shape observables by minimizing the distance between events and parameterized energy flows (similar to the EMD), and we obtain closed-form expressions for several of these observables. We also present the SPECTER framework, an efficient and highly parallelized implementation of the SEMD metric and SEMD-derived shape observables as an analogue of the previously-introduced SHAPER for EMD-based computations. We demonstrate that computing the SEMD with SPECTER can be up to a thousand times faster than computing the EMD with standard optimal transport libraries.

Exploring gauge-fixing conditions with gradient-based optimization
William Detmold, Gurtej Kanwar, Yin Lin, Phiala E. Shanahan, Michael L. Wagman
[ arXiv:2410.03602 ]

Abstract Lattice gauge fixing is required to compute gauge-variant quantities, for example those used in RI-MOM renormalization schemes or as objects of comparison for model calculations. Recently, gauge-variant quantities have also been found to be more amenable to signal-to-noise optimization using contour deformations. These applications motivate systematic parameterization and exploration of gauge-fixing schemes. This work introduces a differentiable parameterization of gauge fixing which is broad enough to cover Landau gauge, Coulomb gauge, and maximal tree gauges. The adjoint state method allows gradient-based optimization to select gauge-fixing schemes that minimize an arbitrary target loss function.

A Field Guide to Event-Shape Observables Using Optimal Transport
Cari Cesarotti, Matt LeBlanc
[ arXiv:2409.13150 ]

Abstract We lay out the phenomenological behavior of event-shape observables evaluated by solving optimal transport problems between collider events and reference geometries -- which we name 'manifold distances' -- to provide guidance regarding their use in future studies. This discussion considers several choices related to the metric used to quantify these distances. We explore the differences between the various options, using a combination of analytical studies and simulated minimum-bias and multi-jet events. Making judicious choices when defining the metric and reference geometry can improve sensitivity to interesting signal features and reduce sensitivity to non-perturbative effects in QCD. The goal of this article is to provide a 'field guide' that can inform how choices made when defining a manifold distance can be tailored for the analysis at-hand.

Conformal Fields from Neural Networks
James Halverson, Joydeep Naskar, Jiahua Tian
[ arXiv:2409.12222 ]

Abstract We use the embedding formalism to construct conformal fields in D dimensions, by restricting Lorentz-invariant ensembles of homogeneous neural networks in (D+2) dimensions to the projective null cone. Conformal correlators may be computed using the parameter space description of the neural network. Exact four-point correlators are computed in a number of examples, and we perform a 4D conformal block decomposition that elucidates the spectrum. In some examples the analysis is facilitated by recent approaches to Feynman integrals. Generalized free CFTs are constructed using the infinite-width Gaussian process limit of the neural network, enabling a realization of the free boson. The extension to deep networks constructs conformal fields at each subsequent layer, with recursion relations relating their conformal dimensions and four-point functions. Numerical approaches are discussed.

Learning the Simplicity of Scattering Amplitudes
Clifford Cheung, Aurélien Dersy, Matthew D. Schwartz
[ arXiv:2408.04720 | code ]

Abstract The simplification and reorganization of complex expressions lies at the core of scientific progress, particularly in theoretical high-energy physics. This work explores the application of machine learning to a particular facet of this challenge: the task of simplifying scattering amplitudes expressed in terms of spinor-helicity variables. We demonstrate that an encoder-decoder transformer architecture achieves impressive simplification capabilities for expressions composed of handfuls of terms. Lengthier expressions are implemented in an additional embedding network, trained using contrastive learning, which isolates subexpressions that are more likely to simplify. The resulting framework is capable of reducing expressions with hundreds of terms - a regular occurrence in quantum field theory calculations - to vastly simpler equivalent expressions. Starting from lengthy input expressions, our networks can generate the Parke-Taylor formula for five-point gluon scattering, as well as new compact expressions for five-point amplitudes involving scalars and gravitons. An interactive demonstration can be found at this https URL.

Attractors, Geodesics, and the Geometry of Moduli Spaces
Fabian Ruehle, Benjamin Sung
[ arXiv:2408.00830 ]

Abstract We connect recent conjectures and observations pertaining to geodesics, attractor flows, Laplacian eigenvalues and the geometry of moduli spaces by using that attractor flows are geodesics. For toroidal compactifications, attractor points are related to (degenerate) masses of the Laplacian on the target space, and also to the Laplacian on the moduli space. We also explore compactifications of M-Theory to 5D on a Calabi-Yau threefold and argue that geodesics are unique in a special set of classes, providing further evidence for a recent conjecture by Raman and Vafa. Finally, we describe the role of the marked moduli space in 4d 𝒩=2 compactifications. We study split attractor flows in an explicit example of the one-parameter family of quintics and discuss setups where flops to isomorphic Calabi-Yau manifolds exist.

TASI Lectures on Physics for Machine Learning
Jim Halverson
[ arXiv:2408.00082 ]

Abstract These notes are based on lectures I gave at TASI 2024 on Physics for Machine Learning. The focus is on neural network theory, organized according to network expressivity, statistics, and dynamics. I present classic results such as the universal approximation theorem and neural network / Gaussian process correspondence, and also more recent results such as the neural tangent kernel, feature learning with the maximal update parameterization, and Kolmogorov-Arnold networks. The exposition on neural network theory emphasizes a field theoretic perspective familiar to theoretical physicists. I elaborate on connections between the two, including a neural network approach to field theory.

Simulating moiré quantum matter with neural network
Di Luo, David D. Dai, Liang Fu
[ arXiv:2406.17645 ]

Abstract Moiré materials provide an ideal platform for exploring quantum phases of matter. However, solving the many-electron problem in moiré systems is challenging due to strong correlation effects. We introduce a powerful variational representation of quantum states, the many-body neural Bloch wavefunction, to solve many-electron problems in moiré materials accurately and efficiently. Applying our method to the semiconductor heterobilayer WSe2/WS2, we obtain a generalized Wigner crystal at filling factor n = 1/3, a Mott insulator at n = 1, and a correlated insulator with local magnetic moments and antiferromagnetic spin correlation at n = 2. Our neural network approach improves the simulation accuracy of strongly interacting moiré materials and paves the way for the discovery of new quantum phases with a variational learning principle in a unified framework.

Constructing gauge-invariant neural networks for scientific applications
ICML 2024 Workshop AI4Science
Open Review, Submission Number 184 [ code ]

Abstract Our current models for fundamental forces in nature are “gauge theories”. These models are suitable for systems where interactions are local and where the local choice of coordinates does not affect physical quantities. While recent works have introduced gauge equivariant neural networks, these models focus on tangent bundles or quotient space and are not applicable to most gauge theories appearing in physics. We propose an architecture for learning general gauge invariant quantities. Our framework fills a gap in the existing literature, providing a general recipe for gauge invariance without restrictions on the spaces of the measurement vectors. We evaluate our method on a classical physical system, the XY model, that is invariant to the choice of local gauges.

QCD constraints on isospin-dense matter and the nuclear equation of state
Ryan Abbott, William Detmold, Marc Illa, Assumpta Parreño, Robert J. Perry, Fernando Romero-López, Phiala E. Shanahan, Michael L. Wagman
[ arXiv:2406.09273 ]

Abstract Understanding the behavior of dense hadronic matter is a central goal in nuclear physics as it governs the nature and dynamics of astrophysical objects such as supernovae and neutron stars. Because of the non-perturbative nature of quantum chromodynamics (QCD), little is known rigorously about hadronic matter in these extreme conditions. Here, lattice QCD calculations are used to compute thermodynamic quantities and the equation of state of QCD over a wide range of isospin chemical potentials. Agreement is seen with chiral perturbation theory predictions when the chemical potential is small. Comparison to perturbative QCD calculations at large chemical potential allows for an estimate of the gap in the superconducting phase, and this quantity is seen to agree with perturbative determinations. Since the partition function for an isospin chemical potential, μ_I, bounds the partition function for a baryon chemical potential μ_B = 3μ_I/2, these calculations also provide rigorous non-perturbative QCD bounds on the symmetric nuclear matter equation of state over a wide range of baryon densities for the first time.

A Heterotic Kähler Gravity and the Distance Conjecture
Javier José Murgas Ibarra, Paul-Konstantin Oehlmann, Fabian Ruehle, Eirik Eik Svanes
[ arXiv:2406.04393 ]

Abstract Deformations of the heterotic superpotential give rise to a topological holomorphic theory with similarities to both Kodaira-Spencer gravity and holomorphic Chern-Simons theory. Although the action is cubic, it is only quadratic in the complex structure deformations (the Beltrami differential). Treated separately, for large fluxes, or alternatively at large distances in the background complex structure moduli space, these fields can be integrated out to obtain a new field theory in the remaining fields, which describe the complexified hermitian and gauge degrees of freedom. We investigate properties of this new holomorphic theory, and in particular connections to the swampland distance conjecture in the context of heterotic string theory. In the process, we define a new type of symplectic cohomology theory, where the background complex structure Beltrami differential plays the role of the symplectic form.

Stochastic logic in biased coupled photonic probabilistic bits
Michael Horodynski, Charles Roques-Carmes, Yannick Salamin, Seou Choi, Jamison Sloan, Di Luo, Marin Soljačić
[ arXiv:2406.04000 ]

Abstract Optical computing often employs tailor-made hardware to implement specific algorithms, trading generality for improved performance in key aspects like speed and power efficiency. An important computing approach that is still missing its corresponding optical hardware is probabilistic computing, used e.g. for solving difficult combinatorial optimization problems. In this study, we propose an experimentally viable photonic approach to solve arbitrary probabilistic computing problems. Our method relies on the insight that coherent Ising machines composed of coupled and biased optical parametric oscillators can emulate stochastic logic. We demonstrate the feasibility of our approach by using numerical simulations equivalent to the full density matrix formulation of coupled optical parametric oscillators.

QuanTA: Efficient High-Rank Fine-Tuning of LLMs with Quantum-Informed Tensor Adaptation
Zhuo Chen, Rumen Dangovski, Charlotte Loh, Owen Dugan, Di Luo, Marin Soljačić
[ arXiv:2406.00132 | code ]

Abstract We propose Quantum-informed Tensor Adaptation (QuanTA), a novel, easy-to-implement, fine-tuning method with no inference overhead for large-scale pre-trained language models. By leveraging quantum-inspired methods derived from quantum circuit structures, QuanTA enables efficient high-rank fine-tuning, surpassing the limitations of Low-Rank Adaptation (LoRA)--low-rank approximation may fail for complicated downstream tasks. Our approach is theoretically supported by the universality theorem and the rank representation theorem to achieve efficient high-rank adaptations. Experiments demonstrate that QuanTA significantly enhances commonsense reasoning, arithmetic reasoning, and scalability compared to traditional methods. Furthermore, QuanTA shows superior performance with fewer trainable parameters compared to other approaches and can be designed to integrate with existing fine-tuning algorithms for further improvement, providing a scalable and efficient solution for fine-tuning large language models and advancing state-of-the-art in natural language processing.

Harmonic 1-forms on real loci of Calabi-Yau manifolds
Michael R. Douglas, Daniel Platt, Yidi Qi
[ arXiv:2405.19402 | code ]

Abstract We numerically study whether there exist nowhere vanishing harmonic 1-forms on the real locus of some carefully constructed examples of Calabi-Yau manifolds, which would then give rise to potentially new examples of G₂-manifolds and an explicit description of their metrics. We do this in two steps: first, we use a neural network to compute an approximate Calabi-Yau metric on each manifold. Second, we use another neural network to compute an approximately harmonic 1-form with respect to the approximate metric, and then inspect the found solution. On two manifolds existence of a nowhere vanishing harmonic 1-form can be ruled out using differential geometry. The real locus of a third manifold is diffeomorphic to S¹×S², and our numerics suggest that when the Calabi-Yau metric is close to a singular limit, then it admits a nowhere vanishing harmonic 1-form. We explain how such an approximate solution could potentially be used in a numerically verified proof for the fact that our example manifold must admit a nowhere vanishing harmonic 1-form.

Position-space renormalization schemes for four-quark operators in HQET
Joshua Lin, William Detmold, Stefan Meinel
Journal of High Energy Physics, Volume 2024, Article Number 188 [ arXiv:2404.16191 ]

Abstract X-space schemes are gauge-invariant, regulator-independent renormalization schemes that are defined by requiring position-space correlation functions of gauge invariant operators to be equal to their noninteracting values at particular kinematic points. These schemes can be used to nonperturbatively renormalize composite operators in Lattice Quantum Chromodynamics (LQCD), and by computing matching coefficients between the X-space scheme and MSbar in the dimensionally-regulated continuum, matrix elements calculated with LQCD can be converted to MSbar-renormalized matrix elements. Using X-space schemes for Heavy Quark Effective Theory (HQET) operators has the additional benefit that appropriate ratios of position-space correlation functions cancel the power divergent static-quark self-energy of Lattice HQET nonperturbatively. This work presents the O(α_s) matching coefficients between X-space renormalized four-quark flavor-nonsinglet HQET operators relevant for the lifetimes of charm- and bottom-hadrons, and four-quark HQET operators relevant for mixing between neutral mesons containing a heavy quark, such as B-Bbar mixing.

Practical applications of machine-learned flows on gauge fields
Ryan Abbott, Michael S. Albergo, Denis Boyda, Daniel C. Hackett, Gurtej Kanwar, Fernando Romero-López, Phiala E. Shanahan, Julian M. Urban
[ arXiv:2404.11674 ]

Abstract Normalizing flows are machine-learned maps between different lattice theories which can be used as components in exact sampling and inference schemes. Ongoing work yields increasingly expressive flows on gauge fields, but it remains an open question how flows can improve lattice QCD at state-of-the-art scales. We discuss and demonstrate two applications of flows in replica exchange (parallel tempering) sampling, aimed at improving topological mixing, which are viable with iterative improvements upon presently available flows.

Multiscale Normalizing Flows for Gauge Theories
Ryan Abbott, Michael S. Albergo, Denis Boyda, Daniel C. Hackett, Gurtej Kanwar, Fernando Romero-López, Phiala E. Shanahan, Julian M. Urban
[ arXiv:2404.10819 ]

Abstract Scale separation is an important physical principle that has previously enabled algorithmic advances such as multigrid solvers. Previous work on normalizing flows has been able to utilize scale separation in the context of scalar field theories, but the principle has been largely unexploited in the context of gauge theories. This work gives an overview of a new method for generating gauge fields using hierarchical normalizing flow models. This method builds gauge fields from the outside in, allowing different parts of the model to focus on different scales of the problem. Numerical results are presented for U(1) and SU(3) gauge theories in 2, 3, and 4 spacetime dimensions.

TENG: Time-Evolving Natural Gradient for Solving PDEs with Deep Neural Net
Zhuo Chen, Jacob McCarran, Esteban Vizcaino, Marin Soljačić, Di Luo
Open Review, Submission Number 10038 [ arXiv:2404.10771 | code ]

Abstract Partial differential equations (PDEs) are instrumental for modeling dynamical systems in science and engineering. The advent of neural networks has initiated a significant shift in tackling these complexities though challenges in accuracy persist, especially for initial value problems. In this paper, we introduce the Time-Evolving Natural Gradient (TENG), generalizing time-dependent variational principles and optimization-based time integration, leveraging natural gradient optimization to obtain high accuracy in neural-network-based PDE solutions. Our comprehensive development includes algorithms like TENG-Euler and its high-order variants, such as TENG-Heun, tailored for enhanced precision and efficiency. TENG's effectiveness is further validated through its performance, surpassing current leading methods and achieving machine precision in step-by-step optimizations across a spectrum of PDEs, including the heat equation, Allen-Cahn equation, and Burgers' equation.

The Frozen Phase of Heterotic F-theory Duality
Paul-Konstantin Oehlmann, Fabian Ruehle, Benjamin Sung
Journal of High Energy Physics, Volume 2024, Article Number 295 [ arXiv:2404.02191 ]

Abstract We study the duality between the Spin(32)/ℤ₂ heterotic string without vector structure and F-theory with frozen singularities. We give a complete description in theories with 6d 𝒩=(1,0) supersymmetry and identify the duals of Spin(32)/ℤ₂-instantons on ADE singularities without vector structure in the frozen phase of F-theory using an ansatz introduced by Bhardwaj, Morrison, Tachikawa, and Tomasiello. As a consequence, we obtain a strongly coupled description of orbifold phases of type I string theory without vector structure, substantially expanding the list of known examples of 6d F-theory compactifications with frozen singularities. Supergravity theories can be fused from these instanton theories, in a way that commutes with switching off vector structure, which we use to propose new consistency checks via neutral hypermultiplet counting. Finally, we describe various Higgsings of this duality, and comment on constraints on higher form symmetries.

Moments of Clarity: Streamlining Latent Spaces in Machine Learning using Moment Pooling
Rikab Gambhir, Athis Osathapan, Jesse Thaler
Physical Review D, Volume 110, Issue 7, 1 October 2024 [ arXiv:2403.08854 | code ]

Abstract Many machine learning applications involve learning a latent representation of data, which is often high-dimensional and difficult to directly interpret. In this work, we propose "Moment Pooling", a natural extension of Deep Sets networks which drastically decreases the latent space dimensionality of these networks while maintaining or even improving performance. Moment Pooling generalizes the summation in Deep Sets to arbitrary multivariate moments, which enables the model to achieve a much higher effective latent dimensionality for a fixed latent dimension. We demonstrate Moment Pooling on the collider physics task of quark/gluon jet classification by extending Energy Flow Networks (EFNs) to Moment EFNs. We find that Moment EFNs with latent dimensions as small as 1 perform similarly to ordinary EFNs with higher latent dimension. This small latent dimension allows for the internal representation to be directly visualized and interpreted, which in turn enables the learned internal jet representation to be extracted in closed form.
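
The pooling step described in this abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation (the linked code is the reference): the function names `moment_pool` and `effective_dim`, the random stand-in for the per-particle latent features, and the uniform weights are all assumptions made for illustration. The sketch replaces the plain Deep Sets sum with weighted sums of every monomial of the latent features up to a chosen order, and counts how many pooled features that produces.

```python
import itertools
import math

import numpy as np


def moment_pool(phi, weights, order):
    """Weighted multivariate moments of per-particle latent features.

    phi:     array of shape (n_particles, n_latent), hypothetical latent features
    weights: array of shape (n_particles,), e.g. energy fractions (assumption)
    order:   highest moment order to include (order=1 is the ordinary weighted sum)
    """
    n_latent = phi.shape[1]
    pooled = []
    for k in range(1, order + 1):
        # Every distinct monomial of degree k in the latent coordinates.
        for idx in itertools.combinations_with_replacement(range(n_latent), k):
            monomial = np.prod(phi[:, list(idx)], axis=1)
            pooled.append(np.sum(weights * monomial))
    return np.array(pooled)


def effective_dim(n_latent, order):
    """Number of distinct monomials of degree 1..order in n_latent variables."""
    return math.comb(n_latent + order, order) - 1


rng = np.random.default_rng(0)
phi = rng.normal(size=(5, 2))   # stand-in for a learned per-particle map
weights = np.full(5, 0.2)       # uniform weights, purely illustrative
pooled = moment_pool(phi, weights, order=2)
print(pooled.shape, effective_dim(2, 2))
```

With two latent dimensions and second-order moments, the pooled vector has five entries, which is the sense in which a fixed latent dimension can yield a larger effective one. Each entry is a sum over particles, so the result does not depend on particle ordering.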

On classical de Sitter solutions and parametric control
David Andriot, Fabian Ruehle
[ arXiv:2403.07065 ]

Abstract Finding string backgrounds with de Sitter spacetime, where all approximations and corrections are controlled, is an open problem. We revisit the search for de Sitter solutions in the classical regime for specific type IIB supergravity compactifications on group manifolds, an under-explored corner of the landscape that offers an interesting testing ground for swampland conjectures. While the supergravity de Sitter solutions we obtain numerically are ambiguous in terms of their classicality, we find an analytic scaling that makes four out of six compactification radii, as well as the overall volume, arbitrarily large. This potentially provides parametric control over corrections. If we could show that these solutions, or others to be found, are fully classical, they would constitute a counterexample to conjectures stating that asymptotic de Sitter solutions do not exist. We discuss this point in great detail.

Photonic probabilistic machine learning using quantum vacuum noise
Seou Choi, Yannick Salamin, Charles Roques-Carmes, Rumen Dangovski, Di Luo, Zhuo Chen, Michael Horodynski, Jamison Sloan, Shiekh Zia Uddin, Marin Soljačić
Nature Communications, 2024, Volume 15, Article number 7760 [ arXiv:2403.04731 ]

Abstract Probabilistic machine learning utilizes controllable sources of randomness to encode uncertainty and enable statistical modeling. Harnessing the pure randomness of quantum vacuum noise, which stems from fluctuating electromagnetic fields, has shown promise for high speed and energy-efficient stochastic photonic elements. Nevertheless, photonic computing hardware which can control these stochastic elements to program probabilistic machine learning algorithms has been limited. Here, we implement a photonic probabilistic computer consisting of a controllable stochastic photonic element - a photonic probabilistic neuron (PPN). Our PPN is implemented in a bistable optical parametric oscillator (OPO) with vacuum-level injected bias fields. We then program a measurement-and-feedback loop for time-multiplexed PPNs with electronic processors (FPGA or GPU) to solve certain probabilistic machine learning tasks. We showcase probabilistic inference and image generation of MNIST-handwritten digits, which are representative examples of discriminative and generative models. In both implementations, quantum vacuum noise is used as a random seed to encode classification uncertainty or probabilistic generation of samples. In addition, we propose a path towards an all-optical probabilistic computing platform, with an estimated sampling rate of ~ 1 Gbps and energy consumption of ~ 5 fJ/MAC. Our work paves the way for scalable, ultrafast, and energy-efficient probabilistic machine learning hardware.

Operator Learning Renormalization Group
Xiu-Zhe Luo, Di Luo, Roger G. Melko
[ arXiv:2403.03199 | code ]

Abstract In this paper, we present a general framework for quantum many-body simulations called the operator learning renormalization group (OLRG). Inspired by machine learning perspectives, OLRG is a generalization of Wilson's numerical renormalization group and White's density matrix renormalization group, which recursively builds a simulatable system to approximate a target system of the same number of sites via operator maps. OLRG uses a loss function to minimize the error of a target property directly by learning the operator map in lieu of a state ansatz. This loss function is designed by a scaling consistency condition that also provides a provable bound for real-time evolution. We implement two versions of the operator maps for classical and quantum simulations. The former, which we call the Operator Matrix Map, can be implemented via neural networks on classical computers. The latter, which we call the Hamiltonian Expression Map, generates device pulse sequences to leverage the capabilities of quantum computing hardware. We illustrate the performance of both maps for calculating time-dependent quantities in the quantum Ising model Hamiltonian.

Rigor with Machine Learning from Field Theory to the Poincaré Conjecture
Sergei Gukov, James Halverson, Fabian Ruehle
Nature Reviews Physics 2024 [ arXiv:2402.13321 ]

Abstract Machine learning techniques are increasingly powerful, leading to many breakthroughs in the natural sciences, but they are often stochastic, error-prone, and blackbox. How, then, should they be utilized in fields such as theoretical physics and pure mathematics that place a premium on rigor and understanding? In this Perspective we discuss techniques for obtaining rigor in the natural sciences with machine learning. Non-rigorous methods may lead to rigorous results via conjecture generation or verification by reinforcement learning. We survey applications of these techniques-for-rigor ranging from string theory to the smooth 4d Poincaré conjecture in low-dimensional topology. One can also imagine building direct bridges between machine learning theory and either mathematics or theoretical physics. As examples, we describe a new approach to field theory motivated by neural network theory, and a theory of Riemannian metric flows induced by neural network gradient descent, which encompasses Perelman's formulation of the Ricci flow that was utilized to resolve the 3d Poincaré conjecture.

Real-time Dynamics of the Schwinger Model as an Open Quantum System with Neural Density Operators
Joshua Lin, Di Luo, Xiaojun Yao, Phiala E. Shanahan
Journal of High Energy Physics, Volume 2024, Article Number 211 [ arXiv:2402.06607 ]

Abstract Ab-initio simulations of multiple heavy quarks propagating in a Quark-Gluon Plasma are computationally difficult to perform due to the large dimension of the space of density matrices. This work develops machine learning algorithms to overcome this difficulty by approximating exact quantum states with neural network parametrisations, specifically Neural Density Operators. As a proof of principle demonstration in a QCD-like theory, the approach is applied to solve the Lindblad master equation in the 1+1d lattice Schwinger Model as an open quantum system. Neural Density Operators enable the study of in-medium dynamics on large lattice volumes, where multiple-string interactions and their effects on string-breaking and recombination phenomena can be studied. Thermal properties of the system at equilibrium can also be probed with these methods by variationally constructing the steady state of the Lindblad master equation. Scaling of this approach with system size is studied, and numerical demonstrations on up to 32 spatial lattice sites and with up to 3 interacting strings are performed.

Applications of flow models to the generation of correlated lattice QCD ensembles
Ryan Abbott, Aleksandar Botev, Denis Boyda, Daniel C. Hackett, Gurtej Kanwar, Sébastien Racanière, Danilo J. Rezende, Fernando Romero-López, Phiala E. Shanahan, Julian M. Urban
[ arXiv:2401.10874 ]

Abstract Machine-learned normalizing flows can be used in the context of lattice quantum field theory to generate statistically correlated ensembles of lattice gauge fields at different action parameters. This work demonstrates how these correlations can be exploited for variance reduction in the computation of observables. Three different proof-of-concept applications are demonstrated using a novel residual flow architecture: continuum limits of gauge theories, the mass dependence of QCD observables, and hadronic matrix elements based on the Feynman-Hellmann approach. In all three cases, it is shown that statistical uncertainties are significantly reduced when machine-learned flows are incorporated as compared with the same calculations performed with uncorrelated ensembles or direct reweighting.

Anomaly Detection in Collider Physics via Factorized Observables
Eric M. Metodiev, Jesse Thaler, Raymond Wynne
Physical Review D, Volume 110, Issue 5, 1 September 2024 [ arXiv:2312.00119 | code ]

Abstract To maximize the discovery potential of high-energy colliders, experimental searches should be sensitive to unforeseen new physics scenarios. This goal has motivated the use of machine learning for unsupervised anomaly detection. In this paper, we introduce a new anomaly detection strategy called FORCE: factorized observables for regressing conditional expectations. Our approach is based on the inductive bias of factorization, which is the idea that the physics governing different energy scales can be treated as approximately independent. Assuming factorization holds separately for signal and background processes, the appearance of non-trivial correlations between low- and high-energy observables is a robust indicator of new physics. Under the most restrictive form of factorization, a machine-learned model trained to identify such correlations will in fact converge to the optimal new physics classifier. We test FORCE on a benchmark anomaly detection task for the Large Hadron Collider involving collimated sprays of particles called jets. By teasing out correlations between the kinematics and substructure of jets, our method can reliably extract percent-level signal fractions. This strategy for uncovering new physics adds to the growing toolbox of anomaly detection methods for collider physics with a complementary set of assumptions.

Safe but Incalculable: Energy-weighting is not all you need
Samuel Bright-Thonney, Benjamin Nachman, Jesse Thaler
Physical Review D, 2024, Volume 110, Issue 1 [ arXiv:2311.07652 ]

Abstract Infrared and collinear (IRC) safety has long been used as a proxy for robustness when developing new jet substructure observables. This guiding philosophy has been carried into the deep learning era, where IRC-safe neural networks have been used for many jet studies. For graph-based neural networks, the most straightforward way to achieve IRC safety is to weight particle inputs by their energies. However, energy-weighting by itself does not guarantee that perturbative calculations of machine-learned observables will enjoy small non-perturbative corrections. In this paper, we demonstrate the sensitivity of IRC-safe networks to non-perturbative effects, by training an energy flow network (EFN) to maximize its sensitivity to hadronization. We then show how to construct Lipschitz Energy Flow Networks (L-EFNs), which are both IRC safe and relatively insensitive to non-perturbative corrections. We demonstrate the performance of L-EFNs on generated samples of quark and gluon jets, and showcase fascinating differences between the learned latent representations of EFNs and L-EFNs.

T-Duality and Flavor Symmetries in Little String Theories
Hamza Ahmed, Paul-Konstantin Oehlmann, Fabian Ruehle
Journal of High Energy Physics, Volume 2024, article number 61 [ arXiv:2311.02168 ]

Abstract We explore the T-duality web of 6D Heterotic Little String Theories, focusing on flavor algebra reducing deformations. A careful analysis of the full flavor algebra, including Abelian factors, shows that the flavor rank is preserved under T-duality. This suggests a new T-duality invariant in addition to the Coulomb branch dimension and the two-group structure constants. We also engineer Little String Theories with non-simply laced flavor algebras, whose appearance we attribute to certain discrete 3-form fluxes in M-theory. Geometrically, these theories are engineered in F-theory with non-Kähler favorable K3 fibers. This geometric origin leads us to propose that freezing fluxes are preserved across T-duality. Along the way, we discuss various exotic models, including two inequivalent Spin(32)/ℤ2 models that are dual to the same E8×E8 theory, and a family of self-T-dual models.

Metric Flows with Neural Networks
James Halverson, Fabian Ruehle
Machine Learning: Science and Technology, 2024, Volume 5, Number 4 [ arXiv:2310.19870 ]

Abstract We develop a theory of flows in the space of Riemannian metrics induced by neural network gradient descent. This is motivated in part by recent advances in approximating Calabi-Yau metrics with neural networks and is enabled by recent advances in understanding flows in the space of neural networks. We derive the corresponding metric flow equations, which are governed by a metric neural tangent kernel, a complicated, non-local object that evolves in time. However, many architectures admit an infinite-width limit in which the kernel becomes fixed and the dynamics simplify. Additional assumptions can induce locality in the flow, which allows for the realization of Perelman's formulation of Ricci flow that was used to resolve the 3d Poincaré conjecture. We apply these ideas to numerical Calabi-Yau metrics, including a discussion on the importance of feature learning.

Functional renormalization group for signal detection and stochastic ergodicity breaking
Harold Erbin, Riccardo Finotello, Bio Wahabou Kpera, Vincent Lahoche, Dine Ousmane Samary
Journal of Statistical Mechanics: Theory and Experiment, Volume 2024 [ arXiv:2310.07499 ]

Abstract Signal detection is one of the main challenges of data science. As often happens in data analysis, the signal in the data may be corrupted by noise. There is a wide range of techniques aimed at extracting the relevant degrees of freedom from data. However, some problems remain difficult. This is notably the case for signal detection in almost continuous spectra when the signal-to-noise ratio is small enough. This paper follows a recent bibliographic line which tackles this issue with field-theoretical methods. Previous analysis focused on equilibrium Boltzmann distributions for some effective field representing the degrees of freedom of data. It was possible to establish a relation between signal detection and ℤ2-symmetry breaking. In this paper, we consider a stochastic field framework inspired by the so-called 'Model A', and show that the ability to reach or not an equilibrium state is correlated with the shape of the dataset. In particular, studying the renormalization group of the model, we show that the weak ergodicity prescription is always broken for signals small enough, when the data distribution is close to the Marchenko-Pastur (MP) law. This, in particular, enables the definition of a detection threshold in the regime where the signal-to-noise ratio is small enough.

Advances in machine-learning-based sampling motivated by lattice quantum chromodynamics
Kyle Cranmer, Gurtej Kanwar, Sébastien Racanière, Danilo J. Rezende, Phiala E. Shanahan
Nature Reviews Physics, 2023, Volume 5 [ arXiv:2309.01156 ]

Abstract Sampling from known probability distributions is a ubiquitous task in computational science, underlying calculations in domains from linguistics to biology and physics. Generative machine-learning (ML) models have emerged as a promising tool in this space, building on the success of this approach in applications such as image, text, and audio generation. Often, however, generative tasks in scientific domains have unique structures and features -- such as complex symmetries and the requirement of exactness guarantees -- that present both challenges and opportunities for ML. This Perspective outlines the advances in ML-based sampling motivated by lattice quantum field theory, in particular for the theory of quantum chromodynamics. Enabling calculations of the structure and interactions of matter from our most fundamental understanding of particle physics, lattice quantum chromodynamics is one of the main consumers of open-science supercomputing worldwide. The design of ML algorithms for this application faces profound challenges, including the necessity of scaling custom ML architectures to the largest supercomputers, but also promises immense benefits, and is spurring a wave of development in ML-based sampling more broadly. In lattice field theory, if this approach can realize its early promise it will be a transformative step towards first-principles physics calculations in particle, nuclear and condensed matter physics that are intractable with traditional approaches.

Signal-to-noise improvement through neural network contour deformations for 3D SU(2) lattice gauge theory
William Detmold, Gurtej Kanwar, Yin Lin, Phiala E. Shanahan, Michael L. Wagman
[ arXiv:2309.00600 ]

Abstract Complex contour deformations of the path integral have been demonstrated to significantly improve the signal-to-noise ratio of observables in previous studies of two-dimensional gauge theories with open boundary conditions. In this work, new developments based on gauge fixing and a neural network definition of the deformation are introduced, which enable an effective application to theories in higher dimensions and with generic boundary conditions. Improvements of the signal-to-noise ratio by up to three orders of magnitude for Wilson loop measurements are shown in SU(2) lattice gauge theory in three spacetime dimensions.

Reconstructing S-matrix Phases with Machine Learning
Aurélien Dersy, Matthew D. Schwartz, Alexander Zhiboedov
Journal of High Energy Physics, Volume 2024, Article 200 [ arXiv:2308.09451 | code ]

Abstract An important element of the S-matrix bootstrap program is the relationship between the modulus of an S-matrix element and its phase. Unitarity relates them by an integral equation. Even in the simplest case of elastic scattering, this integral equation cannot be solved analytically and numerical approaches are required. We apply modern machine learning techniques to studying the unitarity constraint. We find that for a given modulus, when a phase exists it can generally be reconstructed to good accuracy with machine learning. Moreover, the loss of the reconstruction algorithm provides a good proxy for whether a given modulus can be consistent with unitarity at all. In addition, we study the question of whether multiple phases can be consistent with a single modulus, finding novel phase-ambiguous solutions. In particular, we find a new phase-ambiguous solution which pushes the known limit on such solutions significantly beyond the previous bound.

Gravitational action for a massive Majorana fermion in 2d quantum gravity
Corinne de Lacroix, Harold Erbin, Vincent Lahoche
Journal of High Energy Physics, 2024, Volume 2024, Article number 68 [ arXiv:2308.08342 ]

Abstract We compute the gravitational action of a free massive Majorana fermion coupled to two-dimensional gravity on compact Riemann surfaces of arbitrary genus. The structure is similar to the case of the massive scalar. The small-mass expansion of the gravitational action yields the Liouville action at zeroth order, and we can identify the Mabuchi action at first order. While the massive Majorana action is a conformal deformation of the massless Majorana CFT, we find an action different from the one given by the David-Distler-Kawai (DDK) ansatz.

Score-based Diffusion Models for Generating Liquid Argon Time Projection Chamber Images
Zeviel Imani, Shuchin Aeron, Taritree Wongjirad
[ arXiv:2307.13687 | code ]

Abstract We show, for the first time, high-fidelity generation of LArTPC-like data using a generative neural network. This demonstrates that methods developed for natural images do transfer to LArTPC-produced images, which, in contrast to natural images, are globally sparse but locally dense. We present the method we employ, which is a variant of score-based generative diffusion models. We evaluate the fidelity of the generated images using several different approaches that include using a variant of measures used to evaluate natural images, comparisons between high-dimensional distributions, and comparisons relevant to LArTPC experiments.

Neural Network Field Theories: Non-Gaussianity, Actions, and Locality
Mehmet Demirtas, James Halverson, Anindita Maiti, Matthew D. Schwartz, Keegan Stoner
[ arXiv:2307.03223 ]

Abstract Both the path integral measure in field theory and ensembles of neural networks describe distributions over functions. When the central limit theorem can be applied in the infinite-width (infinite-N) limit, the ensemble of networks corresponds to a free field theory. Although an expansion in 1/N corresponds to interactions in the field theory, others, such as in a small breaking of the statistical independence of network parameters, can also lead to interacting theories. These other expansions can be advantageous over the 1/N-expansion, for example by improved behavior with respect to the universal approximation theorem. Given the connected correlators of a field theory, one can systematically reconstruct the action order-by-order in the expansion parameter, using a new Feynman diagram prescription whose vertices are the connected correlators. This method is motivated by the Edgeworth expansion and allows one to derive actions for neural network field theories. Conversely, the correspondence allows one to engineer architectures realizing a given field theory by representing action deformations as deformations of neural network parameter densities. As an example, ϕ^4 theory is realized as an infinite-N neural network field theory.

Hierarchical Neural Simulation-Based Inference Over Event Ensembles
Lukas Heinrich, Siddharth Mishra-Sharma, Chris Pollard, Philipp Windischhofer
[ arXiv:2306.12584 | code ]

Abstract When analyzing real-world data it is common to work with event ensembles, which comprise sets of observations that collectively constrain the parameters of an underlying model of interest. Such models often have a hierarchical structure, where "local" parameters impact individual events and "global" parameters influence the entire dataset. We introduce practical approaches for optimal dataset-wide probabilistic inference in cases where the likelihood is intractable, but simulations can be realized via forward modeling. We construct neural estimators for the likelihood(-ratio) or posterior and show that explicitly accounting for the model's hierarchical structure can lead to tighter parameter constraints. We ground our discussion using case studies from the physical sciences, focusing on examples from particle physics (particle collider data) and astrophysics (strong gravitational lensing observations).

Quantum Computation and Simulation using Fermion-Pair Registers
Xiangkai Sun, Di Luo, Soonwon Choi
[ arXiv:2306.03905 ]

Abstract We propose and analyze an approach to realize quantum computation and simulation using fermionic particles under quantum gas microscopes. Our work is inspired by a recent experimental demonstration of large-scale quantum registers, where tightly localized fermion pairs are used to encode qubits exhibiting long coherence time and robustness against laser intensity noise. We describe how to engineer the SWAP gate and high-fidelity controlled-phase gates by adjusting the fermion hopping as well as Feshbach interaction strengths. Combined with previously demonstrated single-qubit rotations, these gates establish the computational universality of the system. Furthermore, we show that 2D quantum Ising Hamiltonians with tunable transverse and longitudinal fields can be efficiently simulated by modulating Feshbach interaction strengths. We present a sample-efficient protocol to characterize engineered gates and Hamiltonian dynamics based on an improved classical shadow process tomography that requires minimal experimental controls. Our work opens up new opportunities to harness existing ultracold quantum gases for quantum information sciences.

Constraint of pionless EFT using two-nucleon spectra from lattice QCD
William Detmold, Fernando Romero-López, Phiala E. Shanahan
[ arXiv:2305.06313 ]

Abstract Finite-volume pionless effective field theory (FVEFTπ̸) at next-to-leading order (NLO) is used to analyze the two-nucleon lattice QCD spectrum of Ref. [Amarasinghe:2021lqa], computed at quark masses corresponding to a pion mass of approximately 800 MeV. Specifically, the effective theory is formulated in finite volume, and variational sets of wave functions are optimized using differentiable programming. Using these wave functions projected to the appropriate finite-volume symmetry group, variational bounds from FVEFTπ̸ are obtained for the ground state, as well as excited states. By comparison with the lattice QCD GEVP spectrum, different low energy constants (LECs) are constrained. Relativistic corrections are incorporated, allowing for the extraction of NLO LECs, as well as the leading s-d-wave mixing term in the deuteron channel.

A Spectral Metric for Collider Geometry
Andrew J. Larkoski, Jesse Thaler
Journal of High Energy Physics 2023, Volume 2023, article number 107 [ arXiv:2305.03751 ]

Abstract By quantifying the distance between two collider events, one can triangulate a metric space and reframe collider data analysis as computational geometry. One popular geometric approach is to first represent events as an energy flow on an idealized celestial sphere and then define the metric in terms of optimal transport in two dimensions. In this paper, we advocate for representing events in terms of a spectral function that encodes pairwise particle angles and products of particle energies, which enables a metric distance defined in terms of one-dimensional optimal transport. This approach has the advantage of automatically incorporating obvious isometries of the data, like rotations about the colliding beam axis. It also facilitates first-principles calculations, since there are simple closed-form expressions for optimal transport in one dimension. Up to isometries and event sets of measure zero, the spectral representation is unique, so the metric on the space of spectral functions is a metric on the space of events. At lowest order in perturbation theory in electron-positron collisions, our metric is simply the summed squared invariant masses of the two event hemispheres. Going to higher orders, we present predictions for the distribution of metric distances between jets in fixed-order and resummed perturbation theory as well as in parton-shower generators. Finally, we speculate on whether the spectral approach could furnish a useful metric on the space of quantum field theories.
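
The closed-form character of one-dimensional optimal transport mentioned above can be illustrated with a minimal sketch. It is a generic textbook result for two equal-size, equal-weight samples on the real line, not the spectral metric defined in the paper; the function name `wasserstein_1d` and its parameters are hypothetical.

```python
def wasserstein_1d(xs, ys, p=2):
    """p-Wasserstein distance between two equal-size, equal-weight 1D samples.

    In one dimension the optimal transport plan pairs the i-th smallest
    point of one sample with the i-th smallest point of the other, so no
    optimization is needed: sort both samples and compare them pointwise.
    (Illustrative assumption: uniform weights and equal sample sizes.)
    """
    if len(xs) != len(ys) or len(xs) == 0:
        raise ValueError("samples must be non-empty and of equal size")
    xs_sorted = sorted(xs)
    ys_sorted = sorted(ys)
    # Average transport cost |x - y|^p under the sorted (monotone) pairing.
    cost = sum(abs(a - b) ** p for a, b in zip(xs_sorted, ys_sorted)) / len(xs)
    return cost ** (1.0 / p)


# Shifting every point by the same amount moves the sample by exactly that distance.
sample = [0.0, 1.0, 4.0]
shifted = [x + 3.0 for x in sample]
distance = wasserstein_1d(sample, shifted, p=2)
```

Sorting dominates the cost, so the distance is obtained in O(n log n) time, in contrast to the general optimization problem required in two or more dimensions.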

Normalizing flows for lattice gauge theory in arbitrary space-time dimension
Ryan Abbott, Michael S. Albergo, Aleksandar Botev, Denis Boyda, Kyle Cranmer, Daniel C. Hackett, Gurtej Kanwar, Alexander G.D.G. Matthews, Sébastien Racanière, Ali Razavi, Danilo J. Rezende, Fernando Romero-López, Phiala E. Shanahan, Julian M. Urban
[ arXiv:2305.02402 | code ]

Abstract Applications of normalizing flows to the sampling of field configurations in lattice gauge theory have so far been explored almost exclusively in two space-time dimensions. We report new algorithmic developments of gauge-equivariant flow architectures facilitating the generalization to higher-dimensional lattice geometries. Specifically, we discuss masked autoregressive transformations with tractable and unbiased Jacobian determinants, a key ingredient for scalable and asymptotically exact flow-based sampling algorithms. For concreteness, results from a proof-of-principle application to SU(3) lattice gauge theory in four space-time dimensions are reported.

Searching for ribbons with machine learning
Sergei Gukov, James Halverson, Ciprian Manolescu, Fabian Ruehle
[ arXiv:2304.09304 | code ]

Abstract We apply Bayesian optimization and reinforcement learning to a problem in topology: the question of when a knot bounds a ribbon disk. This question is relevant in an approach to disproving the four-dimensional smooth Poincaré conjecture; using our programs, we rule out many potential counterexamples to the conjecture. We also show that the programs are successful in detecting many ribbon knots in the range of up to 70 crossings.

Correlation function distributions for O(N) lattice field theories in the disordered phase
Cagin Yunus, William Detmold
[ arXiv:2304.03820 ]

Abstract Numerical computations in strongly-interacting quantum field theories are often performed using Monte-Carlo sampling methods. A key task in these calculations is to estimate the value of a given physical quantity from the distribution of stochastic samples that are generated using the Monte-Carlo method. Typically, the sample mean and sample variance are used to define the expectation values and uncertainties of computed quantities. However, the Monte-Carlo sample distribution contains more information than these basic properties and it is useful to investigate it more generally. In this work, the exact form of the probability distributions of two-point correlation functions at zero momentum in O(N) lattice field theories in the disordered phase and in infinite volume is determined. These distributions allow for a robust investigation of the efficacy of the Monte-Carlo sampling procedure and are also shown to allow improved estimators of the target physical quantity to be constructed. The theoretical expectations are shown to agree with numerical calculations in the O(2) model.

ANTN: Bridging Autoregressive Neural Networks and Tensor Networks for Quantum Many-Body Simulation
Zhuo Chen, Laker Newhouse, Eddie Chen, Di Luo, Marin Soljačić
[ arXiv:2304.01996 | code ]

Abstract Quantum many-body physics simulation has important impacts on understanding fundamental science and has applications to quantum materials design and quantum technology. However, due to the exponentially growing size of the Hilbert space with respect to the particle number, a direct simulation is intractable. While representing quantum states with tensor networks and neural networks are the two state-of-the-art methods for approximate simulations, each has its own limitations in terms of expressivity and optimization. To address these challenges, we develop a novel architecture, Autoregressive Neural TensorNet (ANTN), which bridges tensor networks and autoregressive neural networks. We show that Autoregressive Neural TensorNet parameterizes normalized wavefunctions with exact sampling, generalizes the expressivity of tensor networks and autoregressive neural networks, and inherits a variety of symmetries from autoregressive neural networks. We demonstrate our approach on the 2D J1-J2 Heisenberg model with different system sizes and coupling parameters, outperforming both tensor networks and autoregressive neural networks. Our work opens up new opportunities for both scientific simulations and machine learning applications.

Level Crossings, Attractor Points and Complex Multiplication
Hamza Ahmed, Fabian Ruehle
Journal of High Energy Physics, 2023, Volume 2023, Article number 164 [ arXiv:2304.00027 ]

Abstract We study the complex structure moduli dependence of the scalar Laplacian eigenmodes for one-parameter families of Calabi-Yau n-folds in P^{n+1}. It was previously observed that some eigenmodes get lighter while others get heavier as a function of these moduli, which leads to eigenvalue crossing. We identify the cause for this behavior for the torus. We then show that at points in a sublocus of complex structure moduli space where Laplacian eigenmodes cross, the torus has complex multiplication. We speculate that the generalization to arbitrary Calabi-Yau manifolds could be that level crossing is related to rank one attractor points. To test this, we compute the eigenmodes numerically for the quartic K3 and the quintic threefold, and match crossings to CM and attractor points in these varieties. To quantify the error of our numerical methods, we also study the dependence of the numerical spectrum on the quality of the Calabi-Yau metric approximation, the number of points sampled from the Calabi-Yau variety, the truncation of the eigenbasis, and the distance from degeneration points in complex structure moduli space.

Artificial intelligence for artificial materials: moiré atom
Di Luo, Aidan P. Reddy, Trithep Devakul, Liang Fu
[ arXiv:2303.08162 ]

Abstract Moiré engineering in atomically thin van der Waals heterostructures creates artificial quantum materials with designer properties. We solve the many-body problem of interacting electrons confined to a moiré superlattice potential minimum (the moiré atom) using a 2D fermionic neural network. We show that strong Coulomb interactions in combination with the anisotropic moiré potential lead to striking "Wigner molecule" charge density distributions observable with scanning tunneling microscopy.

Exploring the CP-violating Dashen phase in the Schwinger model with tensor networks
Lena Funcke, Karl Jansen, Stefan Kühn
Physical Review D, 2023, Volume 108, Issue 1 [ arXiv:2303.03799 ]

Abstract We numerically study the phase structure of the two-flavor Schwinger model with matrix product states, focusing on the (1+1)-dimensional analog of the CP-violating Dashen phase in QCD. We simulate the two-flavor Schwinger model around the point where the positive mass of one fermion flavor corresponds to the negative mass of the other fermion flavor, which is a sign-problem afflicted regime for conventional Monte Carlo techniques. Our results indicate that the model undergoes a CP-violating Dashen phase transition at this point, which manifests itself in abrupt changes of the average electric field and the analog of the pion condensate in the model. Studying the scaling of the bipartite entanglement entropy as a function of the volume, we find clear indications that this transition is not of first order.

Computational Mirror Symmetry
Mehmet Demirtas, Manki Kim, Liam McAllister, Jakob Moritz, Andres Rios-Tascon
Journal of High Energy Physics, 2024, Volume 2024, Article number 184 [ arXiv:2303.00757 ]

Abstract We present an efficient algorithm for computing the prepotential in compactifications of type II string theory on mirror pairs of Calabi-Yau threefolds in toric varieties. Applying this method, we exhibit the first systematic computation of genus-zero Gopakumar-Vafa invariants in compact threefolds with many moduli, including examples with up to 491 vector multiplets.

Q-Flow: Generative Modeling for Differential Equations of Open Quantum Dynamics with Normalizing Flows
Owen Dugan, Peter Y. Lu, Rumen Dangovski, Di Luo, Marin Soljačić
[ arXiv:2302.12235 ]

Abstract Studying the dynamics of open quantum systems holds the potential to enable breakthroughs both in fundamental physics and applications to quantum engineering and quantum computation. Due to the high-dimensional nature of the problem, customized deep generative neural networks have been instrumental in modeling the high-dimensional density matrix ρ, which is the key description for the dynamics of such systems. However, the complex-valued nature and normalization constraints of ρ, as well as its complicated dynamics, prohibit a seamless connection between open quantum systems and the recent advances in deep generative modeling. Here we lift that limitation by utilizing a reformulation of open quantum system dynamics to a partial differential equation (PDE) for a corresponding probability distribution Q, the Husimi Q function. Thus, we model the Q function seamlessly with off-the-shelf deep generative models such as normalizing flows. Additionally, we develop novel methods for learning normalizing flow evolution governed by high-dimensional PDEs, based on the Euler method and the application of the time-dependent variational principle. We name the resulting approach Q-Flow and demonstrate the scalability and efficiency of Q-Flow on open quantum system simulations, including the dissipative harmonic oscillator and the dissipative bosonic model. Q-Flow is superior to conventional PDE solvers and state-of-the-art physics-informed neural network solvers, especially in high-dimensional systems.

SHAPER: Can You Hear the Shape of a Jet?
Demba Ba, Akshunna S. Dogra, Rikab Gambhir, Abiy Tasissa, Jesse Thaler
Journal of High Energy Physics, 2023, Volume 2023, Article 195 [ arXiv:2302.12266 | code ]

Abstract The identification of interesting substructures within jets is an important tool for searching for new physics and probing the Standard Model at colliders. Many of these substructure tools have previously been shown to take the form of optimal transport problems, in particular the Energy Mover's Distance (EMD). In this work, we show that the EMD is in fact the natural structure for comparing collider events, which accounts for its recent success in understanding event and jet substructure. We then present a Shape Hunting Algorithm using Parameterized Energy Reconstruction (SHAPER), which is a general framework for defining and computing shape-based observables. SHAPER generalizes N-jettiness from point clusters to any extended, parametrizable shape. This is accomplished by efficiently minimizing the EMD between events and parameterized manifolds of energy flows representing idealized shapes, implemented using the dual-potential Sinkhorn approximation of the Wasserstein metric. We show how the geometric language of observables as manifolds can be used to define novel observables with built-in infrared-and-collinear safety. We demonstrate the efficacy of the SHAPER framework by performing empirical jet substructure studies using several examples of new shape-based observables.

Geometry of contact: contact planning for multi-legged robots via spin models duality
Baxi Chong, Di Luo, Tianyu Wang, Gabriel Margolis, Juntao He, Pulkit Agrawal, Marin Soljačić, Daniel I. Goldman
[ arXiv:2302.03019 ]

Abstract Contact planning is crucial in locomoting systems. Specifically, appropriate contact planning can enable versatile behaviors (e.g., sidewinding in limbless locomotors) and facilitate speed-dependent gait transitions (e.g., walk-trot-gallop in quadrupedal locomotors). The challenges of contact planning include determining not only the sequence by which contact is made and broken between the locomotor and the environment, but also the sequence of internal shape changes (e.g., body bending and limb shoulder joint oscillation). Most state-of-the-art contact planning algorithms focus on conventional robots (e.g., biped and quadruped) and conventional tasks (e.g., forward locomotion), and there is a lack of study on general contact planning in multi-legged robots. In this paper, we show that, using a geometric mechanics framework, we can obtain the globally optimal contact sequence given the sequence of internal shape changes. Therefore, we simplify the contact planning problem to a graph optimization problem to identify the internal shape changes. Taking advantage of the spatio-temporal symmetry in locomotion, we map the graph optimization problem to special cases of spin models, which allows us to obtain the global optima in polynomial time. We apply our approach to develop new forward and sidewinding behaviors in a hexapod and a 12-legged centipede. We verify our predictions using numerical and robophysical models, and obtain novel and effective locomotion behaviors.

EPiC-GAN: Equivariant Point Cloud Generation for Particle Jets
Erik Buhmann, Gregor Kasieczka, Jesse Thaler
SciPost Physics, 2023, Volume 15, Issue 4 [ arXiv:2301.08128 | code ]

Abstract With the vast data-collecting capabilities of current and future high-energy collider experiments, there is an increasing demand for computationally efficient simulations. Generative machine learning models enable fast event generation, yet so far these approaches are largely constrained to fixed data structures and rigid detector geometries. In this paper, we introduce EPiC-GAN (equivariant point cloud generative adversarial network), which can produce point clouds of variable multiplicity. This flexible framework is based on deep sets and is well suited for simulating sprays of particles called jets. The generator and discriminator utilize multiple EPiC layers with an interpretable global latent vector. Crucially, the EPiC layers do not rely on pairwise information sharing between particles, which leads to a significant speed-up over graph- and transformer-based approaches with more complex relation diagrams. We demonstrate that EPiC-GAN scales well to large particle multiplicities and achieves high generation fidelity on benchmark jet generation tasks.

Comparing Point Cloud Strategies for Collider Event Classification
Peter Onyisi, Delon Shen, Jesse Thaler
Physical Review D, 2023, Volume 108, Issue 1 [ arXiv:2212.10659 | code ]

Abstract In this paper, we compare several event classification architectures defined on the point cloud representation of collider events. These approaches, which are based on the frameworks of deep sets and edge convolutions, circumvent many of the difficulties associated with traditional feature engineering. To benchmark our architectures against more traditional event classification strategies, we perform a case study involving Higgs boson decays to tau leptons. We find a 2.5 times increase in performance compared to a baseline ATLAS analysis with engineered features. Our point cloud architectures can be viewed as simplified versions of graph neural networks, where each particle in the event corresponds to a graph node. In our case study, we find the best balance of performance and computational cost for simple pairwise architectures, which are based on learned edge features.

Simulating 2+1D Lattice Quantum Electrodynamics at Finite Density with Neural Flow Wavefunctions
Zhuo Chen, Di Luo, Kaiwen Hu, Bryan K. Clark
[ arXiv:2212.06835 ]

Abstract We present a neural flow wavefunction, Gauge-Fermion FlowNet, and use it to simulate 2+1D lattice compact quantum electrodynamics with finite density dynamical fermions. The gauge field is represented by a neural network which parameterizes a discretized flow-based transformation of the amplitude while the fermionic sign structure is represented by a neural net backflow. This approach directly represents the U(1) degree of freedom without any truncation, obeys Gauss's law by construction, samples autoregressively avoiding any equilibration time, and variationally simulates Gauge-Fermion systems with sign problems accurately. In this model, we investigate confinement and string breaking phenomena in different fermion density and hopping regimes. We study the phase transition from the charge crystal phase to the vacuum phase at zero density, and observe the phase separation and the net charge penetration blocking effect under magnetic interaction at finite density. In addition, we investigate a magnetic phase transition due to the competition effect between the kinetic energy of fermions and the magnetic energy of the gauge field. With our method, we further note potential differences on the order of the phase transitions between a continuous U(1) system and one with finite truncation. Our state-of-the-art neural network approach opens up new possibilities to study different gauge theories coupled to dynamical matter in higher dimensions.

Yang-Mills glueball masses from spectral reconstruction
Jan M. Pawlowski, Coralie S. Schneider, Jonas Turnwald, Julian M. Urban, Nicolas Wink
Physical Review D, 2023, Volume 108, Issue 7 [ arXiv:2212.01113 ]

Abstract We compute masses of the two lightest glueballs from spectral reconstructions of timelike interaction channels of the four-gluon vertex in Landau gauge Yang-Mills theory. The Euclidean spacelike dressings of the vertex are calculated with the functional renormalisation group. For the spectral reconstruction of these Euclidean data, we employ Gaussian process regression. The glueball resonances can be identified straightforwardly and we obtain m_sc = 1870(75) MeV as well as m_ps = 2700(120) MeV, in accordance with functional bound state and lattice calculations.

Characterizing 4-string contact interaction using machine learning
Harold Erbin, Atakan Hilmi Fırat
Journal of High Energy Physics, 2024, Article 16 [ arXiv:2211.09129 | code ]

Abstract The geometry of 4-string contact interaction of closed string field theory is characterized using machine learning. We obtain Strebel quadratic differentials on 4-punctured spheres as a neural network by performing unsupervised learning with a custom-built loss function. This allows us to solve for local coordinates and compute their associated mapping radii numerically. We also train a neural network distinguishing vertex from Feynman region. As a check, 4-tachyon contact term in the tachyon potential is computed and a good agreement with the results in the literature is observed. We argue that our algorithm is manifestly independent of number of punctures and scaling it to characterize the geometry of n-string contact interaction is feasible.

Aspects of scaling and scalability for flow-based sampling of lattice QCD
Ryan Abbott, Michael S. Albergo, Aleksandar Botev, Denis Boyda, Kyle Cranmer, Daniel C. Hackett, Alexander G. D. G. Matthews, Sébastien Racanière, Ali Razavi, Danilo J. Rezende, Fernando Romero-López, Phiala E. Shanahan, Julian M. Urban
The European Physical Journal A, 2023, Volume 59, Article Number 257 [ arXiv:2211.07541 ]

Abstract Recent applications of machine-learned normalizing flows to sampling in lattice field theory suggest that such methods may be able to mitigate critical slowing down and topological freezing. However, these demonstrations have been at the scale of toy models, and it remains to be determined whether they can be applied to state-of-the-art lattice quantum chromodynamics calculations. Assessing the viability of sampling algorithms for lattice field theory at scale has traditionally been accomplished using simple cost scaling laws, but as we discuss in this work, their utility is limited for flow-based approaches. We conclude that flow-based approaches to sampling are better thought of as a broad family of algorithms with different scaling properties, and that scalability must be assessed experimentally.

Gauge Equivariant Neural Networks for 2+1D U(1) Gauge Theory Simulations in Hamiltonian Formulation
Di Luo, Shunyue Yuan, James Stokes, Bryan K. Clark
[ arXiv:2211.03198 ]

Abstract Gauge Theory plays a crucial role in many areas in science, including high energy physics, condensed matter physics and quantum information science. In quantum simulations of lattice gauge theory, an important step is to construct a wave function that obeys gauge symmetry. In this paper, we have developed gauge equivariant neural network wave function techniques for simulating continuous-variable quantum lattice gauge theories in the Hamiltonian formulation. We have applied the gauge equivariant neural network approach to find the ground state of 2+1-dimensional lattice gauge theory with U(1) gauge group using variational Monte Carlo. We have benchmarked our approach against the state-of-the-art complex Gaussian wave functions, demonstrating improved performance in the strong coupling regime and comparable results in the weak coupling regime.

QuACK: Accelerating Gradient-Based Quantum Optimization with Koopman Operator Learning
Di Luo, Jiayu Shen, Rumen Dangovski, Marin Soljačić
[ arXiv:2211.01365 ]

Abstract Finding efficient optimization methods plays an important role for quantum optimization and quantum machine learning on near-term quantum computers. While backpropagation on classical computers is computationally efficient, obtaining gradients on quantum computers is not, because the computational complexity usually scales with the number of parameters and measurements. In this paper, we connect Koopman operator theory, which has been successful in predicting nonlinear dynamics, with natural gradient methods in quantum optimization. We propose a data-driven approach using Koopman operator learning to accelerate quantum optimization and quantum machine learning. We develop two new families of methods: the sliding window dynamic mode decomposition (DMD) and the neural DMD for efficiently updating parameters on quantum computers. We show that our methods can predict gradient dynamics on quantum computers and accelerate the variational quantum eigensolver used in quantum optimization, as well as quantum machine learning. We further implement our Koopman operator learning algorithm on a real IBM quantum computer and demonstrate its practical effectiveness.
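The core least-squares step behind dynamic mode decomposition can be sketched in a few lines: fit a linear operator K that advances recent parameter snapshots by one step, then iterate K to forecast further updates without new gradient evaluations. The sketch below is only a toy under stated assumptions, not the paper's method: it uses a two-parameter trajectory, takes "sliding window" to mean simply the most recent snapshots, and omits the neural DMD variant. All function names are hypothetical.

```python
def apply(k, x):
    """Multiply a 2x2 matrix k by a 2-vector x."""
    return [k[0][0] * x[0] + k[0][1] * x[1],
            k[1][0] * x[0] + k[1][1] * x[1]]


def fit_linear_operator(snapshots):
    """Least-squares fit of a 2x2 matrix K with x[t+1] ~ K x[t].

    Solves K = (sum y x^T) (sum x x^T)^{-1} over consecutive pairs (x, y).
    """
    g = [[0.0, 0.0], [0.0, 0.0]]  # sum of x x^T
    c = [[0.0, 0.0], [0.0, 0.0]]  # sum of y x^T
    for x, y in zip(snapshots[:-1], snapshots[1:]):
        for i in range(2):
            for j in range(2):
                g[i][j] += x[i] * x[j]
                c[i][j] += y[i] * x[j]
    det = g[0][0] * g[1][1] - g[0][1] * g[1][0]
    if abs(det) < 1e-14:
        raise ValueError("snapshots do not span the parameter space")
    g_inv = [[g[1][1] / det, -g[0][1] / det],
             [-g[1][0] / det, g[0][0] / det]]
    return [[sum(c[i][m] * g_inv[m][j] for m in range(2)) for j in range(2)]
            for i in range(2)]


def predict(k, x, steps):
    """Iterate the fitted operator to forecast future snapshots."""
    out = []
    for _ in range(steps):
        x = apply(k, x)
        out.append(x)
    return out


# Toy trajectory generated by a known linear map (stands in for optimizer iterates).
true_k = [[0.9, 0.1], [0.0, 0.8]]
trajectory = [[1.0, 1.0]]
for _ in range(6):
    trajectory.append(apply(true_k, trajectory[-1]))

window = trajectory[-5:]            # most recent snapshots only
k_fit = fit_linear_operator(window)
print(predict(k_fit, trajectory[-1], steps=3))
```

For real optimizer trajectories the dynamics are nonlinear, so a fit like this is only trustworthy for a limited number of forecast steps before it should be refreshed with newly measured snapshots.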

Large-time correlation functions in bosonic lattice field theories
Cagin Yunus, William Detmold
Physics Letters B, 2023, Volume 840, 137890 [ arXiv:2210.15789 ]

Abstract Large-time correlation functions have a pivotal role in extracting particle masses from Euclidean lattice field theory calculations; however, little is known about the statistical properties of these quantities. In this work, the asymptotic form of the distributions of the correlation functions at vanishing momentum is determined for bosonic interacting lattice field theories with a unique gapped vacuum. It is demonstrated that the deviations from the asymptotic form at large Euclidean times can be utilized to determine the spectrum of the theory.

Deep Learning for Bayesian Optimization of Scientific Problems with High-Dimensional Structure
Samuel Kim, Peter Y. Lu, Charlotte Loh, Jamie Smith, Jasper Snoek, Marin Soljačić
Transactions on Machine Learning Research, 2022

Abstract Bayesian optimization (BO) is a popular paradigm for global optimization of expensive black-box functions, but there are many domains where the function is not completely a black-box. The data may have some known structure (e.g. symmetries) and/or the data generation process may be a composite process that yields useful intermediate or auxiliary information in addition to the value of the optimization objective. However, surrogate models traditionally employed in BO, such as Gaussian Processes (GPs), scale poorly with dataset size and do not easily accommodate known structure. Instead, we use Bayesian neural networks, a class of scalable and flexible surrogate models with inductive biases, to extend BO to complex, structured problems with high dimensionality. We demonstrate BO on a number of realistic problems in physics and chemistry, including topology optimization of photonic crystal materials using convolutional neural networks, and chemical property optimization of molecules using graph neural networks. On these complex tasks, we show that neural networks often outperform GPs as surrogate models for BO in terms of both sampling efficiency and computational cost.

Data-driven Acceleration of Quantum Optimization and Machine Learning via Koopman Operator Learning
Di Luo, Jiayu Shen, Rumen Dangovski, Marin Soljačić
NeurIPS 2022 Workshop AI4Science

Abstract Efficient optimization methods play a crucial role for quantum optimization and machine learning on near-term quantum computers. Unlike classical computers, obtaining gradients on quantum computers is costly with sample complexity scaling with the number of parameters and measurements. In this paper, we connect the natural gradient method in quantum optimization with Koopman operator theory, which provides a powerful framework for predicting nonlinear dynamics. We propose a data-driven approach for accelerating quantum optimization and machine learning via Koopman operator learning. To predict parameter updates on quantum computers, we develop new methods including the sliding window dynamic mode decomposition (DMD) and the neural-network-based DMD. We apply our methods both on simulations and real quantum hardware. We demonstrate efficient prediction and acceleration of gradient optimization on the variational quantum eigensolver and quantum machine learning.

Symmetries of Calabi-Yau Prepotentials with Isomorphic Flops
Andre Lukas, Fabian Ruehle
Journal of High Energy Physics, 2023, Article 175 [ arXiv:2210.09369 ]

Abstract Calabi-Yau threefolds with infinitely many flops to isomorphic manifolds have an extended Kähler cone made up from an infinite number of individual Kähler cones. These cones are related by reflection symmetries across flop walls. We study the implications of this cone structure for mirror symmetry, by considering the instanton part of the prepotential in Calabi-Yau threefolds. We show that such isomorphic flops across facets of the Kähler cone boundary give rise to symmetry groups isomorphic to Coxeter groups. In the dual Mori cone, non-flopping curve classes that are identified under these groups have the same Gopakumar-Vafa invariants. This leads to instanton prepotentials invariant under Coxeter groups, which we make manifest by introducing appropriate invariant functions. For some cases, these functions can be expressed in terms of theta functions whose appearance can be linked to an elliptic fibration structure of the Calabi-Yau manifold.

Electric-Magnetic Duality in a Class of G2-Compactifications of M-theory
James Halverson, Benjamin Sung, Jiahua Tian
Journal of High Energy Physics, 2023, Volume 2023, Article 89 [ arXiv:2210.08628 ]

Abstract We study electric-magnetic duality in compactifications of M-theory on twisted connected sum (TCS) G2 manifolds via duality with F-theory. Specifically, we study the physics of the D3-branes in F-theory compactified on a Calabi-Yau fourfold Y, dual to a compactification of M-theory on a TCS G2 manifold X. N=2 supersymmetry is restored in an appropriate geometric limit. In that limit, we demonstrate that the dual of D3-branes probing seven-branes corresponds to the shrinking of certain surfaces and curves, yielding light particles that may carry both electric and magnetic charges. We provide evidence that the Minahan-Nemeschansky theories with E_n flavor symmetry may be realized in this way. The SL(2,ℤ) monodromy of the 3/7-brane system is dual to a Fourier-Mukai transform of the dual IIA/M-theory geometry in this limit, and we extrapolate this monodromy action to the global compactification. Away from the limit, the theory is broken to N=1 supersymmetry by a D-term.

Learning to Optimize Quasi-Newton Methods
Isaac Liao, Rumen R. Dangovski, Jakob N. Foerster, Marin Soljačić
Transactions on Machine Learning Research, 2023 [ arXiv:2210.06171 ]

Abstract Fast gradient-based optimization algorithms have become increasingly essential for the computationally efficient training of machine learning models. One technique is to multiply the gradient by a preconditioner matrix to produce a step, but it is unclear what the best preconditioner matrix is. This paper introduces a novel machine learning optimizer called LODO, which tries to online meta-learn the best preconditioner during optimization. Specifically, our optimizer merges Learning to Optimize (L2O) techniques with quasi-Newton methods to learn preconditioners parameterized as neural networks; they are more flexible than preconditioners in other quasi-Newton methods. Unlike other L2O methods, LODO does not require any meta-training on a training task distribution, and instead learns to optimize on the fly while optimizing on the test task, adapting to the local characteristics of the loss landscape while traversing it. Theoretically, we show that our optimizer approximates the inverse Hessian in noisy loss landscapes and is capable of representing a wide range of inverse Hessians. We experimentally verify that our algorithm can optimize in noisy settings, and show that simpler alternatives for representing the inverse Hessians worsen performance. Lastly, we use our optimizer to train a semi-realistic deep neural network with 95k parameters at speeds comparable to those of standard neural network optimizers.
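The preconditioned step mentioned in the abstract above, x_new = x - P g, can be made concrete on a small quadratic loss. The sketch below compares two fixed choices of P: the exact inverse Hessian and a scaled identity (plain gradient descent). It does not implement LODO or any learned preconditioner; the matrix, the vector, the step size, and the function names are assumptions chosen for illustration.

```python
def grad(a, b, x):
    """Gradient of f(x) = 0.5 x^T A x - b^T x, which is A x - b."""
    return [sum(a[i][j] * x[j] for j in range(2)) - b[i] for i in range(2)]


def preconditioned_step(x, g, p):
    """One update x_new = x - P g with preconditioner matrix P."""
    return [x[i] - sum(p[i][j] * g[j] for j in range(2)) for i in range(2)]


def inverse_2x2(a):
    """Closed-form inverse of a 2x2 matrix."""
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    return [[a[1][1] / det, -a[0][1] / det],
            [-a[1][0] / det, a[0][0] / det]]


A = [[4.0, 1.0], [1.0, 3.0]]   # Hessian of the toy quadratic (assumed)
b = [1.0, 2.0]
x0 = [0.0, 0.0]

# Preconditioner = exact inverse Hessian: one step lands on the minimizer.
x_newton = preconditioned_step(x0, grad(A, b, x0), inverse_2x2(A))

# Preconditioner = 0.1 * identity: ordinary gradient descent, still far away.
x_gd = preconditioned_step(x0, grad(A, b, x0), [[0.1, 0.0], [0.0, 0.1]])

print(x_newton, grad(A, b, x_newton))
print(x_gd, grad(A, b, x_gd))
```

On a quadratic the inverse Hessian is the ideal preconditioner, which is why quasi-Newton methods try to approximate it when the true Hessian is unavailable or too expensive to invert.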

On the Importance of Calibration in Semi-supervised Learning
Charlotte Loh, Rumen Dangovski, Shivchander Sudalairaj, Seungwook Han, Ligong Han, Leonid Karlinsky, Marin Soljačić, Akash Srivastava
[ arXiv:2210.04783 ]

Abstract State-of-the-art (SOTA) semi-supervised learning (SSL) methods have been highly successful in leveraging a mix of labeled and unlabeled data by combining techniques of consistency regularization and pseudo-labeling. During pseudo-labeling, the model's predictions on unlabeled data are used for training and thus, model calibration is important in mitigating confirmation bias. Yet, many SOTA methods are optimized for model performance, with little focus directed to improve model calibration. In this work, we empirically demonstrate that model calibration is strongly correlated with model performance and propose to improve calibration via approximate Bayesian techniques. We introduce a family of new SSL models that optimizes for calibration and demonstrate their effectiveness across standard vision benchmarks of CIFAR-10, CIFAR-100 and ImageNet, giving up to 15.9% improvement in test accuracy. Furthermore, we also demonstrate their effectiveness in additional realistic and challenging problems, such as class-imbalanced datasets and in photonics science.

Degeneracy Engineering for Classical and Quantum Annealing: A Case Study of Sparse Linear Regression in Collider Physics
Eric R. Anschuetz, Lena Funcke, Patrick T. Komiske, Serhii Kryhin, Jesse Thaler
Physical Review D, Volume 106, Article 056008 [ arXiv:2205.10375 ]

Abstract Classical and quantum annealing are computing paradigms that have been proposed to solve a wide range of optimization problems. In this paper, we aim to enhance the performance of annealing algorithms by introducing the technique of degeneracy engineering, through which the relative degeneracy of the ground state is increased by modifying a subset of terms in the objective Hamiltonian. We illustrate this novel approach by applying it to the example of ℓ0-norm regularization for sparse linear regression, which is in general an NP-hard optimization problem. Specifically, we show how to cast ℓ0-norm regularization as a quadratic unconstrained binary optimization (QUBO) problem, suitable for implementation on annealing platforms. As a case study, we apply this QUBO formulation to energy flow polynomials in high-energy collider physics, finding that degeneracy engineering substantially improves the annealing performance. Our results motivate the application of degeneracy engineering to a variety of regularized optimization problems.

Discovering Conservation Laws using Optimal Transport and Manifold Learning
Peter Y. Lu, Rumen Dangovski, Marin Soljačić
Nature Communications [ arXiv:2208.14995 ]

Abstract Conservation laws are key theoretical and practical tools for understanding, characterizing, and modeling nonlinear dynamical systems. However, for many complex dynamical systems, the corresponding conserved quantities are difficult to identify, making it hard to analyze their dynamics and build efficient, stable predictive models. Current approaches for discovering conservation laws often depend on detailed dynamical information, such as the equation of motion or fine-grained time measurements, with many recent proposals also relying on black box parametric deep learning methods. We instead reformulate this task as a manifold learning problem and propose a non-parametric approach, combining the Wasserstein metric from optimal transport with diffusion maps, to discover conserved quantities that vary across trajectories sampled from a dynamical system. We test this new approach on a variety of physical systems—including conservative Hamiltonian systems, dissipative systems, and spatiotemporal systems—and demonstrate that our manifold learning method is able to both identify the number of conserved quantities and extract their values. Using tools from optimal transport theory and manifold learning, our proposed method provides a direct geometric approach to identifying conservation laws that is both robust and interpretable without requiring an explicit model of the system nor accurate time information.

Sampling QCD field configurations with gauge-equivariant flow models
Ryan Abbott, Michael S. Albergo, Aleksandar Botev, Denis Boyda, Kyle Cranmer, Daniel C. Hackett, Gurtej Kanwar, Alexander G. D. G. Matthews, Sébastien Racanière, Ali Razavi, Danilo J. Rezende, Fernando Romero-López, Phiala E. Shanahan, Julian M. Urban
[ arXiv:2208.03832 ]

Abstract Machine learning methods based on normalizing flows have been shown to address important challenges, such as critical slowing-down and topological freezing, in the sampling of gauge field configurations in simple lattice field theories. A critical question is whether this success will translate to studies of QCD. This Proceedings presents a status update on advances in this area. In particular, it is illustrated how recently developed algorithmic components may be combined to construct flow-based sampling algorithms for QCD in four dimensions. The prospects and challenges for future use of this approach in at-scale applications are summarized.

Confinement in non-Abelian lattice gauge theory via persistent homology
Daniel Spitz, Julian M. Urban, Jan M. Pawlowski
Physical Review D, 2023, Volume 107, Issue 3 [ arXiv:2208.03955 ]

Abstract We investigate the structure of confining and deconfining phases in SU(2) lattice gauge theory via persistent homology, which gives us access to the topology of a hierarchy of combinatorial objects constructed from given data. Specifically, we use filtrations by traced Polyakov loops, topological densities, holonomy Lie algebra fields, as well as electric and magnetic fields. This allows for a comprehensive picture of confinement. In particular, topological densities form spatial lumps which show signatures of the classical probability distribution of instanton-dyons. Signatures of well-separated dyons located at random positions are encoded in holonomy Lie algebra fields, following the semi-classical temperature dependence of the instanton appearance probability. Debye screening discriminating between electric and magnetic fields is visible in persistent homology and pronounced at large gauge coupling. All employed constructions are gauge-invariant without a priori assumptions on the configurations under study. This work showcases the versatility of persistent homology for statistical and quantum physics studies, barely explored to date.

Gauge-equivariant flow models for sampling in lattice field theories with pseudofermions
Ryan Abbott, Michael S. Albergo, Denis Boyda, Kyle Cranmer, Daniel C. Hackett, Gurtej Kanwar, Sébastien Racanière, Danilo J. Rezende, Fernando Romero-López, Phiala E. Shanahan, Betsy Tian, Julian M. Urban
Physical Review D, 2022, Volume 106, Issue 7 [ arXiv:2207.08945 ]

Abstract This work presents gauge-equivariant architectures for flow-based sampling in fermionic lattice field theories using pseudofermions as stochastic estimators for the fermionic determinant. This is the default approach in state-of-the-art lattice field theory calculations, making this development critical to the practical application of flow models to theories such as QCD. Methods by which flow-based sampling approaches can be improved via standard techniques such as even/odd preconditioning and the Hasenbusch factorization are also outlined. Numerical demonstrations in two-dimensional U(1) and SU(3) gauge theories with Nf=2 flavors of fermions are provided.

Simplifying Polylogarithms with Machine Learning
Aurélien Dersy, Matthew D. Schwartz, Xiaoyuan Zhang
International Journal of Data Science in the Mathematical Sciences, Vol. 01, No. 02, pp. 135-179 (2023) [ arXiv:2206.04115 ]

Abstract Polylogarithmic functions, such as the logarithm or dilogarithm, satisfy a number of algebraic identities. For the logarithm, all the identities follow from the product rule. For the dilogarithm and higher-weight classical polylogarithms, the identities can involve five functions or more. In many calculations relevant to particle physics, complicated combinations of polylogarithms often arise from Feynman integrals. Although the initial expressions resulting from the integration usually simplify, it is often difficult to know which identities to apply and in what order. To address this bottleneck, we explore to what extent machine learning methods can help. We consider both a reinforcement learning approach, where the identities are analogous to moves in a game, and a transformer network approach, where the problem is viewed analogously to a language-translation task. While both methods are effective, the transformer network appears more powerful and holds promise for practical use in symbolic manipulation tasks in mathematical physics.
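To give a feel for the kind of identity the abstract above has in mind, the sketch below numerically checks one standard dilogarithm relation, Euler's reflection formula Li2(x) + Li2(1 - x) = π²/6 - ln(x) ln(1 - x) for 0 < x < 1. This is a textbook identity used here only as an example; it is not taken from the paper, and the series-based `li2` helper is an assumption that is adequate only well inside the unit interval.

```python
import math


def li2(x, terms=400):
    """Dilogarithm from its power series sum_{k>=1} x^k / k^2.

    Converges for |x| <= 1, but slowly near |x| = 1, so keep x well inside.
    """
    return sum(x ** k / k ** 2 for k in range(1, terms + 1))


def reflection_gap(x):
    """Difference between the two sides of Euler's reflection formula."""
    lhs = li2(x) + li2(1.0 - x)
    rhs = math.pi ** 2 / 6 - math.log(x) * math.log(1.0 - x)
    return lhs - rhs


for x in (0.2, 0.3, 0.5, 0.7):
    print(x, reflection_gap(x))
```

A symbolic simplifier has to discover which of many such relations to apply and in what order, which is the search problem the abstract frames as a game or as a translation task.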

Power Counting Energy Flow Polynomials
Pedro Cal, Jesse Thaler, Wouter J. Waalewijn
Journal of High Energy Physics, 2022, Article 21 [ arXiv:2205.06818 ]

Abstract Power counting is a systematic strategy for organizing collider observables and their associated theoretical calculations. In this paper, we use power counting to characterize a class of jet substructure observables called energy flow polynomials (EFPs). EFPs provide an overcomplete linear basis for infrared-and-collinear safe jet observables, but it is known that in practice, a small subset of EFPs is often sufficient for specific jet analysis tasks. By applying power counting arguments, we obtain linear relationships between EFPs that hold for quark and gluon jets to a specific order in the power counting. We test these relations in the parton shower generator Pythia, finding excellent agreement. Power counting allows us to truncate the basis of EFPs without affecting performance, which we corroborate through a study of quark-gluon tagging and regression.

Disentangling Quarks and Gluons with CMS Open Data
Patrick T. Komiske, Serhii Kryhin, Jesse Thaler
Physical Review D, 2022, Volume 106, Article 094021 [ arXiv:2205.04459 ]

Abstract We study quark and gluon jets separately using public collider data from the CMS experiment. Our analysis is based on 2.3/fb of proton-proton collisions at 7 TeV, collected at the Large Hadron Collider in 2011. We define two non-overlapping samples via a pseudorapidity cut -- central jets with |eta| < 0.65 and forward jets with |eta| > 0.65 -- and employ jet topic modeling to extract individual distributions for the maximally separable categories. Under certain assumptions, such as sample independence and mutual irreducibility, these categories correspond to "quark" and "gluon" jets, as given by a recently proposed operational definition. We consider a number of different methods for extracting reducibility factors from the central and forward datasets, from which the fractions of quark jets in each sample can be determined. The greatest stability and robustness to statistical uncertainties is achieved by a novel method based on parametrizing the endpoints of a receiver operating characteristic (ROC) curve. To mitigate detector effects, which would otherwise induce unphysical differences between central and forward jets, we use the OmniFold method to perform central value unfolding. As a demonstration of the power of this method, we extract the intrinsic dimensionality of the quark and gluon jet samples, which exhibit Casimir scaling, as expected from the strongly-ordered limit. To our knowledge, this work is the first application of full phase space unfolding to real collider data, and one of the first applications of topic modeling to extract separate quark and gluon distributions at the LHC.

Infinite Variance in Monte Carlo Sampling of Lattice Field Theories
Cagin Yunus, William Detmold
Physical Review D, Volume 106, Article 094506 [ arXiv:2205.01001 ]

Abstract In Monte Carlo calculations of expectation values in lattice quantum field theories, the stochastic variance of the sampling procedure that is used defines the precision of the calculation for a fixed number of samples. If the variance of an estimator of a particular quantity is formally infinite, or in practice very large compared to the square of the mean, then that quantity cannot be reliably estimated using the given sampling procedure. There are multiple scenarios in which this occurs, including in Lattice Quantum Chromodynamics, and a particularly simple example is given by the Gross-Neveu model where Monte Carlo calculations involve the introduction of auxiliary bosonic variables through a Hubbard-Stratonovich (HS) transformation. Here, it is shown that the variances of HS estimators for classes of operators involving fermion fields are divergent in this model and an even simpler zero-dimensional analogue. To correctly estimate these observables, two alternative sampling methods are proposed and numerically investigated.

Flow-based density of states for complex actions
Jan M. Pawlowski, Julian M. Urban
Physical Review D, 2023, Volume 108, Issue 5 [ arXiv:2203.01243 ]

Abstract Emerging sampling algorithms based on normalizing flows have the potential to solve ergodicity problems in lattice calculations. Furthermore, it has been noted that flows can be used to compute thermodynamic quantities which are difficult to access with traditional methods. This suggests that they are also applicable to the density-of-states approach to complex action problems. In particular, flow-based sampling may be used to compute the density directly, in contradistinction to the conventional strategy of reconstructing it via measuring and integrating the derivative of its logarithm. By circumventing this procedure, the accumulation of errors from the numerical integration is avoided completely and the overall normalization factor can be determined explicitly. In this proof-of-principle study, we demonstrate our method in the context of two-component scalar field theory where the O(2) symmetry is explicitly broken by an imaginary external field. First, we concentrate on the zero-dimensional case which can be solved exactly. We show that with our method, the Lee-Yang zeroes of the associated partition function can be successfully located. Subsequently, we confirm that the flow-based approach correctly reproduces the density computed with conventional methods in one- and two-dimensional models.

Creating Simple, Interpretable Anomaly Detectors for New Physics in Jet Substructure
Layne Bradshaw, Spencer Chang, Bryan Ostdiek
Physical Review D, 2022, Volume 106, Article 035014 [ arXiv:2203.01343 ]

Abstract Anomaly detection with convolutional autoencoders is a popular method to search for new physics in a model-agnostic manner. These techniques are powerful, but they are still a "black box," since we do not know what high-level physical observables determine how anomalous an event is. To address this, we adapt a recently proposed technique by Faucett et al., which maps out the physical observables learned by a neural network classifier, to the case of anomaly detection. We propose two different strategies that use a small number of high-level observables to mimic the decisions made by the autoencoder on background events. Despite the underlying differences in their approach, we find that both strategies have similar ordering performance as the autoencoder and independently use the same five high-level observables. From there, we compare the performance of these networks as anomaly detectors. We find that both strategies perform similarly to the autoencoder across a variety of signals, giving a nontrivial demonstration that learning to order background events transfers to ordering a variety of signal events.

Flow-based sampling in the lattice Schwinger model at criticality
Michael S. Albergo, Denis Boyda, Kyle Cranmer, Daniel C. Hackett, Gurtej Kanwar, Sébastien Racanière, Danilo J. Rezende, Fernando Romero-López, Phiala E. Shanahan, Julian M. Urban
Physical Review D, 2022, Volume 106, Article 014514 [ arXiv:2202.11712 ]

Abstract Recent results suggest that flow-based algorithms may provide efficient sampling of field distributions for lattice field theory applications, such as studies of quantum chromodynamics and the Schwinger model. In this work, we provide a numerical demonstration of robust flow-based sampling in the Schwinger model at the critical value of the fermion mass. In contrast, at the same parameters, conventional methods fail to sample all parts of configuration space, leading to severely underestimated uncertainties.

Identifying equivalent Calabi–Yau topologies: A discrete challenge from math and physics for machine learning
Vishnu Jejjala, Washington Taylor, Andrew Turner
[ arXiv:2202.07590 ]

Abstract We review briefly the characteristic topological data of Calabi–Yau threefolds and focus on the question of when two threefolds are equivalent through related topological data. This provides an interesting test case for machine learning methodology in discrete mathematics problems motivated by physics.

Finite-Volume Pionless Effective Field Theory for Few-Nucleon Systems with Differentiable Programming
Xiangkai Sun, William Detmold, Di Luo, Phiala E. Shanahan
[ arXiv:2202.03530 ]

Abstract Finite-volume pionless effective field theory provides an efficient framework for the extrapolation of nuclear spectra and matrix elements calculated at finite volume in lattice QCD to infinite volume, and to nuclei with larger atomic number. In this work, it is demonstrated how this framework may be implemented via a set of correlated Gaussian wavefunctions optimised using differentiable programming and via solution of a generalised eigenvalue problem. This approach is shown to be significantly more efficient than a stochastic implementation of the variational method based on the same form of correlated Gaussian wavefunctions, yielding comparably accurate representations of the ground-state wavefunctions with an order of magnitude fewer terms. The efficiency of representation allows such calculations to be extended to larger systems than in previous work. The method is demonstrated through calculations of the binding energies of nuclei with atomic number A∈{2,3,4} in finite volume, matched to lattice QCD calculations at quark masses corresponding to mπ=806 MeV, and infinite-volume effective field theory calculations of A∈{2,3,4,5,6} systems based on this matching.

Strictification and gluing of Lagrangian distributions on derived schemes with shifted symplectic forms
Dennis Borisov, Ludmil Katzarkov, Artan Sheshmani, Shing-Tung Yau
Science Direct Journals, 2024, Volume 438 [ arXiv:1908.00651 ]

Abstract A strictification result is proved for isotropic distributions on derived schemes equipped with negatively shifted homotopically closed 2-forms. It is shown that any derived scheme over ℂ equipped with a −2-shifted symplectic structure, and having a Hausdorff space of classical points, admits a globally defined Lagrangian distribution as a dg ℂ^∞-manifold.

Towards Quantum Simulations in Particle Physics and Beyond on Noisy Intermediate-Scale Quantum Devices
Lena Funcke, Tobias Hartung, Karl Jansen, Stefan Kühn, Manuel Schneider, Paolo Stornati, Xiaoyang Wang
Philosophical Transactions of the Royal Society A [ arXiv:2110.03809 ]

Abstract We review two algorithmic advances that bring us closer to reliable quantum simulations of model systems in high energy physics and beyond on noisy intermediate-scale quantum (NISQ) devices. The first method is the dimensional expressivity analysis of quantum circuits, which allows for constructing minimal but maximally expressive quantum circuits. The second method is an efficient mitigation of readout errors on quantum devices. Both methods can lead to significant improvements in quantum simulations, e.g., when variational quantum eigensolvers are used.

SymmetryGAN: Symmetry Discovery with Deep Learning
Krish Desai, Benjamin Nachman, Jesse Thaler
Physical Review D, 2022, Volume 105, Article 096031 [ arXiv:2112.05722 ]

Abstract What are the symmetries of a dataset? Whereas the symmetries of an individual data element can be characterized by its invariance under various transformations, the symmetries of an ensemble of data elements are ambiguous due to Jacobian factors introduced while changing coordinates. In this paper, we provide a rigorous statistical definition of the symmetries of a dataset, which involves inertial reference densities, in analogy to inertial frames in classical mechanics. We then propose SymmetryGAN as a novel and powerful approach to automatically discover symmetries using a deep learning method based on generative adversarial networks (GANs). When applied to Gaussian examples, SymmetryGAN shows excellent empirical performance, in agreement with expectations from the analytic loss landscape. SymmetryGAN is then applied to simulated dijet events from the Large Hadron Collider (LHC) to demonstrate the potential utility of this method in high energy collider physics applications. Going beyond symmetry discovery, we consider procedures to infer the underlying symmetry group from empirical data.

PQ Axiverse
Mehmet Demirtas, Naomi Gendler, Cody Long, Liam McAllister, Jakob Moritz
Journal of High Energy Physics, 2023, Volume 2023, Article 92 [ arXiv:2112.04503 ]

Abstract We show that the strong CP problem is solved in a large class of compactifications of string theory. The Peccei-Quinn mechanism solves the strong CP problem if the CP-breaking effects of the ultraviolet completion of gravity and of QCD are small compared to the CP-preserving axion potential generated by low-energy QCD instantons. We characterize both classes of effects. To understand quantum gravitational effects, we consider an ensemble of flux compactifications of type IIB string theory on orientifolds of Calabi-Yau hypersurfaces in the geometric regime, taking a simple model of QCD on D7-branes. We show that the D-brane instanton contribution to the neutron electric dipole moment falls exponentially in N^4, with N the number of axions. In particular, this contribution is negligible in all models in our ensemble with N>17. We interpret this result as a consequence of large N effects in the geometry that create hierarchies in instanton actions and also suppress the ultraviolet cutoff. We also compute the CP breaking due to high-energy instantons in QCD. In the absence of vectorlike pairs, we find contributions to the neutron electric dipole moment that are not excluded, but that could be accessible to future experiments if the scale of supersymmetry breaking is sufficiently low. The existence of vectorlike pairs can lead to a larger dipole moment. Finally, we show that a significant fraction of models are allowed by standard cosmological and astrophysical constraints.

Building Quantum Field Theories Out of Neurons
James Halverson
[ arXiv:2112.04527 ]

Abstract An approach to field theory is studied in which fields are comprised of N constituent random neurons. Gaussian theories arise in the infinite-N limit when neurons are independently distributed, via the Central Limit Theorem, while interactions arise due to finite-N effects or non-independently distributed neurons. Euclidean-invariant ensembles of neurons are engineered, with tunable two-point function, yielding families of Euclidean-invariant field theories. Some Gaussian, Euclidean invariant theories are reflection positive, which allows for analytic continuation to a Lorentz-invariant quantum field theory. Examples are presented that yield dual theories at infinite-N, but have different symmetries at finite-N. Landscapes of classical field configurations are determined by local maxima of parameter distributions. Predictions arise from mixed field-neuron correlators. Near-Gaussianity is exhibited at large-N, potentially explaining a feature of field theories in Nature.

Machine Learning in Nuclear Physics
Amber Boehnlein, Markus Diefenthaler, Nobuo Sato, Malachi Schram, Veronique Ziegler, Cristiano Fanelli, Morten Hjorth-Jensen, Tanja Horn, Michelle P. Kuchera, Dean Lee, Witold Nazarewicz, Peter Ostroumov, Kostas Orginos, Alan Poon, Xin-Nian Wang, Alexander Scheinker, Michael S. Smith, and Long-Gang Pang
Reviews of Modern Physics, 2022, Volume 94, Article 031003 [ arXiv:2112.02309 ]

Abstract Advances in machine learning methods provide tools that have broad applicability in scientific research. These techniques are being applied across the diversity of nuclear physics research topics, leading to advances that will facilitate scientific discoveries and societal applications. This Colloquium provides a snapshot of nuclear physics research, which has been transformed by machine learning techniques.

Infinite Neural Network Quantum States
Di Luo, James Halverson
Machine Learning: Science and Technology, 2023, Volume 4, Number 2 [ arXiv:2112.00723 ]

Abstract We study infinite limits of neural network quantum states (∞-NNQS), which exhibit representation power through ensemble statistics, and also tractable gradient descent dynamics. Ensemble averages of Renyi entropies are expressed in terms of neural network correlators, and architectures that exhibit volume-law entanglement are presented. A general framework is developed for studying the gradient descent dynamics of neural network quantum states (NNQS), using a quantum state neural tangent kernel (QS-NTK). For ∞-NNQS the training dynamics is simplified, since the QS-NTK becomes deterministic and constant. An analytic solution is derived for quantum state supervised learning, which allows an ∞-NNQS to recover any target wavefunction. Numerical experiments on finite and infinite NNQS in the transverse field Ising model and Fermi Hubbard model demonstrate excellent agreement with theory. ∞-NNQS opens up new opportunities for studying entanglement and training dynamics in other physics applications, such as in finding ground states.

Quantum reservoir computing using arrays of Rydberg atoms
Rodrigo Araiza Bravo, Khadijeh Najafi, Xun Gao, Susanne F. Yelin
[ arXiv:2111.10956 ]

Abstract Quantum computing promises to provide machine learning with computational advantages. However, noisy intermediate-scale quantum (NISQ) devices pose engineering challenges to realizing quantum machine learning (QML) advantages. Recently, a series of QML computational models inspired by the noise-tolerant dynamics in the brain have emerged as a means to circumvent the hardware limitations of NISQ devices. In this article, we introduce a quantum version of a recurrent neural network (RNN), a well-known model for neural circuits in the brain. Our quantum RNN (qRNN) makes use of the natural Hamiltonian dynamics of an ensemble of interacting spin-1/2 particles as a means for computation. In the limit where the Hamiltonian is diagonal, the qRNN recovers the dynamics of the classical version. Beyond this limit, we observe that the quantum dynamics of the qRNN provide it with quantum computational features that can aid it in computation. To this end, we study a qRNN based on arrays of Rydberg atoms, and show that the qRNN is indeed capable of replicating the learning of several cognitive tasks such as multitasking, decision making, and long-term memory by taking advantage of several key features of this platform such as interatomic species interactions, and quantum many-body scars.

Surrogate- and invariance-boosted contrastive learning for data-scarce applications in science
Charlotte Loh, Thomas Christensen, Rumen Dangovski, Samuel Kim, Marin Soljačić
Nature Communications, 2022, Volume 13, Article 4223 [ arXiv:2110.08406 ]

Abstract Deep learning techniques have been increasingly applied to the natural sciences, e.g., for property prediction and optimization or material discovery. A fundamental ingredient of such approaches is the vast quantity of labelled data needed to train the model; this poses severe challenges in data-scarce settings where obtaining labels requires substantial computational or labor resources. Here, we introduce surrogate- and invariance-boosted contrastive learning (SIB-CL), a deep learning framework which incorporates three "inexpensive" and easily obtainable auxiliary information sources to overcome data scarcity. Specifically, these are: 1) abundant unlabeled data, 2) prior knowledge of symmetries or invariances and 3) surrogate data obtained at near-zero cost. We demonstrate SIB-CL's effectiveness and generality on various scientific problems, e.g., predicting the density-of-states of 2D photonic crystals and solving the 3D time-independent Schrödinger equation. SIB-CL consistently results in orders of magnitude reduction in the number of labels needed to achieve the same network accuracies.

Pruning a restricted Boltzmann machine for quantum state reconstruction
Anna Golubeva, Roger G. Melko
Physical Review B, 2022, Volume 105, Article 125124 [ arXiv:2110.03676 ]

Abstract Restricted Boltzmann machines (RBMs) have proven to be a powerful tool for learning quantum wavefunction representations from qubit projective measurement data. Since the number of classical parameters needed to encode a quantum wavefunction scales rapidly with the number of qubits, the ability to learn efficient representations is of critical importance. In this paper we study magnitude-based pruning as a way to compress the wavefunction representation in an RBM, focusing on RBMs trained on data from the transverse-field Ising model in one dimension. We find that pruning can reduce the total number of RBM weights, but the threshold at which the reconstruction accuracy starts to degrade varies significantly depending on the phase of the model. In a gapped region of the phase diagram, the RBM admits pruning over half of the weights while still accurately reproducing relevant physical observables. At the quantum critical point however, even a small amount of pruning can lead to significant loss of accuracy in the physical properties of the reconstructed quantum state. Our results highlight the importance of tracking all relevant observables as their sensitivity varies strongly with pruning. Finally, we find that sparse RBMs are trainable and discuss how a successful sparsity pattern can be created without pruning.
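
The magnitude-based pruning mentioned in the abstract above can be illustrated with a minimal sketch. This is not the authors' code: the function name `magnitude_prune`, the global (rather than per-layer) ranking, and the toy weight matrix are assumptions made for illustration only. The idea shown is simply to zero out the fraction of weights with the smallest absolute value and keep a mask of the survivors.

```python
import numpy as np

def magnitude_prune(weights, fraction):
    """Zero out the given fraction of entries with the smallest |weight|.

    Hypothetical helper for illustration; returns the pruned weights and
    a boolean mask that is True where a weight was kept.
    """
    if not 0.0 <= fraction <= 1.0:
        raise ValueError("fraction must be in [0, 1]")
    w = np.asarray(weights, dtype=float)
    k = int(round(fraction * w.size))  # number of weights to remove
    # Rank all entries by absolute value (flattened), smallest first.
    order = np.argsort(np.abs(w), axis=None, kind="stable")
    mask = np.ones(w.size, dtype=bool)
    mask[order[:k]] = False            # drop the k smallest magnitudes
    mask = mask.reshape(w.shape)
    return w * mask, mask

# Toy 2x2 weight matrix (assumed values, not from the paper).
W = np.array([[0.1, -2.0],
              [0.5, -0.05]])
W_pruned, kept = magnitude_prune(W, 0.5)
```

In practice the mask would be reapplied after each further training step so that pruned weights stay at zero; how the paper schedules pruning and retraining is not specified in the abstract.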

Classical Shadows for Quantum Process Tomography on Near-term Quantum Computers
Ryan Levy, Di Luo, Bryan K. Clark
Physical Review Research, 2024, Volume 6, Issue 1 [ arXiv:2110.02965 ]

Abstract Quantum process tomography is a powerful tool for understanding quantum channels and characterizing properties of quantum devices. Inspired by recent advances using classical shadows in quantum state tomography [H.-Y. Huang, R. Kueng, and J. Preskill, Nat. Phys. 16, 1050 (2020)], we have developed ShadowQPT, a classical shadow method for quantum process tomography. We introduce two related formulations with and without ancilla qubits. ShadowQPT stochastically reconstructs the Choi matrix of the device allowing for an a posteriori classical evaluation of the device on arbitrary inputs with respect to arbitrary outputs. Using shadows we then show how to compute overlaps, generate all k-weight reduced processes, and perform reconstruction via Hamiltonian learning. These latter two tasks are efficient for large systems as the number of quantum measurements needed scales only logarithmically with the number of qubits. A number of additional approximations and improvements are developed including the use of a pair-factorized Clifford shadow and a series of post-processing techniques which significantly enhance the accuracy for recovering the quantum channel. We have implemented ShadowQPT using both Pauli and Clifford measurements on the IonQ trapped ion quantum computer for quantum processes up to n=4 qubits and achieved good performance.

Deep Set Auto Encoders for Anomaly Detection in Particle Physics
Bryan Ostdiek
SciPost Physics, 2022, Vol. 12, Issue 1 [ arXiv:2109.01695 ]

Abstract There is an increased interest in model agnostic search strategies for physics beyond the standard model at the Large Hadron Collider. We introduce a Deep Set Variational Autoencoder and present results on the Dark Machines Anomaly Score Challenge. We find that the method attains the best anomaly detection ability when there is no decoding step for the network, and the anomaly score is based solely on the representation within the encoded latent space. This method was one of the top-performing models in the Dark Machines Challenge, both for the open data sets as well as the blinded data sets.

Real-time lattice gauge theory actions: unitarity, convergence, and path integral contour deformations
Gurtej Kanwar, Michael L. Wagman
Physical Review D, Volume 104, Article 014513 [ arXiv:2103.02602 ]

Abstract The Wilson action for Euclidean lattice gauge theory defines a positive-definite transfer matrix that corresponds to a unitary lattice gauge theory time-evolution operator if analytically continued to real time. Hoshina, Fujii, and Kikukawa (HFK) recently pointed out that applying the Wilson action discretization to continuum real-time gauge theory does not lead to this, or any other, unitary theory and proposed an alternate real-time lattice gauge theory action that does result in a unitary real-time transfer matrix. The character expansion defining the HFK action is divergent, and in this work we apply a path integral contour deformation to obtain a convergent representation for U(1) HFK path integrals suitable for numerical Monte Carlo calculations. We also introduce a class of real-time lattice gauge theory actions based on analytic continuation of the Euclidean heat-kernel action. Similar divergent sums are involved in defining these actions, but for one action in this class this divergence takes a particularly simple form, allowing construction of a path integral contour deformation that provides absolutely convergent representations for U(1) and SU(N) real-time lattice gauge theory path integrals. We perform proof-of-principle Monte Carlo calculations of real-time U(1) and SU(3) lattice gauge theory and verify that exact results for unitary time evolution of static quark-antiquark pairs in (1 + 1)D are reproduced.

Deep multi-task mining Calabi-Yau four-folds
Harold Erbin, Riccardo Finotello, Robin Schneider, Mohamed Tamaazousti
Machine Learning: Science and Technology, 2021, Volume 3, Number 1 [ arXiv:2108.02221 ]

Abstract We continue earlier efforts in computing the dimensions of tangent space cohomologies of Calabi-Yau manifolds using deep learning. In this paper, we consider the dataset of all Calabi-Yau four-folds constructed as complete intersections in products of projective spaces. Employing neural networks inspired by state-of-the-art computer vision architectures, we improve earlier benchmarks and demonstrate that all four non-trivial Hodge numbers can be learned at the same time using a multi-task architecture. With 30% (80%) training ratio, we reach an accuracy of 100% for h(1,1) and 97% for h(2,1) (100% for both), 81% (96%) for h(3,1), and 49% (83%) for h(2,2). Assuming that the Euler number is known, as it is easy to compute, and taking into account the linear constraint arising from index computations, we get 100% total accuracy.

Nonperturbative renormalization for the neural network–QFT correspondence
Harold Erbin, Vincent Lahoche, Dine Ousmane Samary
Machine Learning: Science and Technology, 2022, Volume 3, Number 1, Article 015027 [ arXiv:2108.01403 ]

Abstract In a recent work [1], Halverson, Maiti and Stoner proposed a description of neural networks in terms of a Wilsonian effective field theory. The infinite-width limit is mapped to a free field theory, while finite N corrections are taken into account by interactions (non-Gaussian terms in the action). In this paper, we study two related aspects of this correspondence. First, we comment on the concepts of locality and power-counting in this context. Indeed, these usual space-time notions may not hold for neural networks (since inputs can be arbitrary); however, the renormalization group provides natural notions of locality and scaling. Moreover, we comment on several subtleties, for example, that data components may not have a permutation symmetry: in that case, we argue that random tensor field theories could provide a natural generalization. Second, we improve the perturbative Wilsonian renormalization from [1] by providing an analysis in terms of the nonperturbative renormalization group using the Wetterich-Morris equation. An important difference with usual nonperturbative RG analysis is that only the effective (IR) 2-point function is known, which requires setting the problem with care. Our aim is to provide a useful formalism to investigate neural network behavior beyond the large-width limit (i.e., far from the Gaussian limit) in a nonperturbative fashion. A major result of our analysis is that changing the standard deviation of the neural network weight distribution can be interpreted as a renormalization flow in the space of networks. We focus on translation-invariant kernels and provide preliminary numerical results.

Neural Conditional Reweighting
Benjamin Nachman, Jesse Thaler
Physical Review D, Volume 105, Article 076015 [ arXiv:2107.08979 ]

Abstract There is a growing use of neural network classifiers as unbinned, high-dimensional (and variable-dimensional) reweighting functions. To date, the focus has been on marginal reweighting, where a subset of features are used for reweighting while all other features are integrated over. There are some situations, though, where it is preferable to condition on auxiliary features instead of marginalizing over them. In this paper, we introduce neural conditional reweighting, which extends neural marginal reweighting to the conditional case. This approach is particularly relevant in high-energy physics experiments for reweighting detector effects conditioned on particle-level truth information. We leverage a custom loss function that not only allows us to achieve neural conditional reweighting through a single training procedure, but also yields sensible interpolation even in the presence of phase space holes. As a specific example, we apply neural conditional reweighting to the energy response of high-energy jets, which could be used to improve the modeling of physics objects in parametrized fast simulation packages.
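
As background for the abstract above, the standard way a classifier output is turned into a reweighting function can be sketched as follows. This shows only the marginal case (the likelihood-ratio trick), not the conditional loss introduced in the paper. It assumes a classifier trained with balanced classes so that its output f(x) approximates p_target(x) / (p_source(x) + p_target(x)). The function names and the Gaussian toy densities are hypothetical, and an analytic "ideal classifier" stands in for a trained network.

```python
import numpy as np

def gaussian_pdf(x, mean, sigma=1.0):
    """Normal density, used here only to build a toy example."""
    return np.exp(-0.5 * ((x - mean) / sigma) ** 2) / (sigma * np.sqrt(2.0 * np.pi))

def weights_from_classifier(f, eps=1e-12):
    """Convert classifier outputs f(x) in (0, 1) to weights f / (1 - f).

    For an ideal balanced classifier this equals p_target(x) / p_source(x).
    The clipping guards against division by zero for saturated outputs.
    """
    f = np.clip(np.asarray(f, dtype=float), eps, 1.0 - eps)
    return f / (1.0 - f)

# Toy setup (assumed): source ~ N(0, 1), target ~ N(0.5, 1).
x = np.linspace(-3.0, 3.0, 13)
p_source = gaussian_pdf(x, 0.0)
p_target = gaussian_pdf(x, 0.5)

# Analytic stand-in for a perfectly trained classifier.
f_ideal = p_target / (p_source + p_target)
w = weights_from_classifier(f_ideal)
```

Applying `w` to source events makes their distribution in x match the target; the conditional variant described in the abstract additionally holds auxiliary features fixed rather than integrating over them.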

Flow-based sampling for multimodal distributions in lattice field theory
Daniel C. Hackett, Chung-Chun Hsieh, Michael S. Albergo, Denis Boyda, Jiunn-Wei Chen, Kai-Feng Chen, Kyle Cranmer, Gurtej Kanwar, Phiala E. Shanahan
[ arXiv:2107.00734 ]

Abstract Recent results have demonstrated that samplers constructed with flow-based generative models are a promising new approach for configuration generation in lattice field theory. In this paper, we present a set of methods to construct flow models for targets with multiple separated modes (i.e. theories with multiple vacua). We demonstrate the application of these methods to modeling two-dimensional real scalar field theory in its symmetry-broken phase. In this context we investigate the performance of different flow-based sampling algorithms, including a composite sampling algorithm where flow-based proposals are occasionally augmented by applying updates using traditional algorithms like HMC.

Single electrons on solid neon as a solid-state qubit platform
Xianjing Zhou, Gerwin Koolstra, Xufeng Zhang, Ge Yang, Xu Han, Brennan Dizdar, Ralu Divan, Wei Guo, Kater W. Murch, David I. Schuster, Dafei Jin
Nature, 2022, Volume 605, Pages 46-50 [ arXiv:2106.10326 ]

Abstract Progress toward the realization of quantum computers requires persistent advances in their constituent building blocks - qubits. Novel qubit platforms that simultaneously embody long coherence, fast operation, and large scalability offer compelling advantages in the construction of quantum computers and many other quantum information systems. Electrons, ubiquitous elementary particles of nonzero charge, spin, and mass, have commonly been perceived as paradigmatic local quantum information carriers. Despite superior controllability and configurability, their practical performance as qubits via either motional or spin states depends critically on their material environment. Here we report our experimental realization of a new qubit platform based upon isolated single electrons trapped on an ultraclean solid neon surface in vacuum. By integrating an electron trap in a circuit quantum electrodynamics architecture, we achieve strong coupling between the motional states of a single electron and a single microwave photon in an on-chip superconducting resonator. Qubit gate operations and dispersive readout are implemented to measure the energy relaxation time T1 of 15 μs and phase coherence time T2 over 200 ns. These results indicate that the electron-on-solid-neon qubit already performs near the state of the art as a charge qubit.

Flow-based sampling for fermionic lattice field theories
Michael S. Albergo, Gurtej Kanwar, Sébastien Racanière, Danilo J. Rezende, Julian M. Urban, Denis Boyda, Kyle Cranmer, Daniel C. Hackett, Phiala E. Shanahan
Physical Review D, 2021, Vol. 104, Iss. 11 [ arXiv:2106.05934 ]

Abstract Algorithms based on normalizing flows are emerging as promising machine learning approaches to sampling complicated probability distributions in a way that can be made asymptotically exact. In the context of lattice field theory, proof-of-principle studies have demonstrated the effectiveness of this approach for scalar theories, gauge theories, and statistical systems. This work develops approaches that enable flow-based sampling of theories with dynamical fermions, which is necessary for the technique to be applied to lattice field theory studies of the Standard Model of particle physics and many condensed matter systems. As a practical demonstration, these methods are applied to the sampling of field configurations for a two-dimensional theory of massless staggered fermions coupled to a scalar field via a Yukawa interaction.

Symmetry-via-Duality: Invariant Neural Network Densities from Parameter-Space Correlators
Anindita Maiti, Keegan Stoner, James Halverson
[ arXiv:2106.00694 ]

Abstract Parameter-space and function-space provide two different duality frames in which to study neural networks. We demonstrate that symmetries of network densities may be determined via dual computations of network correlation functions, even when the density is unknown and the network is not equivariant. Symmetry-via-duality relies on invariance properties of the correlation functions, which stem from the choice of network parameter distributions. Input and output symmetries of neural network densities are determined, which recover known Gaussian process results in the infinite width limit. The mechanism may also be utilized to determine symmetries during training, when parameters are correlated, as well as symmetries of the Neural Tangent Kernel. We demonstrate that the amount of symmetry in the initialization density affects the accuracy of networks trained on Fashion-MNIST, and that symmetry breaking helps only when it is in the direction of ground truth.

Preserving New Physics while Simultaneously Unfolding All Observables
Patrick Komiske, W. Patrick McCormack, Benjamin Nachman
Physical Review D, Volume 104, Article 076027 [ arXiv:2105.09923 ]

Abstract Direct searches for new particles at colliders have traditionally been factorized into model proposals by theorists and model testing by experimentalists. With the recent advent of machine learning methods that allow for the simultaneous unfolding of all observables in a given phase space region, there is a new opportunity to blur these traditional boundaries by performing searches on unfolded data. This could facilitate a research program where data are explored in their natural high dimensionality with as little model bias as possible. We study how the information about physics beyond the Standard Model is preserved by full phase space unfolding using an important physics target at the Large Hadron Collider (LHC): exotic Higgs boson decays involving hadronic final states. We find that if the signal cross section is high enough, information about the new physics is visible in the unfolded data. We will show that in some cases, quantifiably all of the information about the new physics is encoded in the unfolded data. Finally, we show that there are still many cases when the unfolding does not work fully or precisely, such as when the signal cross section is small. This study will serve as an important benchmark for enhancing unfolding methods for the LHC and beyond.

Modern Machine Learning and Particle Physics
Matthew D. Schwartz
Harvard Data Science Review, 2021, Issue 3.2, 13 May [ arXiv:2103.12226 ]

Abstract Over the past five years, modern machine learning has been quietly revolutionizing particle physics. Old methodology is being outdated and entirely new ways of thinking about data are becoming commonplace. This article will review some aspects of the natural synergy between modern machine learning and particle physics, focusing on applications at the Large Hadron Collider. A sampling of examples is given, from signal/background discrimination tasks using supervised learning to direct data-driven approaches. Some comments on persistent challenges and possible future directions for the field are included at the end.

Topological obstructions to autoencoding
Joshua Batson, C. Grace Haaf, Yonatan Kahn, Daniel A. Roberts
Journal of High Energy Physics, 2021, Issue 4, Article 280 [ arXiv:2102.08380 ]

Abstract Autoencoders have been proposed as a powerful tool for model-independent anomaly detection in high-energy physics. The operating principle is that events which do not belong to the space of training data will be reconstructed poorly, thus flagging them as anomalies. We point out that in a variety of examples of interest, the connection between large reconstruction error and anomalies is not so clear. In particular, for data sets with nontrivial topology, there will always be points that erroneously seem anomalous due to global issues. Conversely, neural networks typically have an inductive bias or prior to locally interpolate such that undersampled or rare events may be reconstructed with small error, despite actually being the desired anomalies. Taken together, these facts are in tension with the simple picture of the autoencoder as an anomaly detector. Using a series of illustrative low-dimensional examples, we show explicitly how the intrinsic and extrinsic topology of the dataset affects the behavior of an autoencoder and how this topology is manifested in the latent space representation during training. We ground this analysis in the discussion of a mock "bump hunt" in which the autoencoder fails to identify an anomalous "signal" for reasons tied to the intrinsic topology of n-particle phase space.

Few-nucleon matrix elements in pionless effective field theory in a finite volume
W. Detmold and P. E. Shanahan
Physical Review D, Volume 103, Article 074503 [ arXiv:2102.04329 ]

Abstract Pionless effective field theory in a finite volume (FVEFTπ/) is investigated as a framework for the analysis of multi-nucleon spectra and matrix elements calculated in lattice QCD (LQCD). By combining FVEFTπ/ with the stochastic variational method, the spectra of nuclei with atomic number A∈{2,3} are matched to existing finite-volume LQCD calculations at heavier-than-physical quark masses corresponding to a pion mass mπ=806 MeV, thereby enabling infinite-volume binding energies to be determined using infinite-volume variational calculations. Based on the variational wavefunctions that are constructed in this approach, the finite-volume matrix elements of various local operators are computed in FVEFTπ/ and matched to LQCD calculations of the corresponding QCD operators in the same volume, thereby determining the relevant one and two-body EFT counterterms and enabling an extrapolation of the LQCD matrix elements to infinite volume. As examples, the scalar, tensor, and axial matrix elements are considered, as well as the magnetic moments and the isovector longitudinal momentum fraction.

Path integral contour deformations for observables in SU(N) gauge theory
William Detmold, Gurtej Kanwar, Henry Lamm, Michael L. Wagman, Neill C. Warrington
Physical Review D, 2021, Vol. 103, Issue 9, Article 094517 [ arXiv:2101.12668 ]

Abstract Path integral contour deformations have been shown to mitigate sign and signal-to-noise problems associated with phase fluctuations in lattice field theories. We define a family of contour deformations applicable to SU(N) lattice gauge theory that can reduce sign and signal-to-noise problems associated with complex actions and complex observables. For observables, these contours can be used to define deformed observables with identical expectation value but different variance. As a proof-of-principle, we apply machine learning techniques to optimize the deformed observables associated with Wilson loops in two dimensional SU(2) and SU(3) gauge theory. We study loops consisting of up to 64 plaquettes and achieve variance reduction of up to 4 orders of magnitude.

Introduction to Normalizing Flows for Lattice Field Theory
Michael S. Albergo, Denis Boyda, Daniel C. Hackett, Gurtej Kanwar, Kyle Cranmer, Sébastien Racanière, Danilo Jimenez Rezende, and Phiala E. Shanahan
[ arXiv:2101.08176 ]

Abstract This notebook tutorial demonstrates a method for sampling Boltzmann distributions of lattice field theories using a class of machine learning models known as normalizing flows. The ideas and approaches proposed in arXiv:1904.12072, arXiv:2002.02428, and arXiv:2003.06413 are reviewed and a concrete implementation of the framework is presented. We apply this framework to a lattice scalar field theory and to U(1) gauge theory, explicitly encoding gauge symmetries in the flow-based approach to the latter. This presentation is intended to be interactive and working with the attached Jupyter notebook is recommended.

Learning to Unknot
Sergei Gukov, James Halverson, Fabian Ruehle, and Piotr Sułkowski
Machine Learning: Science and Technology, 2021, Volume 2, Number 2, Article 025035 [ arXiv:2010.16263 ]

Abstract We introduce natural language processing into the study of knot theory, as made natural by the braid word representation of knots. We study the UNKNOT problem of determining whether or not a given knot is the unknot. After describing an algorithm to randomly generate $N$-crossing braids and their knot closures and discussing the induced prior on the distribution of knots, we apply binary classification to the UNKNOT decision problem. We find that the Reformer and shared-QK Transformer network architectures outperform fully-connected networks, though all perform well. Perhaps surprisingly, we find that accuracy increases with the length of the braid word, and that the networks learn a direct correlation between the confidence of their predictions and the degree of the Jones polynomial. Finally, we utilize reinforcement learning (RL) to find sequences of Markov moves and braid relations that simplify knots and can identify unknots by explicitly giving the sequence of unknotting actions. Trust region policy optimization (TRPO) performs consistently well for a wide range of crossing numbers and thoroughly outperforms other RL algorithms and random walkers. Studying these actions, we find that braid relations are more useful in simplifying to the unknot than one of the Markov moves.
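
To make the braid word representation concrete, here is a minimal sketch of how an N-crossing braid word might be encoded and randomly generated. It is an assumption-laden illustration, not the algorithm of the paper: a word is taken to be a list of nonzero integers, where +i stands for the generator sigma_i and -i for its inverse, and each letter is drawn uniformly. The functions `random_braid_word` and `free_reduce` are hypothetical names. The only simplification shown is cancelling adjacent inverse pairs; the braid relations and Markov moves mentioned in the abstract are not implemented.

```python
import random


def random_braid_word(n_strands, n_crossings, seed=None):
    """Draw a uniformly random word in the generators of the braid group.

    Assumed encoding: +i is sigma_i, -i is its inverse, 1 <= i < n_strands.
    """
    if n_strands < 2:
        raise ValueError("a braid needs at least two strands")
    rng = random.Random(seed)
    return [
        rng.choice((1, -1)) * rng.randint(1, n_strands - 1)
        for _ in range(n_crossings)
    ]


def free_reduce(word):
    """Cancel adjacent pairs sigma_i sigma_i^{-1} until none remain."""
    reduced = []
    for generator in word:
        if reduced and reduced[-1] == -generator:
            reduced.pop()  # the new letter undoes the previous one
        else:
            reduced.append(generator)
    return reduced


word = random_braid_word(n_strands=4, n_crossings=12, seed=1)
print("braid word:  ", word)
print("free-reduced:", free_reduce(word))
```

A token sequence of this kind is what lets sequence models be applied to knots. Note that uniform sampling of letters induces a non-uniform prior over knots, which is the effect the abstract refers to.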

Elliptic stable envelopes and hypertoric loop spaces
Michael McBreen, Artan Sheshmani, Shing-Tung Yau
Selecta Mathematica, 2023, Volume 29, Article number 73 [ arXiv:2010.0067 ]

Abstract This paper relates the elliptic stable envelopes of a hypertoric variety X with the K-theoretic stable envelopes of the loop hypertoric space, $\widetilde{\mathscr{L}}X$. It thus points to a possible categorification of elliptic stable envelopes.