IAIFI Experimental Physics Papers

Experimental Physics

GW-YOLO: Multi-transient segmentation in LIGO using computer vision
Siddharth Soni, Nikhil Mukund, Erik Katsavounidis
[ arXiv:2508.17399 ]

Abstract

Time series data and their time-frequency representation from gravitational-wave interferometers present multiple opportunities for the use of artificial intelligence methods associated with signal and image processing. Closely connected with this is the real-time aspect associated with gravitational-wave interferometers and the astrophysical observations they perform; the discovery potential of these instruments can be significantly enhanced when data processing can be achieved in O(1s) timescales. In this work, we introduce a novel signal and noise identification tool based on the YOLO (You Only Look Once) object detection framework. For its application into gravitational waves, we will refer to it as GW-YOLO. This tool can provide scene identification capabilities and essential information regarding whether an observed transient is any combination of noise and signal. Additionally, it supplies detailed time-frequency coordinates of the detected objects in the form of pixel masks, an essential property that can be used to understand and characterize astrophysical sources, as well as instrumental noise. The simultaneous identification of noise and signal, combined with precise pixel-level localization, represents a significant advancement in gravitational-wave data analysis. Our approach yields a 50% detection efficiency for binary black hole signals at a signal-to-noise ratio (SNR) of 15 when such signals overlap with transient noise artifacts. When noise artifacts overlap with binary neutron star signals, our algorithm attains 50% detection efficiency at an SNR of 30. This presents the first quantitative assessment of the ability to detect astrophysical events overlapping with realistic, instrument noise present in gravitational-wave interferometers.

Frequentist Uncertainties on Neural Density Ratios with wifi Ensembles
Sean Benevedes, Jesse Thaler
[ arXiv:2506.00113 | code ]

Abstract

We introduce wifi ensembles as a novel framework to obtain asymptotic frequentist uncertainties on density ratios, with a particular focus on neural ratio estimation in the context of high-energy physics. When the density ratio of interest is a likelihood ratio conditioned on parameters, wifi ensembles can be used to perform simulation-based inference on those parameters. After training the basis functions f_i(x), uncertainties on the weights w_i can be straightforwardly propagated to the estimated parameters without requiring extraneous bootstraps. To demonstrate this approach, we present an application in quantum chromodynamics at the Large Hadron Collider, using wifi ensembles to estimate the likelihood ratio between generated quark and gluon jets. We use this learned likelihood ratio to estimate the quark fraction in a synthetic mixed quark/gluon sample, showing that the resultant uncertainties empirically satisfy the desired coverage properties.

Fast Low Energy Reconstruction using Convolutional Neural Networks
IceCube Collaboration
[ arXiv:2505.16777 ]

Abstract

IceCube is a Cherenkov detector instrumenting over a cubic kilometer of glacial ice deep under the surface of the South Pole. The DeepCore sub-detector lowers the detection energy threshold to a few GeV, enabling the precise measurements of neutrino oscillation parameters with atmospheric neutrinos. The reconstruction of neutrino interactions inside the detector is essential in studying neutrino oscillations. It is particularly challenging to reconstruct sub-100 GeV events with the IceCube detectors due to the relatively sparse detection units and detection medium. Convolutional neural networks (CNNs) are broadly used in physics experiments for both classification and regression purposes. This paper discusses the CNNs developed and employed for the latest IceCube-DeepCore oscillation measurements. These CNNs estimate various properties of the detected neutrinos, such as their energy, direction of arrival, interaction vertex position, flavor-related signature, and are also used for background classification.

Interpretable Artificial Intelligence for Topological Photonics
Ali Ghorashi, Sachin Vaidya, Ziming Liu, Charlotte Loh, Thomas Christensen, Max Tegmark, Marin Soljačić
[ arXiv:2505.10485 ]

Abstract

Topological photonic crystals (PhCs) offer robust, disorder-resistant modes engendered by nontrivial band symmetries, but designing PhCs with prescribed topological band properties remains a challenge due to the complexity involved in mapping the continuous real-space design landscape of photonic crystals to the discrete output space of band topology. Here, we introduce a new approach to address this problem, employing Kolmogorov--Arnold networks (KANs) to predict and inversely design the band symmetries of two-dimensional PhCs with two-fold rotational (C2) symmetry. We show that a single-hidden-layer KAN, trained on a dataset of C2-symmetric unit cells, achieves 99% accuracy in classifying the symmetry eigenvalues of the lowest transverse-magnetic band. For interpretability, we use symbolic regression to extract eight algebraic formulas that characterize the band symmetry classes in terms of the Fourier components of the dielectric function. These formulas not only retain the full predictive power of the network but also enable deterministic inverse design. Applying them, we generate 2,000 photonic crystals with target band symmetries, achieving over 92% accuracy even for high-contrast, experimentally realizable structures beyond the training domain.

AI-Driven Robotics for Free-Space Optics
Shiekh Zia Uddin, Sachin Vaidya, Shrish Choudhary, Zhuo Chen, Raafat K. Salib, Luke Huang, Dirk R. Englund, Marin Soljačić
[ arXiv:2505.17985 ]

Abstract

Tabletop optical experiments are foundational to research in many areas of science, including photonics, quantum optics, materials science, metrology, and biomedical imaging. However these experiments remain fundamentally reliant on manual design, assembly, and alignment, limiting throughput and reproducibility. Optics currently lacks generalizable robotic systems capable of operating across a diverse range of setups in realistic laboratory environments. Here we present OptoMate, an autonomous platform that integrates generative AI, computer vision, and precision robotics to enable automation of free-space optics experiments. Our platform interprets user-defined goals to generate valid optical setups using a fine-tuned large language model (LLM), assembles the setup via robotic pick-and-place with sub-millimeter accuracy, and performs fine alignment using a robot-deployable tool. The system then executes a range of automated measurements, including laser beam characterization, polarization mapping, and spectroscopy tasks. This work demonstrates the first flexible, AI-driven automation platform for optics, offering a path toward remote operation, cloud labs, and high-throughput discovery in the optical sciences.

Anomaly preserving contrastive neural embeddings for end-to-end model-independent searches at the LHC
Kyle Metzger, Lana Xu, Mia Sodini, Thea K. Arrestad, Katya Govorkova, Gaia Grosso, Philip Harris
[ arXiv:2502.15926 ]

Abstract

Anomaly detection -- identifying deviations from Standard Model predictions -- is a key challenge at the Large Hadron Collider due to the size and complexity of its datasets. This is typically addressed by transforming high-dimensional detector data into lower-dimensional, physically meaningful features. We tackle feature extraction for anomaly detection by learning powerful low-dimensional representations via contrastive neural embeddings. This approach preserves potential anomalies indicative of new physics and enables rare signal extraction using novel machine learning-based statistical methods for signal-independent hypothesis testing. We compare supervised and self-supervised contrastive learning methods, for both MLP- and Transformer-based neural embeddings, trained on the kinematic observables of physics objects in LHC collision events. The learned embeddings serve as input representations for signal-agnostic statistical detection methods in inclusive final states, achieving over ten fold improved detection performance over the original feature representation and up to four fold improvement over using a physics-informed selections of the same dimensionality. We achieve significant improvement in discovery power for both rare new physics signals and rare Standard Model processes across diverse final states, demonstrating its applicability for efficiently searching for diverse signals simultaneously. We show that the optimal representation for background classification does not always maximize sensitivity to new physics signals, revealing an inherent trade-off between background structure preservation and anomaly enhancement. Our findings demonstrate that foundation models for particle physics data hold significant potential for improving neural feature extraction, enabling scientific discovery in inclusive final states at collider experiments.

Isolating Unisolated Upsilons with Anomaly Detection in CMS Open Data
Rikab Gambhir, Radha Mastandrea, Benjamin Nachman, Jesse Thaler
Physical Review Letters, 2025, Volume 135, Issue 2 [ arXiv:2502.14036 | code ]

Abstract

We present the first study of anti-isolated Upsilon decays to two muons (Υ→μ+μ−) in proton-proton collisions at the Large Hadron Collider. Using a machine learning (ML)-based anomaly detection strategy, we 'rediscover' the Υ in 13 TeV CMS Open Data from 2016, despite overwhelming anti-isolated backgrounds. We elevate the signal significance to 6.4σ using these methods, starting from 1.6σ using the dimuon mass spectrum alone. Moreover, we demonstrate improved sensitivity from using an ML-based estimate of the multi-feature likelihood compared to traditional 'cut-and-count' methods. Our work demonstrates that it is possible and practical to find real signals in experimental collider data using ML-based anomaly detection, and we distill a readily-accessible benchmark dataset from the CMS Open Data to facilitate future anomaly detection developments.

Lake- and Surface-Based Detectors for Forward Neutrino Physics
Nicholas W. Kamp, Carlos A. Argüelles, Albrecht Karle, Jennifer Thomas, Tianlu Yuan
[ arXiv:2501.08278 ]

Abstract

We propose two medium-baseline, kiloton-scale neutrino experiments to study neutrinos from LHC proton-proton collisions: SINE, a surface-based scintillator panel detector observing muon neutrinos from the CMS interaction point, and UNDINE, a water Cherenkov detector submerged in lake Geneva observing all-flavor neutrinos from LHCb. Using a Monte Carlo simulation, we estimate millions of neutrino interactions during the high-luminosity LHC era. We show that these datasets can constrain neutrino cross sections, charm production in pp collisions, and strangeness enhancement as a solution to the cosmic-ray muon puzzle. SINE and UNDINE thus offer a cost-effective medium-baseline complement to the proposed short-baseline forward physics facility.

Robust resonant anomaly detection with NPLM
Gaia Grosso, Debajyoti Sengupta, Tobias Golling, Philip Harris
[ arXiv:2501.01778 ]

Abstract

In this study, we investigate the application of the New Physics Learning Machine (NPLM) algorithm as an alternative to the standard CWoLa method with Boosted Decision Trees (BDTs), particularly for scenarios with rare signal events. NPLM offers an end-to-end approach to anomaly detection and hypothesis testing by utilizing an in-sample evaluation of a binary classifier to estimate a log-density ratio, which can improve detection performance without prior assumptions on the signal model. We examine two approaches: (1) a end-to-end NPLM application in cases with reliable background modelling and (2) an NPLM-based classifier used for signal selection when accurate background modelling is unavailable, with subsequent performance enhancement through a hyper-test on multiple values of the selection threshold. Our findings show that NPLM-based methods outperform BDT-based approaches in detection performance, particularly in low signal injection scenarios, while significantly reducing epistemic variance due to hyperparameter choices. This work highlights the potential of NPLM for robust resonant anomaly detection in particle physics, setting a foundation for future methods that enhance sensitivity and consistency under signal variability.

Constraints and Sensitivities for Dipole-Portal Heavy Neutral Leptons from ND280 and its Upgrade
Ming-Shau Liu, Nicholas Kamp, Carlos A. Argüelles
[ arXiv:2412.15051 ]

Abstract

We report new constraints and sensitivities to heavy neutral leptons (HNLs) with transition magnetic moments, also known as dipole-portal HNLs. This is accomplished using data from the T2K ND280 near detector in addition to the projected three-year dataset of the upgraded ND280 detector. Dipole-portal HNLs have been extensively studied in the literature and offer a potential explanation for the 4.8σ MiniBooNE anomaly. To perform our analysis, we simulate HNL decays to e+e− pairs in the gaseous time projection chambers of the ND280 detector and its upgrade. Recasting an ND280 search for mass-mixed HNLs, we find that ND280 data places world-leading constraints on dipole-portal HNLs in the 390-743 MeV mass range, disfavoring the region of parameter space favored by the MiniBooNE anomaly. The addition of three years of ND280 upgrade data will be able to disfavor the MiniBooNE solution at the 5σ confidence level and extend the world-leading constraints to dipole-portal HNLs in the 148-860 MeV mass range. Our analysis suggests that ND280 data excludes dipole-portal HNLs as a solution to the MiniBooNE excess, motivating a dedicated search within the T2K collaboration and potentially highlighting the need for alternative explanations for the MiniBooNE anomaly.

The Neutrino Slice at Muon Colliders
Luc Bojorquez-Lopez, Matheus Hostert, Carlos A. Argüelles, Zhen Liu
[ arXiv:2412.14115 | code ]

Abstract

Muon colliders provide an exciting new path pushing forward the energy frontier of particle physics. We point out a new use of these facilities for neutrino physics and beyond the Standard Model physics using their main detectors. Muon decays along the main accelerator rings induce an intense, highly collimated beam of neutrinos. As this beam crosses a thin slice of the kt-scale detector, it would induce unprecedented numbers of neutrino interactions, with O(104) events per second for a 10 TeV µ+µ− collider. We characterize these events, showing that they are highly energetic and possess a distinct timing signature with a large transverse displacement. We discuss promising applications of these events for instrumentation, electroweak, and beyond-the-Standard Model physics. For instance, we show that a sub-percent measurement of the neutrino-electron scattering rate enables new precision measurements of the Weak angle and a novel detection of the neutrino charge radius.

Product Manifold Machine Learning for Physics
Nathaniel S. Woodward, Sang Eon Park, Gaia Grosso, Jeffrey Krupa, Philip Harris
[ arXiv:2412.07033 ]

Abstract

Physical data are representations of the fundamental laws governing the Universe, hiding complex compositional structures often well captured by hierarchical graphs. Hyperbolic spaces are endowed with a non-Euclidean geometry that naturally embeds those structures. To leverage the benefits of non-Euclidean geometries in representing natural data we develop machine learning on PM spaces, Cartesian products of constant curvature Riemannian manifolds. As a use case we consider the classification of “jets”, sprays of hadrons and other subatomic particles produced by the hadronization of quarks and gluons in collider experiments. We compare the performance of PM-MLP and PM-Transformer models across several possible representations. Our experiments show that PM representations generally perform equal or better to fully Euclidean models of similar size, with the most significant gains found for highly hierarchical jets and small models. We discover significant correlation between the degree of hierarchical structure at a per-jet level and classification performance with the PM-Transformer in top tagging benchmarks. This is a promising result highlighting a potential direction for further improving machine learning model performance through tailoring geometric representation at a per-sample level in hierarchical datasets. These results reinforce the view of geometric representation as a key parameter in maximizing both performance and efficiency of machine learning on natural data.

SymbolFit: Automatic Parametric Modeling with Symbolic Regression
Ho Fung Tsoi, Dylan Rankin, Cecile Caillol, Miles Cranmer, Sridhara Dasu, Javier Duarte, Philip Harris, Elliot Lipeles, Vladimir Loncar
[ arXiv:2411.09851 ]

Abstract

We introduce SymbolFit, a framework that automates parametric modeling by using symbolic regression to perform a machine-search for functions that fit the data while simultaneously providing uncertainty estimates in a single run. Traditionally, constructing a parametric model to accurately describe binned data has been a manual and iterative process, requiring an adequate functional form to be determined before the fit can be performed. The main challenge arises when the appropriate functional forms cannot be derived from first principles, especially when there is no underlying true closed-form function for the distribution. In this work, we develop a framework that automates and streamlines the process by utilizing symbolic regression, a machine learning technique that explores a vast space of candidate functions without requiring a predefined functional form because the functional form itself is treated as a trainable parameter, making the process far more efficient and effortless than traditional regression methods. We demonstrate the framework in high-energy physics experiments at the CERN Large Hadron Collider (LHC) using five real proton-proton collision datasets from new physics searches, including background modeling in resonance searches for high-mass dijet, trijet, paired-dijet, diphoton, and dimuon events. We show that our framework can flexibly and efficiently generate a wide range of candidate functions that fit a nontrivial distribution well using a simple fit configuration that varies only by random seed, and that the same fit configuration, which defines a vast function space, can also be applied to distributions of different shapes, whereas achieving a comparable result with traditional methods would have required extensive manual effort.

A Lorentz-Equivariant Transformer for All of the LHC
Johann Brehmer, Víctor Bresó, Pim de Haan, Tilman Plehn, Huilin Qu, Jonas Spinner, Jesse Thaler
[ arXiv:2411.00446 | code ]

Abstract

We show that the Lorentz-Equivariant Geometric Algebra Transformer (L-GATr) yields state-of-the-art performance for a wide range of machine learning tasks at the Large Hadron Collider. L-GATr represents data in a geometric algebra over space-time and is equivariant under Lorentz transformations. The underlying architecture is a versatile and scalable transformer, which is able to break symmetries if needed. We demonstrate the power of L-GATr for amplitude regression and jet classification, and then benchmark it as the first Lorentz-equivariant generative network. For all three LHC tasks, we find significant improvements over previous architectures.

Learning Efficient Representations of Neutrino Telescope Events
Felix J. Yu, Nicholas Kamp, Carlos A. Argüelles
[ arXiv:2410.13148 | code ]

Abstract

Neutrino telescopes detect rare interactions of particles produced in some of the most extreme environments in the Universe. This is accomplished by instrumenting a cubic-kilometer volume of naturally occurring transparent medium with light sensors. Given their substantial size and the high frequency of background interactions, these telescopes amass an enormous quantity of large variance, high-dimensional data. These attributes create substantial challenges for analyzing and reconstructing interactions, particularly when utilizing machine learning (ML) techniques. In this paper, we present a novel approach, called om2vec, that employs transformer-based variational autoencoders to efficiently represent neutrino telescope events by learning compact and descriptive latent representations. We demonstrate that these latent representations offer enhanced flexibility and improved computational efficiency, thereby facilitating downstream tasks in data analysis.

Optimal Quantum Purity Amplification
Zhaoyi Li, Honghao Fu, Takuya Isogawa, Isaac Chuang
[ arXiv:2409.18167 | code ]

Abstract

Quantum purity amplification (QPA) offers a novel approach to counteracting the pervasive noise that degrades quantum states. We present the optimal QPA protocol for general quantum systems against global depolarizing noise, which has remained unknown for two decades. We construct and prove the optimality of our protocol, which demonstrates improved fidelity scaling compared to the best-known methods. We explore the operational interpretation of the protocol and provide simple examples of how to compile it into efficient circuits for near-term experiments. Furthermore, we conduct numerical simulations to investigate the effectiveness of our protocol in the quantum simulation of Hamiltonian evolution, demonstrating its ability to enhance fidelity even under circuit-level noise. Our findings suggest that QPA could improve the performance of quantum information processing tasks, particularly in the context of Noisy Intermediate-Scale Quantum (NISQ) devices, where reducing the effect of noise with limited resources is critical.

Double ‘acct’: a distinct double-peaked supernova matching pulsational pair-instability models
C. R. Angus, S. E. Woosley, R. J. Foley, M. Nicholl, V. A. Villar, K. Taggart, M. Pursiainen, P. Ramsden, S. Srivastav, H. F. Stevance, T. Moore, K. Auchettl, W. B. Hoogendam, N. Khetan, S. K. Yadavalli, G. Dimitriadis, A. Gagliano, M. R. Siebert, A. Aamer, T. de Boer, K. C. Chambers, A. Clocchiatti, D. A. Coulter, M. R. Drout, D. Farias, M. D. Fulton, C. Gall, H. Gao, L. Izzo, D. O. Jones, C.-C. Lin, E. A. Magnier, G. Narayan, E. Ramirez-Ruiz, C. L. Ransome, A. Rest, S. J. Smartt, K. W. Smith
[ arXiv:2409.02174 ]

Abstract

We present multi-wavelength data of SN2020acct, a double-peaked stripped-envelope supernova (SN) in NGC2981 at ~150 Mpc. The two peaks are temporally distinct, with maxima separated by 58 rest-frame days, and a factor of 20 reduction in flux between. The first is luminous (Mr = -18.00 ± 0.02 mag), blue (g - r = 0.27 ± 0.03 mag), and displays spectroscopic signatures of interaction with hydrogen-free circumstellar material. The second peak is fainter (Mr = -17.29 ± 0.03 mag), and spectroscopically similar to an evolved stripped-envelope SNe, with strong blended forbidden [Ca II] and [O II] features. No other known double-peak SN exhibits a light curve similar to that of SN 2020acct. We find the likelihood of two individual SNe occurring in the same star-forming region within that time to be highly improbable, while an implausibly fine-tuned configuration would be required to produce two SNe from a single binary system. We find that the peculiar properties of SN2020acct match models of pulsational pair instability (PPI), in which the initial peak is produced by collisions of shells of ejected material, shortly followed by a terminal explosion. Pulsations from a star with a 72 M⊙ helium core provide an excellent match to the double-peaked light curve. The local galactic environment has a metallicity of 0.4 Z⊙, a level where massive single stars are not expected retain enough mass to encounter the PPI. However, late binary mergers or a low-metallicity pocket may allow the required core mass. We measure the rate of SN 2020acct-like events to be <3.3×10−8 Mpc−3 yr−1 at z = 0.07, or <0.1% of the total core-collapse SN rate.

SN 2021foa: The ‘Flip-Flop’ Type IIn / Ibn supernova
D. Farias, C. Gall, G. Narayan, S. Rest, V. A. Villar, C. R. Angus, K. Auchettl, K. W. Davis, R. Foley, A. Gagliano, J. Hjorth, L. Izzo, C. D. Kilpatrick, H .M. L. Perkins, E. Ramirez-Ruiz, C. L. Ransome, Sarangi. A., R. Yarza, D. A. Coulter, D. O. Jones, N. Khetan, A. Rest, M. R. Siebert, J. J. Swift, K. Taggart, S. Tinyanont, P. Wrubel, T. J. L. de Boer, K. E. Clever, A. Dhara, H. Gao, C.-C. Lin
[ arXiv:2409.01359 ]

Abstract

We present a comprehensive analysis of the photometric and spectroscopic evolution of SN~2021foa, unique among the class of transitional supernovae for repeatedly changing its spectroscopic appearance from hydrogen-to-helium-to-hydrogen-dominated (IIn-to-Ibn-to-IIn) within 50 days past peak brightness. The spectra exhibit multiple narrow (≈ 300--600~km~s−1) absorption lines of hydrogen, helium, calcium and iron together with broad helium emission lines with a full-width-at-half-maximum (FWHM) of ∼6000~km~s−1. For a steady, wind-mass loss regime, light curve modeling results in an ejecta mass of ∼8 M⊙ and CSM mass below 1 M⊙, and an ejecta velocity consistent with the FWHM of the broad helium lines. We obtain a mass-loss rate of ≈2 M⊙yr−1. This mass-loss rate is three orders of magnitude larger than derived for normal Type II SNe. We estimate that the bulk of the CSM of SN~2021foa must have been expelled within half a year, about 15 years ago. Our analysis suggests that SN~2021foa had a helium rich ejecta which swept up a dense shell of hydrogen rich CSM shortly after explosion. At about 60 days past peak brightness, the photosphere recedes through the dense ejecta-CSM region, occulting much of the red-shifted emission of the hydrogen and helium lines, which results in observed blue-shift (∼−3000~km~s−1). Strong mass loss activity prior to explosion, such as those seen in SN~2009ip-like objects and SN~2021foa as precursor emission, are the likely origin of a complex, multiple-shell CSM close to the progenitor star.

Finding the Fuse: Prospects for the Detection and Characterization of Hydrogen-Rich Core-Collapse Supernova Precursor Emission with the LSST
A. Gagliano, E. Berger, V. A. Villar, D. Hiramatsu, R. Kessler, T. Matsumoto, A. Gilkis, E. Laplace
The Astrophysical Journal, Volume 978, Number 1, 2024 [ arXiv:2408.13314 ]

Abstract

Enhanced emission in the months to years preceding explosion has been detected for several core-collapse supernovae (SNe). Though the physical mechanisms driving the emission remain hotly debated, the light curves of detected events show long-lived (≥50 days), plateau-like behavior, suggesting hydrogen recombination may significantly contribute to the total energy budget. The Vera C. Rubin Observatory's Legacy Survey of Space and Time (LSST) will provide a decade-long photometric baseline to search for this emission, both in binned pre-explosion observations after an SN is detected and in single-visit observations prior to the SN explosion. In anticipation of these searches, we simulate a range of eruptive precursor models to core-collapse SNe and forecast the discovery rates of these phenomena in LSST data. We find a detection rate of ~40-130 yr−1 for SN IIP/IIL precursors and ~110 yr−1 for SN IIn precursors in single-epoch photometry. Considering the first three years of observations with the effects of rolling and observing triplets included, this number grows to a total of 150-400 in binned photometry, with the highest number recovered when binning in 100-day bins for 2020tlf-like precursors and in 20-day bins for other recombination-driven models from the literature. We quantify the impact of using templates contaminated by residual light (from either long-lived or separate precursor emission) on these detection rates, and explore strategies for estimating baseline flux to mitigate these issues. Spectroscopic follow-up of the eruptions preceding core-collapse SNe and detected with LSST will offer important clues to the underlying drivers of terminal-stage mass loss in massive stars.

Multiple testing for signal-agnostic searches of new physics with machine learning
Gaia Grosso, Marco Letizia
The European Physical Journal, Volume 85, article number 4, 2025 [ arXiv:2408.12296 | code ]

Abstract

In this work, we address the question of how to enhance signal-agnostic searches by leveraging multiple testing strategies. Specifically, we consider hypothesis tests relying on machine learning, where model selection can introduce a bias towards specific families of new physics signals. We show that it is beneficial to combine different tests, characterised by distinct choices of hyperparameters, and that performances comparable to the best available test are generally achieved while providing a more uniform response to various types of anomalies. Focusing on the New Physics Learning Machine, a methodology to perform a signal-agnostic likelihood-ratio test, we explore a number of approaches to multiple testing, such as combining p-values and aggregating test statistics.

Enhancing events in neutrino telescopes through deep-learning-driven superresolution
Felix J. Yu, Nicholas Kamp, Carlos A. Argüelles
Physical Review D, 2025, Volume 111, Issue 4 [ arXiv:2408.08474 | code ]

Abstract

Recent discoveries by neutrino telescopes, such as the IceCube Neutrino Observatory, relied extensively on machine learning (ML) tools to infer physical quantities from the raw photon hits detected. Neutrino telescope reconstruction algorithms are limited by the sparse sampling of photons by the optical modules due to the relatively large spacing (10−100m) between them. In this letter, we propose a novel technique that learns photon transport through the detector medium through the use of deep learning-driven super-resolution of data events. These ``improved'' events can then be reconstructed using traditional or ML techniques, resulting in improved resolution. Our strategy arranges additional ``virtual'' optical modules within an existing detector geometry and trains a convolutional neural network to predict the hits on these virtual optical modules. We show that this technique improves the angular reconstruction of muons in a generic ice-based neutrino telescope. Our results readily extend to water-based neutrino telescopes and other event morphologies.

Moment Unfolding
Krish Desai, Benjamin Nachman, Jesse Thaler
Physical Review D, Vol. 110, Iss. 11, 2024 [ arXiv:2407.11284 | code ]

Abstract

Deconvolving ('unfolding') detector distortions is a critical step in the comparison of cross section measurements with theoretical predictions in particle and nuclear physics. However, most existing approaches require histogram binning while many theoretical predictions are at the level of statistical moments. We develop a new approach to directly unfold distribution moments as a function of another observable without having to first discretize the data. Our Moment Unfolding technique uses machine learning and is inspired by Generative Adversarial Networks (GANs). We demonstrate the performance of this approach using jet substructure measurements in collider physics. With this illustrative example, we find that our Moment Unfolding protocol is more precise than bin-based approaches and is as or more precise than completely unbinned methods.

Anomaly-aware summary statistic from data batches
Gaia Grosso
Journal of High Energy Physics, Volume 2024, article number 93, 2024 [ arXiv:2407.01249 ]

Abstract

Signal-agnostic data exploration based on machine learning could unveil very subtle statistical deviations of collider data from the expected Standard Model of particle physics. The beneficial impact of a large training sample on machine learning solutions motivates the exploration of increasingly large and inclusive samples of acquired data with resource efficient computational methods. In this work we consider the New Physics Learning Machine (NPLM), a multivariate goodness-of-fit test built on the Neyman-Pearson maximum-likelihood-ratio construction, and we address the problem of testing large size samples under computational and storage resource constraints. We propose to perform parallel NPLM routines over batches of the data, and to combine them by locally aggregating over the data-to-reference density ratios learnt by each batch. The resulting data hypothesis defining the likelihood-ratio test is thus shared over the batches, and complies with the assumption that the expected rate of new physical processes is time invariant. We show that this method outperforms the simple sum of the independent tests run over the batches, and can recover, or even surpass, the sensitivity of the single test run over the full data. Beside the significant advantage for the offline application of NPLM to large size samples, the proposed approach offers new prospects toward the use of NPLM to construct anomaly-aware summary statistics in quasi-online data streaming scenarios.

Demonstration of neutron identification in neutrino interactions in the MicroBooNE liquid argon time projection chamber
MicroBooNE collaboration: P. Abratenko, O. Alterkait, D. Andrade Aldana, L. Arellano, J. Asaadi, A. Ashkenazi, S. Balasubramanian, B. Baller, A. Barnard, G. Barr, D. Barrow, J. Barrow, V. Basque, J. Bateman, O. Benevides Rodrigues, S. Berkman, A. Bhanderi, A. Bhat, M. Bhattacharya, M. Bishai, A. Blake, B. Bogart, T. Bolton, J.Y. Book, M.B. Brunetti, L. Camilleri, Y. Cao, D. Caratelli, F. Cavanna, G. Cerati, A. Chappell, Y. Chen, J.M. Conrad, M. Convery, L. Cooper-Troendle, J.I. Crespo-Anadon, R. Cross, M. Del Tutto, S.R. Dennis, P. Detje, R. Diurba, Z. Djurcic, R. Dorrill, K. Duffy, S. Dytman, B. Eberly, P. Englezos, A. Ereditato, J.J. Evans, R. Fine, B.T. Fleming, W. Foreman, D. Franco, A.P. Furmanski, F. Gao, D. Garcia-Gamez, S. Gardiner, G. Ge, S. Gollapinni, E. Gramellini, P. Green, H. Greenlee, L. Gu, W. Gu, R. Guenette, P. Guzowski, L. Hagaman, M. D. Handley, O. Hen, C. Hilgenberg, G.A. Horton-Smith, Z. Imani, B. Irwin, M.S. Ismail, C. James, X. Ji, J.H. Jo, R.A. Johnson, Y.J. Jwa, D. Kalra, N. Kamp, G. Karagiorgi, W. Ketchum, M. Kirby, T. Kobilarcik, I. Kreslo, N. Lane, J.-Y. Li, Y. Li, K. Lin, B.R. Littlejohn, H. Liu, W.C. Louis, X. Luo, C. Mariani, D. Marsden, J. Marshall, N. Martinez, D.A. Martinez Caicedo et al.
The European Physical Journal C, Volume 84, article number 1052, 2024 [ arXiv:2406.10583 ]

Abstract

A significant challenge in measurements of neutrino oscillations is reconstructing the incoming neutrino energies. While modern fully-active tracking calorimeters such as liquid argon time projection chambers in principle allow the measurement of all final state particles above some detection threshold, undetected neutrons remain a considerable source of missing energy with little to no data constraining their production rates and kinematics. We present the first demonstration of tagging neutrino-induced neutrons in liquid argon time projection chambers using secondary protons emitted from neutron-argon interactions in the MicroBooNE detector. We describe the method developed to identify neutrino-induced neutrons and demonstrate its performance using neutrons produced in muon-neutrino charged current interactions. The method is validated using a small subset of MicroBooNE's total dataset. The selection yields a sample with 60% of selected tracks corresponding to neutron-induced secondary protons.

Improving neutrino energy estimation of charged-current interaction events with recurrent neural networks in MicroBooNE
MicroBooNE collaboration: P. Abratenko, O. Alterkait, D. Andrade Aldana, L. Arellano, J. Asaadi, A. Ashkenazi, S. Balasubramanian, B. Baller, A. Barnard, G. Barr, D. Barrow, J. Barrow, V. Basque, J. Bateman, O. Benevides Rodrigues, S. Berkman, A. Bhanderi, A. Bhat, M. Bhattacharya, M. Bishai, A. Blake, B. Bogart, T. Bolton, J.Y. Book, M.B. Brunetti, L. Camilleri, Y. Cao, D. Caratelli, F. Cavanna, G. Cerati, A. Chappell, Y. Chen, J.M. Conrad, M. Convery, L. Cooper-Troendle, J.I. Crespo-Anadon, R. Cross, M. Del Tutto, S.R. Dennis, P. Detje, R. Diurba, Z. Djurcic, R. Dorrill, K. Duffy, S. Dytman, B. Eberly, P. Englezos, A. Ereditato, J.J. Evans, R. Fine, B.T. Fleming, W. Foreman, D. Franco, A.P. Furmanski, F. Gao, D. Garcia-Gamez, S. Gardiner, G. Ge, S. Gollapinni, E. Gramellini, P. Green, H. Greenlee, L. Gu, W. Gu, R. Guenette, P. Guzowski, L. Hagaman, O. Hen, C. Hilgenberg, G.A. Horton-Smith, Z. Imani, B. Irwin, M.S. Ismail, C. James, X. Ji, J.H. Jo, R.A. Johnson, Y.J. Jwa, D. Kalra, N. Kamp, G. Karagiorgi, W. Ketchum, M. Kirby, T. Kobilarcik, I. Kreslo, N. Lane, I. Lepetic, J.-Y. Li, Y. Li, K. Lin, B.R. Littlejohn, H. Liu, W.C. Louis, X. Luo, C. Mariani, D. Marsden, J. Marshall, N. Martinez, D.A. Martinez Caicedo et al.
Physical Review D, 2024, Volume 110, Issue 9 [ arXiv:2406.10123 | code ]

Abstract

We present a deep learning-based method for estimating the neutrino energy of charged-current neutrino-argon interactions. We employ a recurrent neural network (RNN) architecture for neutrino energy estimation in the MicroBooNE experiment, utilizing liquid argon time projection chamber (LArTPC) detector technology. Traditional energy estimation approaches in LArTPCs, which largely rely on reconstructing and summing visible energies, often experience sizable biases and resolution smearing because of the complex nature of neutrino interactions and the detector response. The estimation of neutrino energy can be improved after considering the kinematics information of reconstructed final-state particles. Utilizing kinematic information of reconstructed particles, the deep learning-based approach shows improved resolution and reduced bias for the muon neutrino Monte Carlo simulation sample compared to the traditional approach. In order to address the common concern about the effectiveness of this method on experimental data, the RNN-based energy estimator is further examined and validated with dedicated data-simulation consistency tests using MicroBooNE data. We also assess its potential impact on a neutrino oscillation study after accounting for all statistical and systematic uncertainties and show that it enhances physics sensitivity. This method has good potential to improve the performance of other physics analyses.

SIREN: An Open Source Neutrino Injection Toolkit
Austin Schneider, Nicholas W. Kamp, Alex Y. Wen
[ arXiv:2406.01745 | code ]

Abstract

Modeling of rare neutrino processes often relies on either simple approximations or expensive detector simulations. The former is often not sufficient for interactions with complex morphologies, while the latter is too time-intensive for phenomenological studies. We present SIREN (Sampling and Injection for Rare EveNts), a new tool for neutrino phenomenology and experimental searches alike that enables accurate interaction and detector geometry modeling without the overhead of detailed detector response simulations. SIREN handles the injection of rare process final states and the associated weighting calculations with the speed needed for phenomenological investigations and the detail necessary for dedicated experimental searches. The extensible design of SIREN allows it to support a wide range of experimental designs and Beyond-Standard-Model neutrino interactions. Users need only specify the physical process, detector geometry, and initial neutrino flux under consideration before they can accurately simulate a model in their detector of choice. We demonstrate the capability of SIREN through two examples: (1) Standard Model νμ deep inelastic scattering in IceCube, DUNE, and ATLAS; and (2) heavy neutral lepton interactions in MiniBooNE, MINERνA, CCM. A variety of detector geometry descriptions, interaction cross sections, and neutrino fluxes are also provided for users to get started with immediately.

Towards Universal Unfolding of Detector Effects in High-Energy Physics using Denoising Diffusion Probabilistic Models
Camila Pazos, Shuchin Aeron, Pierre-Hugues Beauchemin, Vincent Croft, Martin Klassen, Taritree Wongjirad
[ arXiv:2406.01507 ]

Abstract

The unfolding of detector effects in experimental data is critical for enabling precision measurements in high-energy physics. However, traditional unfolding methods face challenges in scalability, flexibility, and dependence on simulations. We introduce a novel unfolding approach using conditional Denoising Diffusion Probabilistic Models (cDDPM). Our method utilizes the cDDPM for a non-iterative, flexible posterior sampling approach, which exhibits a strong inductive bias that allows it to generalize to unseen physics processes without explicitly assuming the underlying distribution. We test our approach by training a single cDDPM to perform multidimensional particle-wise unfolding for a variety of physics processes, including those not seen during training. Our results highlight the potential of this method as a step towards a 'universal' unfolding tool that reduces dependence on truth-level assumptions.

From Neurons to Neutrons: A Case Study in Interpretability
Ouail Kitouni, Niklas Nolte, Víctor Samuel Pérez-Díaz, Sokratis Trifinopoulos, Mike Williams
[ arXiv:2405.17425 | code ]

Abstract

Mechanistic Interpretability (MI) promises a path toward fully understanding how neural networks make their predictions. Prior work demonstrates that even when trained to perform simple arithmetic, models can implement a variety of algorithms (sometimes concurrently) depending on initialization and hyperparameters. Does this mean neuron-level interpretability techniques have limited applicability? We argue that high-dimensional neural networks can learn low-dimensional representations of their training data that are useful beyond simply making good predictions. Such representations can be understood through the mechanistic interpretability lens and provide insights that are surprisingly faithful to human-derived domain knowledge. This indicates that such approaches to interpretability can be useful for deriving a new understanding of a problem from models trained to solve it. As a case study, we extract nuclear physics concepts by studying models trained to reproduce nuclear data.

Resonant Neutrino Flavor Conversion in the Atmosphere
Connor Sponsler, Matheus Hostert, Ivan Martinez-Soler, Carlos A. Argüelles
[ arXiv:2405.12140 ]

Abstract

Neutrinos produced in the atmosphere traverse a column density of air before being detected at neutrino observatories like IceCube or KM3NeT. In this work, we extend the neutrino flavor evolution in the {nuSQuIDS} code accounting for the varying height of neutrino production and the variable air density in the atmosphere. These effects can lead to sizeable spectral distortions in standard neutrino oscillations and are crucial to accurately describe some new physics scenarios. As an example, we study a model of quasi-sterile neutrinos that induce resonant flavor conversions at neutrino energies of O(300) MeV in matter densities of 1 g/cm³. In atmospheric air densities, the same resonance is then realized at neutrino energies of O(300−700)~GeV. We find that the new resonance can deplete the $\nu$_μ+$\bar{\nu}$_μ flux at the IceCube Neutrino Observatory by as much as 10% in the direction of the horizon.

Measurement of atmospheric neutrino oscillation parameters using convolutional neural networks with 9.3 years of data in IceCube DeepCore
IceCube Collaboration
Physical Review Letters, 134, 091801, 2025 [ arXiv:2405.02163 ]

Abstract

The DeepCore sub-detector of the IceCube Neutrino Observatory provides access to neutrinos with energies above approximately 5 GeV. Data taken between 2012-2021 (3,387 days) are utilized for an atmospheric νμ disappearance analysis that studied 150,257 neutrino-candidate events with reconstructed energies between 5-100 GeV. An advanced reconstruction based on a convolutional neural network is applied, providing increased signal efficiency and background suppression, resulting in a measurement with both significantly increased statistics compared to previous DeepCore oscillation results and high neutrino purity. For the normal neutrino mass ordering, the atmospheric neutrino oscillation parameters and their 1σ errors are measured to be Δm232 = 2.40+0.05−0.04×10−3 eV2 and sin2θ23=0.54+0.04−0.03. The results are the most precise to date using atmospheric neutrinos, and are compatible with measurements from other neutrino detectors including long-baseline accelerator experiments.

Resimulation-based self-supervised learning for pretraining physics foundation models
Philip Harris, Michael Kagan, Jeffrey Krupa, Benedikt Maier, Nathaniel Woodward
Physical Review D, 2025, Volume 111, Issue 3 [ arXiv:2403.07066 ]

Abstract

Self-Supervised Learning (SSL) is at the core of training modern large machine learning models, providing a scheme for learning powerful representations that can be used in a variety of downstream tasks. However, SSL strategies must be adapted to the type of training data and downstream tasks required. We propose RS3L ('Re-simulation-based self-supervised representation learning'), a novel simulation-based SSL strategy that employs a method of re-simulation to drive data augmentation for contrastive learning in the physical sciences, particularly, in fields that rely on stochastic simulators. By intervening in the middle of the simulation process and re-running simulation components downstream of the intervention, we generate multiple realizations of an event, thus producing a set of augmentations covering all physics-driven variations available in the simulator. Using experiments from high-energy physics, we explore how this strategy may enable the development of a foundation model; we show how RS3L pre-training enables powerful performance in downstream tasks such as discrimination of a variety of objects and uncertainty mitigation. In addition to our results, we make the RS3L dataset publicly available for further studies on how to improve SSL strategies.

New Pathways in Neutrino Physics via Quantum-Encoded Data Analysis
Jeffrey Lazar, Santiago Giner Olavarrieta, Giancarlo Gatti, Carlos A. Argüelles, Mikel Sanz
[ arXiv:2402.19306 ]

Abstract

Ever-increasing amount of data is produced by particle detectors in their quest to unveil the laws of Nature. The large data rate requires the use of specialized triggers that promptly reduce the data rate to a manageable level; however, in doing so, unexpected new phenomena may escape detection. Additionally, the large data rate is increasingly difficult to analyze effectively, which has led to a recent revolution on machine learning techniques. Here, we present a methodology based on recent quantum compression techniques that has the capacity to store exponentially more amount of information than classically available methods. To demonstrate this, we encode the full neutrino telescope event information using parity observables in an IBM quantum processor using 8 qubits. Then we show that we can recover the information stored on the quantum computer with a fidelity of 84%. Finally, we illustrate the use of our protocol by performing a classification task that separates electron-neutrino events to muon-neutrinos events in a neutrino telescope. This new capability would eventually allow us to solve the street light effect in particle physics, where we only record signatures of particles with which we are familiar.

Seeing Double: Calibrating Two Jets at Once
Rikab Gambhir, Benjamin Nachman
Physical Review D, Volume 110, Issue 7, 2024 [ arXiv:2402.14067 | code ]

Abstract

Jet energy calibration is an important aspect of many measurements and searches at the LHC. Currently, these calibrations are performed on a per-jet basis, i.e. agnostic to the properties of other jets in the same event. In this work, we propose taking advantage of the correlations induced by momentum conservation between jets in order to improve their jet energy calibration. By fitting the pT asymmetry of dijet events in simulation, while remaining agnostic to the pT spectra themselves, we are able to obtain correlation-improved maximum likelihood estimates. This approach is demonstrated with simulated jets from the CMS Detector, yielding a 3-5% relative improvement in the jet energy resolution, corresponding to a quadrature improvement of approximately 35\%.

Applications of Lipschitz neural networks to the Run 3 LHCb trigger system
Blaise Delaney, Nicole Schulte, Gregory Ciezarek, Niklas Nolte, Mike Williams, Johannes Albrecht
[ arXiv:2312.14265 ]

Abstract

The operating conditions defining the current data taking campaign at the Large Hadron Collider, known as Run 3, present unparalleled challenges for the real-time data acquisition workflow of the LHCb experiment at CERN. To address the anticipated surge in luminosity and consequent event rate, the LHCb experiment is transitioning to a fully software-based trigger system. This evolution necessitated innovations in hardware configurations, software paradigms, and algorithmic design. A significant advancement is the integration of monotonic Lipschitz neural networks into the LHCb trigger system. These deep learning models offer certified robustness against detector instabilities, and the ability to encode domain-specific inductive biases. Such properties are crucial for the inclusive heavy-flavour triggers and, most notably, for the topological triggers designed to inclusively select b-hadron candidates by exploiting the unique kinematic and decay topologies of beauty decays. This paper describes the recent progress in integrating Lipschitz neural networks into the topological triggers, highlighting the resulting enhanced sensitivity to highly displaced multi-body candidates produced within the LHCb acceptance.

First search for dark-trident processes using the MicroBooNE detector
MicroBooNE collaboration
[ arXiv:2312.13945 ]

Abstract

We present a first search for dark-trident scattering in a neutrino beam using a data set corresponding to 7.2×1020 protons on target taken with the MicroBooNE detector at Fermilab. Proton interactions in the neutrino target at the Main Injector produce π0 and η mesons, which could decay into dark-matter (DM) particles mediated via a dark photon A′. A convolutional neural network is trained to identify interactions of the DM particles in the liquid-argon time projection chamber (LArTPC) exploiting its image-like reconstruction capability. In the absence of a DM signal, we provide limits at the 90% confidence level on the squared kinematic mixing parameter ε2 as a function of the dark-photon mass in the range 10≤MA′≤400 MeV. The limits cover previously unconstrained parameter space for the production of fermion or scalar DM particles χ for two benchmark models with mass ratios Mχ/MA′=0.6 and 2 and for dark fine-structure constants 0.1≤αD≤1.

Two Watts is All You Need: Enabling In-Detector Real-Time Machine Learning for Neutrino Telescopes Via Edge Computing
Miaochen Jin, Yushi Hu, Carlos A. Argüelles
[ arXiv:2311.04983 ]

Abstract

The use of machine learning techniques has significantly increased the physics discovery potential of neutrino telescopes. In the upcoming years, we are expecting upgrade of currently existing detectors and new telescopes with novel experimental hardware, yielding more statistics as well as more complicated data signals. This calls out for an upgrade on the software side needed to handle this more complex data in a more efficient way. Specifically, we seek low power and fast software methods to achieve real-time signal processing, where current machine learning methods are too expensive to be deployed in the resource-constrained regions where these experiments are located. We present the first attempt at and a proof-of-concept for enabling machine learning methods to be deployed in-detector for water/ice neutrino telescopes via quantization and deployment on Google Edge Tensor Processing Units (TPUs). We design a recursive neural network with a residual convolutional embedding, and adapt a quantization process to deploy the algorithm on a Google Edge TPU. This algorithm can achieve similar reconstruction accuracy compared with traditional GPU-based machine learning solutions while requiring the same amount of power compared with CPU-based regression solutions, combining the high accuracy and low power advantages and enabling real-time in-detector machine learning in even the most power-restricted environments.

Search for heavy neutral leptons in electron-positron and neutral-pion final states with the MicroBooNE detector
MicroBooNE collaboration
[ arXiv:2310.07660 ]

Abstract

We present the first search for heavy neutral leptons (HNL) decaying into νe+e− or νπ0 final states in a liquid-argon time projection chamber using data collected with the MicroBooNE detector. The data were recorded synchronously with the NuMI neutrino beam from Fermilab's Main Injector corresponding to a total exposure of 7.01×1020 protons on target. We set upper limits at the 90% confidence level on the mixing parameter |Uμ4|2 in the mass ranges 10≤mHNL≤150 MeV for the νe+e− channel and 150≤mHNL≤245 MeV for the νπ0 channel, assuming |Ue4|2=|Uτ4|2=0. These limits represent the most stringent constraints in the mass range 35<mHNL<175 MeV and the first constraints from a direct search for νπ0 decays.

Chained Quantile Morphing with Normalizing Flows
Samuel Bright-Thonney, Philip Harris, Patrick McCormack, Simon Rothman
[ arXiv:2309.15912 ]

Abstract

Accounting for inaccuracies in Monte Carlo simulations is a crucial step in any high energy physics analysis. It becomes especially important when training machine learning models, which can amplify simulation inaccuracies and introduce large discrepancies and systematic uncertainties when the model is applied to data. In this paper, we introduce a method to transform simulated events to better match data using normalizing flows, a class of deep learning-based density estimation models. Our proposal uses a technique called chained quantile morphing, which corrects a set of observables by iteratively shifting each entry according to a conditonal cumulative density function. We demonstrate the technique on a realistic particle physics dataset, and compare it to a neural network-based reweighting method. We also introduce a new contrastive learning technique to correct high dimensional particle-level inputs, which naively cannot be efficiently corrected with morphing strategies.

GWAK: Gravitational-Wave Anomalous Knowledge with Recurrent Autoencoders
Ryan Raikman, Eric A. Moreno, Ekaterina Govorkova, Ethan J Marx, Alec Gunny, William Benoit, Deep Chatterjee, Rafia Omer, Muhammed Saleem, Dylan S Rankin, Michael W Coughlin, Philip C Harris, Erik Katsavounidis
Journal of High Energy Physics 2024, Volume 2024, Article number 158 [ arXiv:2309.11537 | code ]

Abstract

Matched-filtering detection techniques for gravitational-wave (GW) signals in ground-based interferometers rely on having well-modeled templates of the GW emission. Such techniques have been traditionally used in searches for compact binary coalescences (CBCs), and have been employed in all known GW detections so far. However, interesting science cases aside from compact mergers do not yet have accurate enough modeling to make matched filtering possible, including core-collapse supernovae and sources where stochasticity may be involved. Therefore the development of techniques to identify sources of these types is of significant interest. In this paper, we present a method of anomaly detection based on deep recurrent autoencoders to enhance the search region to unmodeled transients. We use a semi-supervised strategy that we name Gravitational Wave Anomalous Knowledge (GWAK). While the semi-supervised nature of the problem comes with a cost in terms of accuracy as compared to supervised techniques, there is a qualitative advantage in generalizing experimental sensitivity beyond pre-computed signal templates. We construct a low-dimensional embedded space using the GWAK method, capturing the physical signatures of distinct signals on each axis of the space. By introducing signal priors that capture some of the salient features of GW signals, we allow for the recovery of sensitivity even when an unmodeled anomaly is encountered. We show that regions of the GWAK space can identify CBCs, detector glitches and also a variety of unmodeled astrophysical sources.

FLORAH: A generative model for halo assembly histories
Tri Nguyen, Chirag Modi, L. Y. Aaron Yung, Rachel S. Somerville
Monthly Notices of the Royal Astronomical Society, 2024, Volume 533, Issue 3 [ arXiv:2308.05145 | code ]

Abstract

The mass assembly history (MAH) of dark matter halos plays a crucial role in shaping the formation and evolution of galaxies. MAHs are used extensively in semi-analytic and empirical models of galaxy formation, yet current analytic methods to generate them are inaccurate and unable to capture their relationship with the halo internal structure and large-scale environment. This paper introduces FLORAH, a machine-learning framework for generating assembly histories of ensembles of dark matter halos. We train FLORAH on the assembly histories from the GUREFT and VSMDPL N-body simulations and demonstrate its ability to recover key properties such as the time evolution of mass and concentration. We obtain similar results for the galaxy stellar mass versus halo mass relation and its residuals when we run the Santa Cruz semi-analytic model on FLORAH-generated assembly histories and halo formation histories extracted from an N-body simulation. We further show that FLORAH also reproduces the dependence of clustering on properties other than mass (assembly bias), which is not captured by other analytic methods. By combining multiple networks trained on a suite of simulations with different redshift ranges and mass resolutions, we are able to construct accurate main progenitor branches (MPBs) with a wide dynamic mass range from z=0 up to an ultra-high redshift z≈20, currently far beyond that of a single N-body simulation. FLORAH is the first step towards a machine learning-based framework for planting full merger trees; this will enable the exploration of different galaxy formation scenarios with great computational efficiency at unprecedented accuracy.

First demonstration for a LArTPC-based search for intranuclear neutron-antineutron transitions and annihilation in 40Ar using the MicroBooNE detector
MicroBooNE collaboration
[ arXiv:2308.03924 ]

Abstract

In this paper, we present a novel methodology to search for intranuclear neutron-antineutron transition (n→n¯) followed by annihilation within an 40Ar nucleus, using the MicroBooNE liquid argon time projection chamber (LArTPC) detector. A discovery of n→n¯ transition or increased lower limit on the lifetime of this process would either constitute physics beyond the Standard Model or greatly constrain theories of baryogenesis, respectively. The approach presented in this paper makes use of deep learning methods to select n→n¯ events based on their unique features and differentiate them from cosmogenic backgrounds. The achieved signal and background efficiencies are (70±6)\% and (0.0020±0.0003)\%, respectively. A demonstration of a search is performed with a data set corresponding to an exposure of 3.32×1026neutron-years, and where the background rate is constrained through direct measurement, assuming the presence of a negligible signal. With this approach, no excess of events over the background prediction is observed, setting a demonstrative lower bound on the n→n¯ lifetime in 40Ar of τm>1.1×1026years, and on the free n→n¯ transition time of τn−n¯>2.6×105s, each at the 90% confidence level. This analysis represents a first-ever proof-of-principle demonstration of the ability to search for this rare process in LArTPCs with high efficiency and low background.

NuCLR, Nuclear Co-Learned Representations
Ouail Kitouni, Niklas Nolte, Sokratis Trifinopoulos, Subhash Kantamneni, Mike Williams
[ arXiv:2306.06099 ]

Abstract

We introduce Nuclear Co-Learned Representations (NuCLR), a deep learning model that predicts various nuclear observables, including binding and decay energies, and nuclear charge radii. The model is trained using a multi-task approach with shared representations and obtains state-of-the-art performance, achieving levels of precision that are crucial for understanding fundamental phenomena in nuclear (astro)physics. We also report an intriguing finding that the learned representation of NuCLR exhibits the prominent emergence of crucial aspects of the nuclear shell model, namely the shell structure, including the well-known magic numbers, and the Pauli Exclusion Principle. This suggests that the model is capable of capturing the underlying physical principles and that our approach has the potential to offer valuable insights into nuclear theory.

Development of the Topological Trigger for LHCb Run 3
Nicole Schulte, Blaise Raheem Delaney, Niklas Nolte, Gregory Max Ciezarek, Johannes Albrecht, Mike Williams
[ arXiv:2306.09873 ]

Abstract

The data-taking conditions expected in Run 3 of the LHCb experiment at CERN are unprecedented and challenging for the software and computing systems. Despite that, the LHCb collaboration pioneers the use of a software-only trigger system to cope with the increased event rate efficiently. The beauty physics programme of LHCb is heavily reliant on topological triggers. These are devoted to selecting beauty-hadron candidates inclusively, based on the characteristic decay topology and kinematic properties expected from beauty decays. The following proceeding describes the current progress of the Run 3 implementation of the topological triggers using Lipschitz monotonic neural networks. This architecture offers robustness under varying detector conditions and sensitivity to long-lived candidates, improving the possibility of discovering New Physics at LHCb.

Symbolic Regression on FPGAs for Fast Machine Learning Inference
Ho Fung Tsoi, Adrian Alan Pol, Vladimir Loncar, Ekaterina Govorkova, Miles Cranmer, Sridhara Dasu, Peter Elmer, Philip Harris, Isobel Ojalvo, Maurizio Pierini
EPJ Web of Conferences 2024, Volume 295 [ arXiv:2305.04099 | code ]

Abstract

The high-energy physics community is investigating the feasibility of deploying machine-learning-based solutions on Field-Programmable Gate Arrays (FPGAs) to improve physics sensitivity while meeting data processing latency limitations. In this contribution, we introduce a novel end-to-end procedure that utilizes a machine learning technique called symbolic regression (SR). It searches equation space to discover algebraic relations approximating a dataset. We use PySR (software for uncovering these expressions based on evolutionary algorithm) and extend the functionality of hls4ml (a package for machine learning inference in FPGAs) to support PySR-generated expressions for resource-constrained production environments. Deep learning models often optimise the top metric by pinning the network size because vast hyperparameter space prevents extensive neural architecture search. Conversely, SR selects a set of models on the Pareto front, which allows for optimising the performance-resource tradeoff directly. By embedding symbolic forms, our implementation can dramatically reduce the computational resources needed to perform critical tasks. We validate our procedure on a physics benchmark: multiclass classification of jets produced in simulated proton-proton collisions at the CERN Large Hadron Collider, and show that we approximate a 3-layer neural network with an inference model that has as low as 5 ns execution time (a reduction by a factor of 13) and over 90% approximation accuracy.

Pileup and Infrared Radiation Annihilation (PIRANHA): A Paradigm for Continuous Jet Grooming
Samuel Alipour-Fard, Patrick T. Komiske, Eric M. Metodiev, Jesse Thaler
Journal of High Energy Physics 2023, Volume 2023, Article number 157 [ arXiv:2305.00989 | code ]

Abstract

Jet grooming is an important strategy for analyzing relativistic particle collisions in the presence of contaminating radiation. Most jet grooming techniques introduce hard cutoffs to remove soft radiation, leading to discontinuous behavior and associated experimental and theoretical challenges. In this paper, we introduce Pileup and Infrared Radiation Annihilation (PIRANHA), a paradigm for continuous jet grooming that overcomes the discontinuity and infrared sensitivity of hard-cutoff grooming procedures. We motivate PIRANHA from the perspective of optimal transport and the Energy Movers Distance and review Apollonius Subtraction and Iterated Voronoi Subtraction as examples of PIRANHA-style grooming. We then introduce a new tree-based implementation of PIRANHA, Recursive Subtraction, with reduced computational costs. Finally, we demonstrate the performance of Recursive Subtraction in mitigating sensitivity to soft distortions from hadronization and detector effects, and additive contamination from pileup and the underlying event.

Prometheus: An Open-Source Neutrino Telescope Simulation
Jeffrey Lazar, Stephan Meighen-Berger, Christian Haack, David Kim, Santiago Giner, Carlos A. Argüelles
[ arXiv:2304.14526 | code ]

Abstract

Neutrino telescopes are gigaton-scale neutrino detectors comprised of individual light-detection units. Though constructed from simple building blocks, they have opened a new window to the Universe and are able to probe center-of-mass energies that are comparable to those of collider experiments. \prometheus{} is a new, open-source simulation tailored for this kind of detector. Our package, which is written in a combination of \texttt{C++} and \texttt{Python} provides a balance of ease of use and performance and allows the user to simulate a neutrino telescope with arbitrary geometry deployed in ice or water. \prometheus{} simulates the neutrino interactions in the volume surrounding the detector, computes the light yield of the hadronic shower and the out-going lepton, propagates the photons in the medium, and records their arrival times and position in user-defined regions. Finally, \prometheus{} events are serialized into a \texttt{parquet} file, which is a compact and interoperational file format that allows prompt access to the events for further analysis.

Expressive Monotonic Neural Networks
Niklas Nolte, Ouail Kitouni, Mike Williams
International Conference on Learning Representations 2023 [ ]

Abstract

The monotonic dependence of the outputs of a neural network on some of its inputs is a crucial inductive bias in many scenarios where domain knowledge dic- tates such behavior. This is especially important for interpretability and fairness considerations. In a broader context, scenarios in which monotonicity is impor- tant can be found in finance, medicine, physics, and other disciplines. It is thus desirable to build neural network architectures that implement this inductive bias provably. In this work, we propose a weight-constrained architecture with a single residual connection to achieve exact monotonic dependence in any subset of the inputs. The weight constraint scheme directly controls the Lipschitz constant of the neural network and thus provides the additional benefit of robustness. Com- pared to currently existing techniques used for monotonicity, our method is sim- pler in implementation and in theory foundations, has negligible computational overhead, is guaranteed to produce monotonic dependence, and is highly expres- sive. We show how the algorithm is used to train powerful, robust, and inter- pretable discriminators that achieve competitive performance compared to current state-of-the-art methods across various benchmarks, from social applications to the classification of the decays of subatomic particles produced at the CERN Large Hadron Collider.

Non-perturbative strong coupling at timelike momenta
Jan Horak, Jan M. Pawlowski, Jonas Turnwald, Julian M. Urban, Nicolas Wink, Savvas Zafeiropoulos
Physical Review D 2023, Volume 107, Issue 7 [ arXiv:2301.08128 ]

Abstract

We compute the strong coupling constant of Landau gauge QCD in the full complex momentum plane, both directly and via spectral reconstruction. In particular, we consider the Taylor coupling given by the product of ghost and gluon dressing functions. Assuming spectral representations for the latter, we first show that also the coupling obeys such a representation. The subsequent spectral reconstruction of the coupling data, obtained from 2+1 flavour lattice QCD results for the ghost and gluon, is based on a probabilistic inversion of this representation using Gaussian process regression with analytically enforced asymptotics. In contradistinction, our direct calculation relies on earlier reconstruction results for the ghost and gluon spectral functions themselves, as well as data obtained in functional QCD. Apart from its relevance for studies of resonances or scattering processes, the calculation also serves as a non-trivial benchmark of our reconstruction approach. The results show remarkable agreement, testifying to the reliability of the method.

EPiC-GAN: Equivariant Point Cloud Generation for Particle Jets
Erik Buhmann, Gregor Kasieczka, Jesse Thaler
SciPost Physics, 2023, Volume 15, Issue 4 [ arXiv:2301.08128 | code ]

Abstract

With the vast data-collecting capabilities of current and future high-energy collider experiments, there is an increasing demand for computationally efficient simulations. Generative machine learning models enable fast event generation, yet so far these approaches are largely constrained to fixed data structures and rigid detector geometries. In this paper, we introduce EPiC-GAN - equivariant point cloud generative adversarial network - which can produce point clouds of variable multiplicity. This flexible framework is based on deep sets and is well suited for simulating sprays of particles called jets. The generator and discriminator utilize multiple EPiC layers with an interpretable global latent vector. Crucially, the EPiC layers do not rely on pairwise information sharing between particles, which leads to a significant speed-up over graph- and transformer-based approaches with more complex relation diagrams. We demonstrate that EPiC-GAN scales well to large particle multiplicities and achieves high generation fidelity on benchmark jet generation tasks.

Comparing Point Cloud Strategies for Collider Event Classification
Peter Onyisi, Delon Shen, Jesse Thaler
Physical Review D, 2023, Volume 108, Issue 1 [ arXiv:2212.10659 | code ]

Abstract

In this paper, we compare several event classification architectures defined on the point cloud representation of collider events. These approaches, which are based on the frameworks of deep sets and edge convolutions, circumvent many of the difficulties associated with traditional feature engineering. To benchmark our architectures against more traditional event classification strategies, we perform a case study involving Higgs boson decays to tau leptons. We find a 2.5 times increase in performance compared to a baseline ATLAS analysis with engineered features. Our point cloud architectures can be viewed as simplified versions of graph neural networks, where each particle in the event corresponds to a graph node. In our case study, we find the best balance of performance and computational cost for simple pairwise architectures, which are based on learned edge features.

Variational Neural-Network Ansatz for Continuum Quantum Field Theory
John M. Martyn, Khadijeh Najafi, Di Luo
APS Journals 2023, Volume 131, Issue 8 [ arXiv:2212.00782 | code ]

Abstract

Physicists dating back to Feynman have lamented the difficulties of applying the variational principle to quantum field theories. In non-relativistic quantum field theories, the challenge is to parameterize and optimize over the infinitely many n-particle wave functions comprising the state's Fock space representation. Here we approach this problem by introducing neural-network quantum field states, a deep learning ansatz that enables application of the variational principle to non-relativistic quantum field theories in the continuum. Our ansatz uses the Deep Sets neural network architecture to simultaneously parameterize all of the n-particle wave functions comprising a quantum field state. We employ our ansatz to approximate ground states of various field theories, including an inhomogeneous system and a system with long-range interactions, thus demonstrating a powerful new tool for probing quantum field theories.

Search for boosted Higgs boson decay to a charm quark-antiquark pair in proton-proton collisions at s√ = 13 TeV
CMS Collaboration
Physical Review Letters, 2023, Volume 131, Issue 4 [ arXiv:2211.14181 ]

Abstract

A search for the standard model (SM) Higgs boson (H) produced with transverse momentum greater than 450 GeV and decaying to a charm quark-antiquark (cc¯) pair is presented. The search is performed using proton-proton collision data collected at s√ = 13 TeV by the CMS experiment at the LHC, corresponding to an integrated luminosity of 138 fb−1. Boosted H→cc¯ decay products are reconstructed as a single large-radius jet and identified using a deep neural network charm tagging technique. The method is validated by measurement of the Z→cc¯ decay process, which is observed with a signal strength of 1.00+0.17−0.14 (syst) ± 0.08 (theo) ± 0.06 (stat), defined as the ratio of the observed process rate to the standard model expectation. The observed (expected) upper limit on σ(H)(H→cc¯) is set at 47 (39) times the SM prediction at 95% confidence level.

Finding NEEMo: Geometric Fitting using Neural Estimation of the Energy Mover’s Distance
Ouail Kitouni, Niklas Nolte, Mike Williams
[ arXiv:2209.15624 | code ]

Abstract

A novel neural architecture was recently developed that enforces an exact upper bound on the Lipschitz constant of the model by constraining the norm of its weights in a minimal way, resulting in higher expressiveness compared to other techniques. We present a new and interesting direction for this architecture: estimation of the Wasserstein metric (Earth Mover's Distance) in optimal transport by employing the Kantorovich-Rubinstein duality to enable its use in geometric fitting applications. Specifically, we focus on the field of high-energy particle physics, where it has been shown that a metric for the space of particle-collider events can be defined based on the Wasserstein metric, referred to as the Energy Mover's Distance (EMD). This metrization has the potential to revolutionize data-driven collider phenomenology. The work presented here represents a major step towards realizing this goal by providing a differentiable way of directly calculating the EMD. We show how the flexibility that our approach enables can be used to develop novel clustering algorithms.

Neural Embedding: Learning the Embedding of the Manifold of Physics Data
Sang Eon Park, Philip Harris, Bryan Ostdiek
Journal of High Energy Physics, 2023, Volume 2023, Article 108 [ arXiv:2208.05484 ]

Abstract

In this paper, we present a method of embedding physics data manifolds with metric structure into lower dimensional spaces with simpler metrics, such as Euclidean and Hyperbolic spaces. We then demonstrate that it can be a powerful step in the data analysis pipeline for many applications. Using progressively more realistic simulated collisions at the Large Hadron Collider, we show that this embedding approach learns the underlying latent structure. With the notion of volume in Euclidean spaces, we provide for the first time a viable solution to quantifying the true search capability of model agnostic search algorithms in collider physics (i.e. anomaly detection). Finally, we discuss how the ideas presented in this paper can be employed to solve many practical challenges that require the extraction of physically meaningful representations from information in complex high dimensional datasets.

Bias and Priors in Machine Learning Calibrations for High Energy Physics
Rikab Gambhir, Benjamin Nachman, Jesse Thaler
Physical Review D, Volume 106, Article 036011 [ arXiv:2205.05084 | code ]

Abstract

Machine learning offers an exciting opportunity to improve the calibration of nearly all reconstructed objects in high-energy physics detectors. However, machine learning approaches often depend on the spectra of examples used during training, an issue known as prior dependence. This is an undesirable property of a calibration, which needs to be applicable in a variety of environments. The purpose of this paper is to explicitly highlight the prior dependence of some machine learning-based calibration strategies. We demonstrate how some recent proposals for both simulation-based and data-based calibrations inherit properties of the sample used for training, which can result in biases for downstream analyses. In the case of simulation-based calibration, we argue that our recently proposed Gaussian Ansatz approach can avoid some of the pitfalls of prior dependence, whereas prior-independent data-based calibration remains an open problem.

Learning Uncertainties the Frequentist Way: Calibration and Correlation in High Energy Physics
Rikab Gambhir, Benjamin Nachman, Jesse Thaler
Physical Review Letters, 2022, Volume 129, Article 082001 [ arXiv:2205.03413 | code ]

Abstract

Calibration is a common experimental physics problem, whose goal is to infer the value and uncertainty of an unobservable quantity Z given a measured quantity X. Additionally, one would like to quantify the extent to which X and Z are correlated. In this paper, we present a machine learning framework for performing frequentist maximum likelihood inference with Gaussian uncertainty estimation, which also quantifies the mutual information between the unobservable and measured quantities. This framework uses the Donsker-Varadhan representation of the Kullback-Leibler divergence -- parametrized with a novel Gaussian Ansatz -- to enable a simultaneous extraction of the maximum likelihood values, uncertainties, and mutual information in a single training. We demonstrate our framework by extracting jet energy corrections and resolution factors from a simulation of the CMS detector at the Large Hadron Collider. By leveraging the high-dimensional feature space inside jets, we improve upon the nominal CMS jet resolution by upwards of 15%.

Robust and Provably Monotonic Networks
Ouail Kitouni, Niklas Nolte, Mike Williams
Machine Learning: Science and Technology, Volume 4, Number 3, 2023 [ arXiv:2112.00038 | code ]

Abstract

The Lipschitz constant of the map between the input and output space represented by a neural network is a natural metric for assessing the robustness of the model. We present a new method to constrain the Lipschitz constant of dense deep learning models that can also be generalized to other architectures. The method relies on a simple weight normalization scheme during training that ensures the Lipschitz constant of every layer is below an upper limit specified by the analyst. A simple residual connection can then be used to make the model monotonic in any subset of its inputs, which is useful in scenarios where domain knowledge dictates such dependence. Examples can be found in algorithmic fairness requirements or, as presented here, in the classification of the decays of subatomic particles produced at the CERN Large Hadron Collider. Our normalization is minimally constraining and allows the underlying architecture to maintain higher expressiveness compared to other techniques which aim to either control the Lipschitz constant of the model or ensure its monotonicity. We show how the algorithm was used to train a powerful, robust, and interpretable discriminator for heavy-flavor decays in the LHCb realtime data-processing system.

Challenges for Unsupervised Anomaly Detection in Particle Physics
Katherine Fraser, Samuel Homiller, Rashmish K. Mishra, Bryan Ostdiek, Matthew D. Schwartz
Journal of High Energy Physics, 2022, Volume 2022, Article Number 66 [ arXiv:2110.06948 ]

Abstract

Anomaly detection relies on designing a score to determine whether a particular event is uncharacteristic of a given background distribution. One way to define a score is to use autoencoders, which rely on the ability to reconstruct certain types of data (background) but not others (signals). In this paper, we study some challenges associated with variational autoencoders, such as the dependence on hyperparameters and the metric used, in the context of anomalous signal (top and W) jets in a QCD background. We find that the hyperparameter choices strongly affect the network performance and that the optimal parameters for one signal are non-optimal for another. In exploring the networks, we uncover a connection between the latent space of a variational autoencoder trained using mean-squared-error and the optimal transport distances within the dataset. We then show that optimal transport distances to representative events in the background dataset can be used directly for anomaly detection, with performance comparable to the autoencoders. Whether using autoencoders or optimal transport distances for anomaly detection, we find that the choices that best represent the background are not necessarily best for signal identification. These challenges with unsupervised anomaly detection bolster the case for additional exploration of semi-supervised or alternative approaches.

Presenting Unbinned Differential Cross Section Results
Miguel Arratia, Anja Butter, Mario Campanelli, Vincent Croft, Aishik Ghosh, Dag Gillberg, Kristin Lohwasser, Bogdan Malaescu, Vinicius Mikuni, Benjamin Nachman, Juan Rojo, Jesse Thaler, Ramon Winterhalder
Journal of Instrumentation, 2022, Volume 17 [ arXiv:2109.13243 | code ]

Abstract

Machine learning tools have empowered a qualitatively new way to perform differential cross section measurements whereby the data are unbinned, possibly in many dimensions. Unbinned measurements can enable, improve, or at least simplify comparisons between experiments and with theoretical predictions. Furthermore, many-dimensional measurements can be used to define observables after the measurement instead of before. There is currently no community standard for publishing unbinned data. While there are also essentially no measurements of this type public, unbinned measurements are expected in the near future given recent methodological advances. The purpose of this paper is to propose a scheme for presenting and using unbinned results, which can hopefully form the basis for a community standard to allow for integration into analysis workflows. This is foreseen to be the start of an evolving community dialogue, in order to accommodate future developments in this field that is rapidly evolving.

Neural Conditional Reweighting
Benjamin Nachman, Jesse Thaler
Physical Review D, Volume 105, Article 076015 [ arXiv:2107.08979 ]

Abstract

There is a growing use of neural network classifiers as unbinned, high-dimensional (and variable-dimensional) reweighting functions. To date, the focus has been on marginal reweighting, where a subset of features are used for reweighting while all other features are integrated over. There are some situations, though, where it is preferable to condition on auxiliary features instead of marginalizing over them. In this paper, we introduce neural conditional reweighting, which extends neural marginal reweighting to the conditional case. This approach is particularly relevant in high-energy physics experiments for reweighting detector effects conditioned on particle-level truth information. We leverage a custom loss function that not only allows us to achieve neural conditional reweighting through a single training procedure, but also yields sensible interpolation even in the presence of phase space holes. As a specific example, we apply neural conditional reweighting to the energy response of high-energy jets, which could be used to improve the modeling of physics objects in parametrized fast simulation packages.

The Dark Machines Anomaly Score Challenge: Benchmark Data and Model Independent Event Classification for the Large Hadron Collider
T. Aarrestad, M. Van Beekveld, M. Bona, A. Bovenin, S. Caron, J. Davies, A. De Simone, C. Doglioni, J.M. Duarte, A. Farbin, H. Gupta, L. Hendriks, L. Heinrich, J. Howarth, P. Jawahar, A. Jueid, J. Lastow, A. Leinweber, J. Mamuzic, E. Merényi, A. Morandini, P. Moskvitina, C. Nellist, J. Ngadiuba, B. Ostdiek, M. Pierini, B. Ravina, R. Ruiz de Austri, S. Sekmen, M. Touranakou, M. Vaškevičiūte, R. Vilalta, J.-R. Vlimant, R. Verheyen, M. White, E. Wulff, E. Wallin, K.A. Wozniak, Z. Zhang
SciPost Physics, 2022, Volume 12, Issue 1, Page 43 [ arXiv:2105.14027 | code ]

Abstract

We describe the outcome of a data challenge conducted as part of the Dark Machines initiative and the Les Houches 2019 workshop on Physics at TeV colliders. The challenged aims at detecting signals of new physics at the LHC using unsupervised machine learning algorithms. First, we propose how an anomaly score could be implemented to define model-independent signal regions in LHC searches. We define and describe a large benchmark dataset, consisting of > 1 Billion simulated LHC events corresponding to 10 fb−1 of proton-proton collisions at a center-of-mass energy of 13 TeV. We then review a wide range of anomaly detection and density estimation algorithms, developed in the context of the data challenge, and we measure their performance in a set of realistic analysis environments. We draw a number of useful conclusions that will aid the development of unsupervised new physics searches during the third run of the LHC, and provide our benchmark dataset for future studies at https://www.phenoMLdata.org. Code to reproduce the analysis is provided at https://github.com/bostdiek/DarkMachines-UnsupervisedChallenge.

A reconfigurable neural network ASIC for detector front-end data compression at the HL-LHC
Giuseppe Di Guglielmo, Farah Fahim, Christian Herwig, Manuel Blanco Valentin, Javier Duarte, Cristian Gingu, Philip Harris, James Hirschauer, Martin Kwok, Vladimir Loncar, Yingyi Luo, Llovizna Miranda, Jennifer Ngadiuba, Daniel Noonan, Seda Ogrenci-Memik, Maurizio Pierini, Sioni Summers, Nhan Tran
IEEE Transactions on Nuclear Science, 2021, Vol. 68, Issue 8 [ arXiv:2105.01683 ]

Abstract

Despite advances in the programmable logic capabilities of modern trigger systems, a significant bottleneck remains in the amount of data to be transported from the detector to off-detector logic where trigger decisions are made. We demonstrate that a neural network autoencoder model can be implemented in a radiation tolerant ASIC to perform lossy data compression alleviating the data transmission problem while preserving critical information of the detector energy profile. For our application, we consider the high-granularity calorimeter from the CMS experiment at the CERN Large Hadron Collider. The advantage of the machine learning approach is in the flexibility and configurability of the algorithm. By changing the neural network weights, a unique data compression algorithm can be deployed for each sensor in different detector regions, and changing detector or collider conditions. To meet area, performance, and power constraints, we perform a quantization-aware training to create an optimized neural network hardware implementation. The design is achieved through the use of high-level synthesis tools and the hls4ml framework, and was processed through synthesis and physical layout flows based on a LP CMOS 65 nm technology node. The flow anticipates 200 Mrad of ionizing radiation to select gates, and reports a total area of 3.6 mm^2 and consumes 95 mW of power. The simulated energy consumption per inference is 2.4 nJ. This is the first radiation tolerant on-detector ASIC implementation of a neural network that has been designed for particle physics applications.

Towards Designing and Exploiting Generative Networks for Neutrino Physics Experiments using Liquid Argon Time Projection Chambers
Paul Lutkus, Taritree Wongjirad, Schuchin Aeron
Conference paper at ICLR 2021 [ arXiv:2204.02496 | code ]

Abstract

In this paper, we show that a hybrid approach to generative modeling via combin- ing the decoder from an autoencoder together with an explicit generative model for the latent space is a promising method for producing images of particle tra- jectories in a liquid argon time projection chamber (LArTPC). LArTPCs are a type of particle physics detector used by several current and future experiments focused on studies of the neutrino. We implement a Vector-Quantized Variational Autoencoder (VQ-VAE) and PixelCNN which produces images with LArTPC- like features and introduce a method to evaluate the quality of the images using a semantic segmentation that identifies important physics-based features.

hls4ml: An Open-Source Codesign Workflow to Empower Scientific Low-Power Machine Learning Devices
Farah Fahim, Benjamin Hawks, Christian Herwig, James Hirschauer, Sergo Jindariani, Nhan Tran, Luca P. Carloni, Giuseppe Di Guglielmo, Philip Harris, Jeffrey Krupa, Dylan Rankin, Manuel Blanco Valentin, Josiah Hester, Yingyi Luo, John Mamish, Seda Orgrenci-Memik, Thea Aarrestad, Hamza Javed, Vladimir Loncar, Maurizio Pierini, Adrian Alan Pol, Sioni Summers, Javier Duarte, Scott Hauck, Shih-Chieh Hsu, Jennifer Ngadiuba, Mia Liu, Duc Hoang, Edward Kreinar, Zhenbin Wu
[ arXiv:2103.05579 ]

Abstract

Accessible machine learning algorithms, software, and diagnostic tools for energy-efficient devices and systems are extremely valuable across a broad range of application domains. In scientific domains, real-time near-sensor processing can drastically improve experimental design and accelerate scientific discoveries. To support domain scientists, we have developed hls4ml, an open-source software-hardware codesign workflow to interpret and translate machine learning algorithms for implementation with both FPGA and ASIC technologies. We expand on previous hls4ml work by extending capabilities and techniques towards low-power implementations and increased usability: new Python APIs, quantization-aware pruning, end-to-end FPGA workflows, long pipeline kernels for low power, and new device backends include an ASIC workflow. Taken together, these and continued efforts in hls4ml will arm a new generation of domain scientists with accessible, efficient, and powerful tools for machine-learning-accelerated discovery.

The LHC Olympics 2020: A Community Challenge for Anomaly Detection in High Energy Physics
Gregor Kasieczka (ed), Benjamin Nachman (ed), David Shih (ed), Oz Amram, Anders Andreassen, Kees Benkendorfer, Blaz Bortolato, Gustaaf Brooijmans, Florencia Canelli, Jack H. Collins, Biwei Dai, Felipe F. De Freitas, Barry M. Dillon, Ioan-Mihail Dinu, Zhongtian Dong, Julien Donini, Javier Duarte, D. A. Faroughy, Julia Gonski, Philip Harris, Alan Kahn, Jernej F. Kamenik, Charanjit K. Khosa, Patrick Komiske, Luc Le Pottier, Pablo Martín-Ramiro, Andrej Matevc, Eric Metodiev, Vinicius Mikuni, Inês Ochoa, Sang Eon Park, Maurizio Pierini, Dylan Rankin, Veronica Sanz, Nilai Sarda, Urous Seljak, Aleks Smolkovic, George Stein, Cristina Mantilla Suarez, Manuel Szewc, Jesse Thaler, Steven Tsan, Silviu-Marian Udrescu, Louis Vaslin, Jean-Roch Vlimant, Daniel Williams, Mikaeel Yunus
Reports on Progress in Physics, 2021, Volume 84, Number 12 [ arXiv:2101.08320 ]

Abstract

A new paradigm for data-driven, model-agnostic new physics searches at colliders is emerging, and aims to leverage recent breakthroughs in anomaly detection and machine learning. In order to develop and benchmark new anomaly detection methods within this framework, it is essential to have standard datasets. To this end, we have created the LHC Olympics 2020, a community challenge accompanied by a set of simulated collider events. Participants in these Olympics have developed their methods using an R&D dataset and then tested them on black boxes: datasets with an unknown anomaly (or not). This paper will review the LHC Olympics 2020 challenge, including an overview of the competition, a description of methods deployed in the competition, lessons learned from the experience, and implications for data analyses with future datasets as well as future colliders.

E Pluribus Unum Ex Machina: Learning from Many Collider Events at Once
Benjamin Nachman and Jesse Thaler
Physical Review D, 2021, Vol. 103, Issue 11, Article 116013 [ arXiv:2101.07263 | code ]

Abstract

There have been a number of recent proposals to enhance the performance of machine learning strategies for collider physics by combining many distinct events into a single ensemble feature. To evaluate the efficacy of these proposals, we study the connection between single-event classifiers and multi-event classifiers under the assumption that collider events are independent and identically distributed (IID). We show how one can build optimal multi-event classifiers from single-event classifiers, and we also show how to construct multi-event classifiers such that they produce optimal single-event classifiers. This is illustrated for a Gaussian example as well as for classification tasks relevant for searches and measurements at the Large Hadron Collider. We extend our discussion to regression tasks by showing how they can be phrased in terms of parametrized classifiers. Empirically, we find that training a single-event (per-instance) classifier is more effective than training a multi-event (per-ensemble) classifier, as least for the cases we studied, and we relate this fact to properties of the loss function gradient in the two cases. While we did not identify a clear benefit from using multi-event classifiers in the collider context, we speculate on the potential value of these methods in cases involving only approximate independence, as relevant for jet substructure studies.

Fast convolutional neural networks on FPGAs with hls4ml
Thea Aarrestad, Vladimir Loncar, Nicolò Ghielmetti, Maurizio Pierini, Sioni Summers, Jennifer Ngadiuba, Christoffer Petersson, Hampus Linander, Yutaro Iiyama, Giuseppe Di Guglielmo, Javier Duarte, Philip Harris, Dylan Rankin, Sergo Jindariani, Kevin Pedro, Nhan Tran, Mia Liu, Edward Kreinar, Zhenbin Wu, Duc Hoang
Machine Learning Science and Technology, 2021, Volume 2, Issue 4, Article 045015 [ arXiv:2101.05108 | code ]

Abstract

We introduce an automated tool for deploying ultra low-latency, low-power deep neural networks with convolutional layers on FPGAs. By extending the hls4ml library, we demonstrate an inference latency of 5μs using convolutional architectures, targeting microsecond latency applications like those at the CERN Large Hadron Collider. Considering benchmark models trained on the Street View House Numbers Dataset, we demonstrate various methods for model compression in order to fit the computational constraints of a typical FPGA device used in trigger and data acquisition systems of particle detectors. In particular, we discuss pruning and quantization-aware training, and demonstrate how resource utilization can be significantly reduced with little to no loss in model accuracy. We show that the FPGA critical resource consumption can be reduced by 97% with zero loss in model accuracy, and by 99% when tolerating a 6% accuracy degradation.

Quasi Anomalous Knowledge: Searching for new physics with embedded knowledge
Sang Eon Park, Dylan Rankin, Silviu-Marian Udrescu, Mikaeel Yunus, Philip Harris
Journal of High Energy Physics, 2021, Article 30 [ arXiv:2011.03550 | code ]

Abstract

Discoveries of new phenomena often involve a dedicated search for a hypothetical physics signature. Recently, novel deep learning techniques have emerged for anomaly detection in the absence of a signal prior. However, by ignoring signal priors, the sensitivity of these approaches is significantly reduced. We present a new strategy dubbed Quasi Anomalous Knowledge (QUAK), whereby we introduce alternative signal priors that capture some of the salient features of new physics signatures, allowing for the recovery of sensitivity even when the alternative signal is incorrect. This approach can be applied to a broad range of physics models and neural network architectures. In this paper, we apply QUAK to anomaly detection of new physics events at the CERN Large Hadron Collider utilizing variational autoencoders with normalizing flow.

Mapping Machine-Learned Physics into a Human-Readable Space
Taylor Faucett, Jesse Thaler, Daniel Whiteson
Physics Review D, 2021, Volume 103, Iss. 3 [ arXiv:2010.11998 ]

Abstract

We present a technique for translating a black-box machine-learned classifier operating on a high-dimensional input space into a small set of human-interpretable observables that can be combined to make the same classification decisions. We iteratively select these observables from a large space of high-level discriminants by finding those with the highest decision similarity relative to the black box, quantified via a metric we introduce that evaluates the relative ordering of pairs of inputs. Successive iterations focus only on the subset of input pairs that are misordered by the current set of observables. This method enables simplification of the machine-learning strategy, interpretation of the results in terms of well-understood physical concepts, validation of the physical model, and the potential for new insights into the nature of the problem itself. As a demonstration, we apply our approach to the benchmark task of jet classification in collider physics, where a convolutional neural network acting on calorimeter jet images outperforms a set of six well-known jet substructure observables. Our method maps the convolutional neural network into a set of observables called energy flow polynomials, and it closes the performance gap by identifying a class of observables with an interesting physical interpretation that has been previously overlooked in the jet substructure literature.

Enhancing searches for resonances with machine learning and moment decomposition
Ouail Kitouni, Benjamin Nachman, Constantin Weisser, and Mike Williams
Journal of High Energy Physics, 2021, Article 70 [ arXiv:2010.09745 | code ]

Abstract

A key challenge in searches for resonant new physics is that classifiers trained to enhance potential signals must not induce localized structures. Such structures could result in a false signal when the background is estimated from data using sideband methods. A variety of techniques have been developed to construct classifiers which are independent from the resonant feature (often a mass). Such strategies are sufficient to avoid localized structures, but are not necessary. We develop a new set of tools using a novel moment loss function (Moment Decomposition or MoDe) which relax the assumption of independence without creating structures in the background. By allowing classifiers to be more flexible, we enhance the sensitivity to new physics without compromising the fidelity of the background estimation.

Search for low mass vector resonances decaying into quark-antiquark pairs in proton-proton collisions at √𝑠 =13 TeV
CMS Collaboration
Physical Review D, 2019, Volume 100, Issue 11 [ arXiv:1909.04114 ]

Abstract

A search for low mass narrow vector resonances decaying into quark-antiquark pairs is presented. The analysis is based on data collected in 2017 with the CMS detector at the LHC in proton-proton collisions at a center-of-mass energy of 13 TeV, corresponding to an integrated luminosity of 41.1 fb−1. The results of this analysis are combined with those of an earlier analysis based on data collected at the same collision energy in 2016, corresponding to 35.9 fb−1. Signal candidates will be recoiling against initial state radiation and are identified as energetic, large-radius jets with two pronged substructure. The invariant jet mass spectrum is probed for a potential narrow peaking signal over a smoothly falling background. No evidence for such resonances is observed within the mass range of 50-450 GeV. Upper limits at the 95% confidence level are set on the coupling of narrow resonances to quarks, as a function of the resonance mass. For masses between 50 and 300 GeV these are the most sensitive limits to date. This analysis extends the earlier search to a mass range of 300-450 GeV, which is probed for the first time with jet substructure techniques.