arXiv Feed · Tim's Coding Blog

Recent papers

Generated automatically by GitHub Actions

A complexity phase transition at the EPR Hamiltonian

Kunal Marwaha, James Sud • 2026-04-14

We study the computational complexity of 2-local Hamiltonian problems generated by a positive-weight symmetric interaction term, encompassing many canonical problems in statistical mechanics and optimization. We show these problems belong to one of three complexity phases: QMA-complete, StoqMA-complete, and reducible to a new problem we call EPR*. The phases are physically interpretable, corresponding to the energy level ordering of the local term. The EPR* problem is a simple generalization of the EPR problem of King. Inspired by empirically efficient algorithms for EPR, we conjecture that EPR* is in BPP. If true, this would complete the complexity classification of these problems, and imply EPR* is the transition point between easy and hard local Hamiltonians. Our proofs rely on perturbative gadgets. One simple gadget, when recursed, induces a renormalization-group-like flow on the space of local interaction terms. This gives the correct complexity picture, but does not run in polynomial time. To overcome this, we design a gadget based on a large spin chain, which we analyze via the Jordan-Wigner transformation.

Classical and Quantum Speedups for Non-Convex Optimization via Energy Conserving Descent

Yihang Sun, Huaijin Wang, Patrick Hayden, Jose Blanchet • 2026-04-14

The Energy Conserving Descent (ECD) algorithm was recently proposed (De Luca & Silverstein, 2022) as a global non-convex optimization method. Unlike gradient descent, appropriately configured ECD dynamics escape strict local minima and converge to a global minimum, making it appealing for machine learning optimization. We present the first analytical study of ECD, focusing on the one-dimensional setting for this first installment. We formalize a stochastic ECD dynamics (sECD) with energy-preserving noise, as well as a quantum analog of the ECD Hamiltonian (qECD), providing the foundation for a quantum algorithm through Hamiltonian simulation. For positive double-well objectives, we compute the expected hitting time from a local to the global minimum. We prove that both sECD and qECD yield exponential speedup over respective gradient descent baselines--stochastic gradient descent and its quantization. For objectives with tall barriers, qECD achieves a further speedup over sECD.

Nonparametric efficient inference for network quantile causal effects under partial interference

Chao Cheng, Fan Li • 2026-04-14

Interference arises when the treatment assigned to one individual affects the outcomes of other individuals. Commonly, individuals are naturally grouped into clusters, and interference occurs only among individuals within the same cluster, a setting referred to as partial interference. We study network causal effects on outcome quantiles in the presence of partial interference. We develop a general nonparametric efficiency theory for estimating these network quantile causal effects, which leads to a nonparametrically efficient estimator. The proposed estimator is consistent and asymptotically normal with parametric convergence rates, while allowing for flexible, data-adaptive estimation of complex nuisance functions. We leverage a three-way cross-fitting procedure that avoids direct estimation of the conditional outcome distribution. Simulations demonstrate adequate finite-sample performance of the proposed estimators, and we apply the methods to a clustered observational study.

Cosmologically viable non-polynomial quasi-topological gravity: explicit models, $Λ$CDM limit and observational constraints

Emmanuel N. Saridakis • 2026-04-14

We investigate the cosmological implications of non-polynomial quasi-topological gravity (NPQTG), a novel class of modified gravitational theories in which the background dynamics is encoded in a single function of the Hubble parameter. This framework provides a minimal and theoretically consistent extension of general relativity, incorporating higher-curvature effects while preserving second-order field equations and avoiding higher-derivative instabilities. We first establish the general conditions for cosmological viability and construct explicit realizations, including polynomial, quartic, power-law and non-polynomial models, demonstrating how different functional forms lead to distinct expansion histories. Focusing on the quartic and power-law cases, we show that the resulting cosmological evolution reproduces the standard thermal history of the Universe and gives rise to an effective dark-energy sector of geometric origin, with dynamical equation-of-state behavior that can lie in the quintessence or phantom regime. We then confront the models with observational data from Type Ia Supernovae, Cosmic Chronometers, and Baryon Acoustic Oscillations, using a Bayesian MCMC analysis. We find that both models provide an excellent fit to the data, remaining fully compatible with current constraints and statistically competitive with $Λ$CDM. Our results demonstrate that NPQTG offers a simple and efficient framework for describing late-time cosmic acceleration with dynamical dark energy, while maintaining theoretical consistency and observational viability.

Causal Diffusion Models for Counterfactual Outcome Distributions in Longitudinal Data

Farbod Alinezhad, Jianfei Cao, Gary J. Young, Brady Post • 2026-04-14

Predicting counterfactual outcomes in longitudinal data, where sequential treatment decisions heavily depend on evolving patient states, is critical yet notoriously challenging due to complex time-dependent confounding and inadequate uncertainty quantification in existing methods. We introduce the Causal Diffusion Model (CDM), the first denoising diffusion probabilistic approach explicitly designed to generate full probabilistic distributions of counterfactual outcomes under sequential interventions. CDM employs a novel residual denoising architecture with relational self-attention, capturing intricate temporal dependencies and multimodal outcome trajectories without requiring explicit adjustments (e.g., inverse-probability weighting or adversarial balancing) for confounding. In rigorous evaluation on a pharmacokinetic-pharmacodynamic tumor-growth simulator widely adopted in prior work, CDM consistently outperforms state-of-the-art longitudinal causal inference methods, achieving a 15-30% relative improvement in distributional accuracy (1-Wasserstein distance) while maintaining competitive or superior point-estimate accuracy (RMSE) under high-confounding regimes. By unifying uncertainty quantification and robust counterfactual prediction in complex, sequentially confounded settings, without tailored deconfounding, CDM offers a flexible, high-impact tool for decision support in medicine, policy evaluation, and other longitudinal domains.

On causal inference with marked point process data

Pål Christie Ryalen, Mats Julius Stensrud, Kjetil Røysland • 2026-04-14

We define dynamic treatment regimes and associated potential outcomes for data described by marked point processes (MPPs). These definitions motivate MPP analogues of the commonly used consistency, exchangeability, and positivity conditions that are sufficient for identifying effects in MPP data structures. The conditions are formulated based on martingale theory, which allows us to derive explicit identifying assumptions for data described by stochastic processes. The definitions and conditions align with well-established discrete-time results in important special cases. Thus, this work bridges the large literatures on survival (event history) analysis with counting processes in continuous time and causal inference with variables in discrete time. After formulating a set of identification conditions, we derive and characterize marginal g-formulas. The g-formulas are generally different from those studied in related works, though they coincide in important special cases. We relate our findings to previous work on causal inference with (counting) processes, the classical survival literature, and the discrete-time causal inference literature.

An Engineering Journey Training Large Language Models at Scale on Alps: The Apertus Experience

Jonathan Coles, Stefano Schuppli, Lukas Drescher, Fawzi Roberto Mohamed, Elia Palme, Henrique Mendonça, Miguel Gila, Mark Klein, Maxime Martinasso, Joost VandeVondele, Torsten Hoefler, Thomas Schulthess, Josh Romero, Igor Gorodetsky, Ryan Hankins, Isa Wazirzada, Martin Jaggi, Antoine Bosselut, Imanol Schlag, Antoni-Joan Solergibert i Llaquet, Alejandro Hernández Cano, Theofilos Ioannis Manitaras, Nicholas John Browning • 2026-04-14

Large Language Models (LLMs) have surged as a transformative technology for science and society, prompting governments worldwide to pursue sovereign AI capabilities that ensure data compliance and cultural representation. However, the associated capital costs and engineering complexity required to train these models have largely restricted such capabilities to the private sector, leaving a significant gap for public institutions. This paper details the engineering journey behind training Apertus, a fully open multilingual foundation model, on the Alps supercomputer. Representing a first-of-its-kind achievement for academia at the 70B parameter scale, we successfully deployed a massive pre-training campaign on one of Europe's largest systems for open science, powered by NVIDIA GH200 Grace Hopper Superchips. We detail the challenges encountered in readying HPC infrastructure for training AI models, from overcoming storage bottlenecks to stabilizing large-scale interconnects, and the lessons learned in transforming a supercomputer into a resilient software-defined Machine Learning Platform. Finally, we discuss the post-training requirements and evolution of our Machine Learning platform, outlining how this initial release lays the groundwork for a sustained, iterative operational capability, in particular for fine tuning foundation models, that extends well beyond a single model training run.

Cycle-Consistent Search: Question Reconstructability as a Proxy Reward for Search Agent Training

Sohyun An, Shuibenyang Yuan, Hayeon Lee, Cho-Jui Hsieh, Alexander Min • 2026-04-14

Reinforcement Learning (RL) has shown strong potential for optimizing search agents in complex information retrieval tasks. However, existing approaches predominantly rely on gold supervision, such as ground-truth answers, which is difficult to scale. To address this limitation, we propose Cycle-Consistent Search (CCS), a gold-supervision-free framework for training search agents, inspired by cycle-consistency techniques from unsupervised machine translation and image-to-image translation. Our key hypothesis is that an optimal search trajectory, unlike insufficient or irrelevant ones, serves as a lossless encoding of the question's intent. Consequently, a high-quality trajectory should preserve the information required to accurately reconstruct the original question, thereby inducing a reward signal for policy optimization. However, naive cycle-consistency objectives are vulnerable to information leakage, as reconstruction may rely on superficial lexical cues rather than the underlying search process. To reduce this effect, we apply information bottlenecks, including exclusion of the final response and named entity recognition (NER) masking of search queries. These constraints force reconstruction to rely on retrieved observations together with the structural scaffold, ensuring that the resulting reward signal reflects informational adequacy rather than linguistic redundancy. Experiments on question-answering benchmarks show that CCS achieves performance comparable to supervised baselines while outperforming prior methods that do not rely on gold supervision. These results suggest that CCS provides a scalable training paradigm for training search agents in settings where gold supervision is unavailable.

Improving Network Clock Synchronization by Marking Congestion

Yash Deshpande, Quirin Vogel, Laura Becker, Kaan Aykurt, Wolfgang Kellerer • 2026-04-14

Achieving consistent time across devices in distributed systems often involves exchanging timestamped messages over a network. Precise time synchronization is crucial for applications such as cellular networks, industrial automation, and transactional databases. However, delay variation in synchronization packets, often caused by congestion from competing traffic, degrades synchronization accuracy. Detecting whether a packet experienced congestion can help improve synchronization through filtering and statistical methods. We propose an in-network congestion indication and filtering mechanism for synchronization messages used in protocols such as the Network Time Protocol (NTP) and Precision Time Protocol (PTP). Network devices mark packets that experienced queuing, allowing clocks to correct errors caused by varying delays. Our approach requires only simple changes at switches or routers, avoiding deep packet inspection or protocol modifications. The method is backward compatible, using standard but currently unused fields in IP, PTP, or NTP headers. We implement our method on a Tofino P4 target and demonstrate an improvement of over 80% in synchronization performance over a single hop. Moreover, we show that the performance of traditional statistical filters, such as min-RTT and median-delay, is improved by 90% over the one-hop hardware setup. We further demonstrate the effectiveness of our proposed method across multiple hops, both analytically and through simulation. Congestion marking improves the root-mean-squared clock offset estimation error by 30% to 80%, depending on network conditions and filtering techniques.
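To make the filtering idea concrete, here is a minimal sketch that assumes the standard NTP four-timestamp offset and delay formulas. The `marked` field is a hypothetical stand-in for whatever header bit a switch would set; the abstract does not say which field is used, and this is not the authors' implementation.

```python
from dataclasses import dataclass
from statistics import fmean


@dataclass
class Exchange:
    t1: float  # request sent (client clock)
    t2: float  # request received (server clock)
    t3: float  # reply sent (server clock)
    t4: float  # reply received (client clock)
    marked: bool  # hypothetical flag: some device queued this exchange


def offset_and_delay(e: Exchange) -> tuple[float, float]:
    """Standard NTP estimates of clock offset and round-trip delay."""
    offset = ((e.t2 - e.t1) + (e.t3 - e.t4)) / 2.0
    delay = (e.t4 - e.t1) - (e.t3 - e.t2)
    return offset, delay


def estimate_offset(exchanges: list[Exchange], use_marks: bool = True) -> float:
    """Average the offset estimates, optionally dropping marked exchanges."""
    kept = [e for e in exchanges if not (use_marks and e.marked)]
    if not kept:  # everything was marked: fall back to all samples
        kept = list(exchanges)
    return fmean(offset_and_delay(e)[0] for e in kept)


# Toy data: the server clock is 5 units ahead; the second exchange has
# 2 extra units of queuing on the forward path and is marked.
samples = [
    Exchange(0.0, 6.0, 6.0, 2.0, marked=False),
    Exchange(10.0, 18.0, 18.0, 14.0, marked=True),
    Exchange(20.0, 26.0, 26.0, 22.0, marked=False),
]
print(estimate_offset(samples, use_marks=False))
print(estimate_offset(samples, use_marks=True))
```

In this toy data the asymmetric queuing delay biases the unfiltered average, while dropping the marked exchange recovers the true offset.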

Output-Feedback Safe Control of Discrete-Time Stochastic Systems with Chance Constraints

Jianing Zhao, Zhuoting Cai, Xiang Yin • 2026-04-14

In this paper, we investigate the safety-critical control problem for discrete-time stochastic systems with incomplete information, where safety constraints must be enforced using state estimates obtained from noisy measurements. We develop an output-feedback control barrier function (CBF) framework based on an expectation-based discrete-time barrier condition that explicitly incorporates estimation uncertainty through the evolving belief over the state. To enable real-time implementation, we derive deterministic sufficient conditions that conservatively enforce the expectation-based CBF by bounding the expectation with computable functions of the belief statistics using Jensen inequalities. The resulting safety filter is formulated as a tractable optimization problem compatible with standard online controllers. Numerical simulations demonstrate that the proposed output-feedback approach achieves fast online computation while providing reliable safety performance in the presence of process noise and measurement uncertainty.

The Verification Tax: Fundamental Limits of AI Auditing in the Rare-Error Regime

Jason Z Wang • 2026-04-14

The most cited calibration result in deep learning -- post-temperature-scaling ECE of 0.012 on CIFAR-100 (Guo et al., 2017) -- is below the statistical noise floor. We prove this is not a failure of the experiment but a law: the minimax rate for estimating calibration error with model error rate $\epsilon$ is $\Theta((L\epsilon/m)^{1/3})$, and no estimator can beat it. This "verification tax" implies that as AI models improve, verifying their calibration becomes fundamentally harder -- with the same exponent in opposite directions. We establish four results that contradict standard evaluation practice: (1) self-evaluation without labels provides exactly zero information about calibration, bounded by a constant independent of compute; (2) a sharp phase transition at $m\epsilon \approx 1$ below which miscalibration is undetectable; (3) active querying eliminates the Lipschitz constant, collapsing estimation to detection; (4) verification cost grows exponentially with pipeline depth at rate $L^K$. We validate across five benchmarks (MMLU, TruthfulQA, ARC-Challenge, HellaSwag, WinoGrande; ~27,000 items) with 6 LLMs from 5 families (8B-405B parameters, 27 benchmark-model pairs with logprob-based confidence), 95% bootstrap CIs, and permutation tests. Self-evaluation non-significance holds in 80% of pairs. Across frontier models, 23% of pairwise comparisons are indistinguishable from noise, implying that credible calibration claims must report verification floors and prioritize active querying once gains approach benchmark resolution.
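For readers unfamiliar with the quantity being estimated, the following is a minimal sketch of the usual equal-width binned ECE estimator. The binning scheme and the function name are assumptions made for illustration; the abstract does not specify which estimator the paper analyzes.

```python
def expected_calibration_error(confidences, correct, n_bins=10):
    """Binned ECE: weighted mean of |accuracy - mean confidence| per bin.

    confidences: predicted probabilities in [0, 1] for the chosen answer
    correct:     1 if the prediction was right, 0 otherwise
    """
    n = len(confidences)
    bins = [[] for _ in range(n_bins)]
    for conf, hit in zip(confidences, correct):
        # A confidence of exactly 1.0 goes into the last bin.
        index = min(int(conf * n_bins), n_bins - 1)
        bins[index].append((conf, hit))

    ece = 0.0
    for members in bins:
        if not members:
            continue
        mean_conf = sum(c for c, _ in members) / len(members)
        accuracy = sum(h for _, h in members) / len(members)
        ece += (len(members) / n) * abs(accuracy - mean_conf)
    return ece


# Overconfident toy model: claims 95% but is right 3 times out of 4.
print(expected_calibration_error([0.95] * 4, [1, 1, 1, 0]))
```

With only a handful of errors per bin, the per-bin accuracy is itself noisy, which is the regime the abstract's noise-floor argument is about.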

Direct Discrepancy Replay: Distribution-Discrepancy Condensation and Manifold-Consistent Replay for Continual Face Forgery Detection

Tianshuo Zhang, Haoyuan Zhang, Siran Peng, Weisong Zhao, Xiangyu Zhu, Zhen Lei • 2026-04-14

Continual face forgery detection (CFFD) requires detectors to learn emerging forgery paradigms without forgetting previously seen manipulations. Existing CFFD methods commonly rely on replaying a small amount of past data to mitigate forgetting. Such replay is typically implemented either by storing a few historical samples or by synthesizing pseudo-forgeries from detector-dependent perturbations. Under strict memory budgets, the former cannot adequately cover diverse forgery cues and may expose facial identities, while the latter remains strongly tied to past decision boundaries. We argue that the core role of replay in CFFD is to reinstate the distributions of previous forgery tasks during subsequent training. To this end, we directly condense the discrepancy between real and fake distributions and leverage real faces from the current stage to perform distribution-level replay. Specifically, we introduce Distribution-Discrepancy Condensation (DDC), which models the real-to-fake discrepancy via a surrogate factorization in characteristic-function space and condenses it into a tiny bank of distribution discrepancy maps. We further propose Manifold-Consistent Replay (MCR), which synthesizes replay samples through variance-preserving composition of these maps with current-stage real faces, yielding samples that reflect previous-task forgery cues while remaining compatible with current real-face statistics. Operating under an extremely small memory budget and without directly storing raw historical face images, our framework consistently outperforms prior CFFD baselines and significantly mitigates catastrophic forgetting. Replay-level privacy analysis further suggests reduced identity leakage risk relative to selection-based replay.

Distributional Convergence of Empirical Entropic Optimal Transport and Statistical Applications

Santiago Arenas-Velilla, Axel Munk, Luis-Alberto Rodríguez • 2026-04-14

Recently, the statistical properties of empirical Entropic Optimal Transport (EOT) have attracted great interest, as this quantity has been shown to be useful for complex data analysis, among other reasons due to its computational efficiency. In several applications, it has been observed that the EOT plan provides valuable information beyond just the optimal value. For example, in cell biology, colocalization analysis based on the EOT plan has been introduced as a measure for quantification of spatial proximity of different protein assemblies. Despite recent progress in the analysis of its risk properties, a precise understanding of its statistical fluctuations to make it accessible for inference remains elusive to a large extent. In this paper, we derive an asymptotic weak convergence result for a large class of functionals of the EOT plan, in which the colocalization process is included. The proof is based on Hadamard differentiability and the extended delta method. As an application, we obtain uniform confidence bands for colocalization curves and bootstrap consistency. Our theory is supported by simulation studies and is illustrated by real-world data analysis from mitochondrial protein colocalization.
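As a rough illustration of the object under study, the sketch below computes an EOT plan between two small discrete distributions with Sinkhorn iterations. The `colocalization` helper reflects one plausible reading of a colocalization curve, namely the plan mass transported over a cost of at most `t`; both function names and that definition are assumptions, not taken from the paper.

```python
import math


def sinkhorn_plan(a, b, cost, eps=0.5, n_iter=500):
    """Entropic OT plan between weight vectors a and b via Sinkhorn scaling."""
    n, m = len(a), len(b)
    kernel = [[math.exp(-cost[i][j] / eps) for j in range(m)] for i in range(n)]
    u = [1.0] * n
    v = [1.0] * m
    for _ in range(n_iter):
        # Alternately rescale rows and columns to match the marginals.
        u = [a[i] / sum(kernel[i][j] * v[j] for j in range(m)) for i in range(n)]
        v = [b[j] / sum(kernel[i][j] * u[i] for i in range(n)) for j in range(m)]
    return [[u[i] * kernel[i][j] * v[j] for j in range(m)] for i in range(n)]


def colocalization(plan, cost, t):
    """Assumed definition: plan mass moved over a cost of at most t."""
    return sum(
        plan[i][j]
        for i in range(len(plan))
        for j in range(len(plan[0]))
        if cost[i][j] <= t
    )


a = [0.5, 0.5]
b = [0.25, 0.75]
cost = [[0.0, 1.0], [1.0, 0.0]]
plan = sinkhorn_plan(a, b, cost)
print(plan)
print(colocalization(plan, cost, 0.0), colocalization(plan, cost, 1.0))
```

Evaluating `colocalization` over a grid of thresholds `t` traces out a curve; the paper's results concern the sampling fluctuations of such functionals when the plan is computed from empirical data.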

A Wearable ECG Device for Differentiating Hypertrophic Cardiomyopathy from Acquired Left Ventricular Hypertrophy

Jiachen Li, Hanyu Zhu, Edward Kim, Shihao Li, Katherine Cavanaugh, Arpan Patel, Sovik De Sirkar, Mauricio Hong, Wei Li, Dongmei Chen • 2026-04-14

Hypertrophic Cardiomyopathy (HCM) is a genetic heart disease affecting approximately 1 in 500 people and is the leading cause of sudden cardiac death in young athletes. Current diagnostic methods -- cardiovascular magnetic resonance (CMR), echocardiography, and genetic testing -- are limited by high costs, operator dependency, or insufficient accuracy, while standard electrocardiogram (ECG) analysis cannot reliably distinguish HCM from acquired left ventricular hypertrophy (LVH). This paper presents a wearable ECG device paired with a classification algorithm that differentiates HCM from acquired LVH using ECG signals alone. The portable device integrates a 3-lead electrode system, an AD8232 signal conditioning module, an Arduino Nano 33 BLE microcontroller, and a lithium polymer battery. The algorithm extracts two quantitative indices -- HCM Index 1 and HCM Index 2 -- from each heartbeat and classifies patients via dual statistical thresholds. Validation on 483 LVH patients (PhysioNet) and 29 HCM patients (digitized clinical records) yields 75.86% sensitivity, 99.17% specificity, and an F1-score of 80.00%. Leave-one-out cross-validation confirms generalizability, with cross-validated sensitivity of 72.41%, specificity of 98.96%, and F1-score of 76.36% (95% confidence intervals reported). A digitization confound analysis demonstrates that the classification is driven by physiological cardiac features rather than data source artifacts. A simulated device acquisition chain analysis confirms that the wearable hardware's signal characteristics are compatible with the classification algorithm. The system offers a promising tool for affordable HCM screening in resource-limited settings.
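The reported percentages can be sanity-checked with a short calculation. The confusion-matrix counts below are not stated in the abstract; they are inferred as the integer counts that reproduce the reported figures for 29 HCM and 483 LVH patients, so treat them as an assumption.

```python
def screening_metrics(tp, fn, tn, fp):
    """Sensitivity, specificity, and F1 for a binary screening test."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    f1 = 2 * tp / (2 * tp + fp + fn)
    return sensitivity, specificity, f1


# Inferred counts (assumption): 22 of 29 HCM detected, 4 of 483 LVH flagged.
full = screening_metrics(tp=22, fn=7, tn=479, fp=4)
# Inferred counts for the leave-one-out run (assumption): 21 of 29, 5 of 483.
loocv = screening_metrics(tp=21, fn=8, tn=478, fp=5)

for name, values in (("full", full), ("leave-one-out", loocv)):
    print(name, [f"{100 * v:.2f}%" for v in values])
```

Under these inferred counts, the gap between the two rows corresponds to one additional missed HCM patient and one additional false positive.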

Turbulent pair dispersion with Stochastic Generative Diffusion Models

Andrei Pantea, Luca Biferale, Michele Buzzicotti, Guillaume Charpiat, Sergio Chibbaro, Tianyi Li • 2026-04-14

Recent advances in data-driven modeling have shown that diffusion models can successfully generate synthetic Lagrangian trajectories in turbulent flows. Building on this progress, we extend the method to the joint generation of pairs of Lagrangian velocity trajectories, enabling a fully data-driven representation of turbulent pair dispersion, a long-standing fundamental problem with broad relevance in fluid dynamics. We demonstrate that diffusion models accurately reproduce the evolution of particle-pair separation, including deviations from Richardson's classical scaling law, while simultaneously preserving all key single-particle statistical properties reported in previous studies. These findings underscore the potential of diffusion-based generative models to emulate high-dimensional, multi-scale turbulent dynamics, further establishing them as a powerful tool for scientific modeling and for future geophysical and astrophysical applications.

Joint Clustering and Prediction of the Quality of Service in Vehicular Cellular Networks

Oscar Stenhammar, Gábor Fodor, Carlo Fischione • 2026-04-14

Machine learning models are increasingly deployed in wireless networks with stringent performance requirements. However, dynamic propagation environments and fluctuating traffic densities introduce concept drift, which complicates the ability to maintain accurate predictive machine learning models. We propose a distributed optimization framework that jointly clusters cells and trains cluster-level predictive models, enabling nodes to cooperatively predict quality of service (QoS) distributions under communication constraints. The proposed method models QoS as a multivariate Gaussian/lognormal distribution and uses a novel clustering mechanism that groups cells with similar network conditions, allowing each cell to select the most appropriate predictor without retraining new models for each cell. By leveraging block coordinate descent, our solution efficiently clusters the cells and updates the predictive models to mitigate concept drift, while maintaining a compact model set to minimize computation overhead. Evaluation using data from realistic simulations with the Sionna ray-tracer and the ns-3 simulator shows that the method converges and yields cluster constellations that adapt to changes in the network that cause concept drift. The experimental evaluation focuses on predicting the distribution of latency, jitter, and RSRP over a one-hour prediction horizon. The proposed method significantly outperforms the traditional single global predictive model approach and reduces the mean absolute error by 9-27% compared to local cell-level predictors. This demonstrates that the proposed method effectively captures local variability using far fewer models through scalable distributed clustering.

Advancing Network Digital Twin Framework for Generating Realistic Datasets

Oscar Stenhammar, Sundeep Rangan, Gábor Fodor, Carlo Fischione • 2026-04-14

The integration of accurate and reproducible wireless network simulations is a key enabler for research on open, virtualized, and intelligent communication systems. Network Digital Twins (NDTs) provide a scalable alternative to costly and time-consuming measurement campaigns, while enabling controlled experimentation and data generation for data-driven network design. In this paper, we present an open and user-friendly NDT framework that integrates controllable vehicular mobility with the site-specific ray tracer Sionna and the discrete-event ns-3 network simulator, enabling virtualized end-to-end modeling of wireless networks across the radio, network, and application layers. The proposed framework is particularly well-suited for dynamic vehicular networks and urban deployments, supporting realistic mobility, traffic dynamics, and the extraction of cross-layer metrics. To promote open-source initiatives, we release both the NDT implementation and a representative dataset generated from realistic vehicular and urban scenarios. The framework and dataset facilitate reproducible experimentation and benchmarking of machine learning-based quality of service prediction, network optimization, and intelligent network management algorithms, lowering the entry barrier for research on virtual and open wireless network services.

A Causal Framework for Evaluating Jointly Longitudinal Outcomes and Surrogate Markers: A State-Space Approach

Silvaneo V. dos Santos, Layla Parast • 2026-04-14

Surrogate markers offer the potential to reduce the burden of data collection by replacing costly or invasive primary outcomes with more accessible measurements, provided that they can faithfully indicate the effectiveness of a treatment. However, appropriate evaluation of a surrogate is particularly complex in longitudinal studies, where both outcomes and surrogates can evolve dynamically over time and interest lies not only in the treatment effect at one time, but rather treatment effects that may vary along the entire trajectory. In this paper, we develop a statistical framework for surrogate evaluation when both the surrogate and primary outcome are measured over time. Specifically, within the potential outcomes framework, we propose a formal causal definition of the proportion of the treatment effect on the longitudinal primary outcome that is explained by the treatment effect on the longitudinal surrogate. For estimation, we leverage state-space models, together with the Kalman filter and smoother, enabling efficient estimation of treatment effects under realistic conditions of temporal evolution and patient-level variability. We introduce a nonparametric bootstrap strategy for state-space models, a temporal homogeneity test, and demonstrate the finite-sample performance of our proposed methods via a simulation study and application to a diabetes clinical trial.
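As background on the estimation machinery, here is a minimal Kalman filter for the simplest state-space model, a scalar local-level model with state x_t = x_{t-1} + w_t and observation y_t = x_t + v_t. The model choice, function name, and parameters are illustrative assumptions; the paper's models and its smoother step are not reproduced here.

```python
def kalman_filter_local_level(observations, q, r, m0=0.0, p0=1e6):
    """Filter a scalar local-level model.

    q: variance of the state noise w_t
    r: variance of the observation noise v_t
    m0, p0: prior mean and variance (a large p0 acts as a vague prior)
    Returns the filtered means and variances at each time step.
    """
    mean, var = m0, p0
    means, variances = [], []
    for y in observations:
        # Predict: the state follows a random walk, so only variance grows.
        var = var + q
        # Update: blend the prediction and the new observation.
        gain = var / (var + r)
        mean = mean + gain * (y - mean)
        var = (1.0 - gain) * var
        means.append(mean)
        variances.append(var)
    return means, variances


means, variances = kalman_filter_local_level([2.0] * 20, q=0.01, r=1.0)
print(means[-1], variances[-1])
```

The filtered variance settles to a steady value that trades off `q` against `r`, which is what makes repeated noisy measurements informative about a slowly evolving trajectory.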

Four Decades of Digital Waveguides

Pablo Tablas de Paula, Julius O. Smith, Vesa Välimäki, Joshua D. Reiss • 2026-04-14

Digital waveguide physical modeling offers efficient simulation of acoustic wave propagation as compared to general finite-difference schemes commonly used in computational physics. This efficiency has enabled the real-time implementation of physically modeled musical instruments and sound effects, as well as real-time vocal models and artificial reverberation. This paper provides an overview of the historical evolution and applications of digital waveguide modeling and highlights recent advances in the field. Parametric optimization using classical, evolutionary and neural approaches is also discussed and compared. Digital waveguides provide physically accurate simulations with reduced computational cost, and can now be optimized with modern machine learning and differentiable digital signal processing techniques.
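A sense of why these models are cheap comes from the Karplus-Strong plucked string, commonly described as a very simple special case of a digital waveguide: one delay line with a two-point averaging filter in the feedback loop. The sketch below is an illustrative assumption about how to show the idea and is not taken from the paper; the parameter names are hypothetical.

```python
import random


def karplus_strong(frequency_hz, duration_s, sample_rate=8000,
                   damping=0.996, seed=0):
    """Synthesize a plucked-string tone with a single delay-line loop."""
    rng = random.Random(seed)
    period = int(sample_rate / frequency_hz)  # delay-line length in samples
    # The "pluck": fill the delay line with noise in [-1, 1].
    line = [rng.uniform(-1.0, 1.0) for _ in range(period)]
    output = []
    for n in range(int(duration_s * sample_rate)):
        current = line[n % period]
        following = line[(n + 1) % period]
        output.append(current)
        # Loop filter: average two neighbors and apply a small loss.
        line[n % period] = damping * 0.5 * (current + following)
    return output


tone = karplus_strong(frequency_hz=200, duration_s=1.0)
print(len(tone), max(abs(s) for s in tone))
```

Each output sample costs one addition and two multiplications regardless of string length, whereas a finite-difference scheme would update every spatial grid point per time step.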

Fidelity of Machine Learned Potentials: Quantitative Assessment for Protonated Oxalate

Chen Qu, Paul L. Houston, Qi Yu, Apurba Nandi, Joel M. Bowman, Valerii Andreichev, Silvan Käser, Markus Meuwly • 2026-04-14

There has been a veritable explosion of methods and software to perform machine-learned regression on datasets of electronic energies and forces to develop high-dimensional machine learned potential energy surfaces (ML-PESs). A major but not deeply studied aspect is how well different ML-PESs represent the same dataset on which they are trained, beyond the standard fitting precision metrics. Here, this is examined in detail using several "stress tests" for two widely applied machine-learned potential approaches. One is based on permutationally invariant polynomial (PIP) linear least-squares regression and the other is the message-passing neural network PhysNet approach. These potentials and dipole moment surfaces are used in VSCF/VCI calculations of vibrational energies and wavefunctions. The energies from the two PESs are directly compared, as are the IR spectra. In addition, tunneling splittings for the hydrogen transfer between two equivalent structures are reported using three methods: ring polymer instanton theory, diffusion Monte Carlo simulations, and the $Q_{im}$ path method. These calculations require the evaluation of on the order of one billion energies that are widely dispersed in the 15-dimensional configurational space. The two PESs yield results for these quantities in excellent agreement with each other.