Category: stat.ML

Evaluating and Learning Optimal Dynamic Treatment Regimes under Truncation by Death

Evaluating and Learning Optimal Dynamic Treatment Regimes under Truncation by Death arXiv:2510.07501v1 Announce Type: new Abstract: Truncation by death, a prevalent challenge in critical care, renders traditional dynamic treatment regime (DTR) evaluation inapplicable due to ill-defined potential outcomes. We introduce a principal stratification-based method, focusing on the always-survivor value function. We derive a semiparametrically efficient,…

October 10, 2025
From Data to Rewards: a Bilevel Optimization Perspective on Maximum Likelihood Estimation

From Data to Rewards: a Bilevel Optimization Perspective on Maximum Likelihood Estimation arXiv:2510.07624v1 Announce Type: new Abstract: Generative models form the backbone of modern machine learning, underpinning state-of-the-art systems in text, vision, and multimodal applications. While Maximum Likelihood Estimation has traditionally served as the dominant training paradigm, recent work has highlighted its limitations, particularly in…

October 10, 2025
When Robustness Meets Conservativeness: Conformalized Uncertainty Calibration for Balanced Decision Making

When Robustness Meets Conservativeness: Conformalized Uncertainty Calibration for Balanced Decision Making arXiv:2510.07750v1 Announce Type: new Abstract: Robust optimization safeguards decisions against uncertainty by optimizing against worst-case scenarios, yet its effectiveness hinges on a prespecified robustness level that is often chosen ad hoc, leading to either insufficient protection or overly conservative and costly solutions. Recent approaches…

October 10, 2025
A Honest Cross-Validation Estimator for Prediction Performance

A Honest Cross-Validation Estimator for Prediction Performance arXiv:2510.07649v1 Announce Type: new Abstract: Cross-validation is a standard tool for obtaining an honest assessment of the performance of a prediction model. The commonly used version repeatedly splits data, trains the prediction model on the training set, evaluates the model performance on the test set, and averages the…

October 10, 2025
Surrogate Graph Partitioning for Spatial Prediction

Surrogate Graph Partitioning for Spatial Prediction arXiv:2510.07832v1 Announce Type: new Abstract: Spatial prediction refers to the estimation of unobserved values from spatially distributed observations. Although recent advances have improved the capacity to model diverse observation types, adoption in practice remains limited in industries that demand interpretability. To mitigate this gap, surrogate models that explain black-box…

October 10, 2025
Online Matching via Reinforcement Learning: An Expert Policy Orchestration Strategy

Online Matching via Reinforcement Learning: An Expert Policy Orchestration Strategy arXiv:2510.06515v1 Announce Type: new Abstract: Online matching problems arise in many complex systems, from cloud services and online marketplaces to organ exchange networks, where timely, principled decisions are critical for maintaining high system performance. Traditional heuristics in these settings are simple and interpretable but typically…

October 9, 2025
A General Constructive Upper Bound on Shallow Neural Nets Complexity

A General Constructive Upper Bound on Shallow Neural Nets Complexity arXiv:2510.06372v1 Announce Type: new Abstract: We provide an upper bound on the number of neurons required in a shallow neural network to approximate a continuous function on a compact set with a given accuracy. This method, inspired by a specific proof of the Stone-Weierstrass theorem,…

October 9, 2025
Q-Learning with Fine-Grained Gap-Dependent Regret

Q-Learning with Fine-Grained Gap-Dependent Regret arXiv:2510.06647v1 Announce Type: new Abstract: We study fine-grained gap-dependent regret bounds for model-free reinforcement learning in episodic tabular Markov Decision Processes. Existing model-free algorithms achieve minimax worst-case regret, but their gap-dependent bounds remain coarse and fail to fully capture the structure of suboptimality gaps. We address this limitation by establishing…

October 9, 2025
Gaussian Equivalence for Self-Attention: Asymptotic Spectral Analysis of Attention Matrix

Gaussian Equivalence for Self-Attention: Asymptotic Spectral Analysis of Attention Matrix arXiv:2510.06685v1 Announce Type: new Abstract: Self-attention layers have become fundamental building blocks of modern deep neural networks, yet their theoretical understanding remains limited, particularly from the perspective of random matrix theory. In this work, we provide a rigorous analysis of the singular value spectrum of…

October 9, 2025
Bayesian Nonparametric Dynamical Clustering of Time Series

Bayesian Nonparametric Dynamical Clustering of Time Series arXiv:2510.06919v1 Announce Type: new Abstract: We present a method that models the evolution of an unbounded number of time series clusters by switching among an unknown number of regimes with linear dynamics. We develop a Bayesian non-parametric approach using a hierarchical Dirichlet process as a prior on the…

October 9, 2025
Minima and Critical Points of the Bethe Free Energy Are Invariant Under Deformation Retractions of Factor Graphs

Minima and Critical Points of the Bethe Free Energy Are Invariant Under Deformation Retractions of Factor Graphs arXiv:2510.05380v1 Announce Type: new Abstract: In graphical models, factor graphs, and more generally energy-based models, the interactions between variables are encoded by a graph, a hypergraph, or, in the most general case, a partially ordered set (poset). Inference…

October 8, 2025
Refereed Learning

Refereed Learning arXiv:2510.05440v1 Announce Type: new Abstract: We initiate an investigation of learning tasks in a setting where the learner is given access to two competing provers, only one of which is honest. Specifically, we consider the power of such learners in assessing purported properties of opaque models. Following prior work that considers the power…

October 8, 2025
Domain-Shift-Aware Conformal Prediction for Large Language Models

Domain-Shift-Aware Conformal Prediction for Large Language Models arXiv:2510.05566v1 Announce Type: new Abstract: Large language models have achieved impressive performance across diverse tasks. However, their tendency to produce overconfident and factually incorrect outputs, known as hallucinations, poses risks in real world applications. Conformal prediction provides finite-sample, distribution-free coverage guarantees, but standard conformal prediction breaks down under…

October 8, 2025
A Probabilistic Basis for Low-Rank Matrix Learning

A Probabilistic Basis for Low-Rank Matrix Learning arXiv:2510.05447v1 Announce Type: new Abstract: Low rank inference on matrices is widely conducted by optimizing a cost function augmented with a penalty proportional to the nuclear norm $\Vert \cdot \Vert_*$. However, despite the assortment of computational methods for such problems, there is a surprising lack of understanding of…

October 8, 2025
Bilevel optimization for learning hyperparameters: Application to solving PDEs and inverse problems with Gaussian processes

Bilevel optimization for learning hyperparameters: Application to solving PDEs and inverse problems with Gaussian processes arXiv:2510.05568v1 Announce Type: new Abstract: Methods for solving scientific computing and inference problems, such as kernel- and neural network-based approaches for partial differential equations (PDEs), inverse problems, and supervised learning tasks, depend crucially on the choice of hyperparameters. Specifically, the…

October 8, 2025
Quantile-Scaled Bayesian Optimization Using Rank-Only Feedback

Quantile-Scaled Bayesian Optimization Using Rank-Only Feedback arXiv:2510.03277v1 Announce Type: new Abstract: Bayesian Optimization (BO) is widely used for optimizing expensive black-box functions, particularly in hyperparameter tuning. However, standard BO assumes access to precise objective values, which may be unavailable, noisy, or unreliable in real-world settings where only relative or rank-based feedback can be obtained. In…

October 7, 2025
Mathematically rigorous proofs for Shapley explanations

Mathematically rigorous proofs for Shapley explanations arXiv:2510.03281v1 Announce Type: new Abstract: Machine Learning is becoming increasingly more important in today’s world. It is therefore very important to provide understanding of the decision-making process of machine-learning models. A popular way to do this is by looking at the Shapley-Values of these models as introduced by Lundberg…

October 7, 2025
Transformed $\ell_1$ Regularizations for Robust Principal Component Analysis: Toward a Fine-Grained Understanding

Transformed $\ell_1$ Regularizations for Robust Principal Component Analysis: Toward a Fine-Grained Understanding arXiv:2510.03624v1 Announce Type: new Abstract: Robust Principal Component Analysis (RPCA) aims to recover a low-rank structure from noisy, partially observed data that is also corrupted by sparse, potentially large-magnitude outliers. Traditional RPCA models rely on convex relaxations, such as nuclear norm and $\ell_1$…

October 7, 2025
The analogy theorem in Hoare logic

The analogy theorem in Hoare logic arXiv:2510.03685v1 Announce Type: new Abstract: The introduction of machine learning methods has led to significant advances in automation, optimization, and discoveries in various fields of science and technology. However, their widespread application faces a fundamental limitation: the transfer of models between data domains generally lacks a rigorous mathematical justification.…

October 7, 2025
Spectral Thresholds for Identifiability and Stability: Finite-Sample Phase Transitions in High-Dimensional Learning

Spectral Thresholds for Identifiability and Stability: Finite-Sample Phase Transitions in High-Dimensional Learning arXiv:2510.03809v1 Announce Type: new Abstract: In high-dimensional learning, models remain stable until they collapse abruptly once the sample size falls below a critical level. This instability is not algorithm-specific but a geometric mechanism: when the weakest Fisher eigendirection falls beneath sample-level fluctuations, identifiability fails.…

October 7, 2025
Higher-arity PAC learning, VC dimension and packing lemma

Higher-arity PAC learning, VC dimension and packing lemma arXiv:2510.02420v1 Announce Type: new Abstract: The aim of this note is to overview some of our work in Chernikov, Towsner’20 (arXiv:2010.00726) developing higher arity VC theory (VC$_n$ dimension), including a generalization of Haussler packing lemma, and an associated tame (slice-wise) hypergraph regularity lemma; and to demonstrate that…

October 6, 2025
Predictive inference for time series: why is split conformal effective despite temporal dependence?

Predictive inference for time series: why is split conformal effective despite temporal dependence? arXiv:2510.02471v1 Announce Type: new Abstract: We consider the problem of uncertainty quantification for prediction in a time series: if we use past data to forecast the next time point, can we provide valid prediction intervals around our forecasts? To avoid placing distributional…

October 6, 2025
Beyond Linear Diffusions: Improved Representations for Rare Conditional Generative Modeling

Beyond Linear Diffusions: Improved Representations for Rare Conditional Generative Modeling arXiv:2510.02499v1 Announce Type: new Abstract: Diffusion models have emerged as powerful generative frameworks with widespread applications across machine learning and artificial intelligence systems. While current research has predominantly focused on linear diffusions, these approaches can face significant challenges when modeling a conditional distribution, $P(Y|X=x)$, when…

October 6, 2025
Adaptive randomized pivoting and volume sampling

Adaptive randomized pivoting and volume sampling arXiv:2510.02513v1 Announce Type: new Abstract: Adaptive randomized pivoting (ARP) is a recently proposed and highly effective algorithm for column subset selection. This paper reinterprets the ARP algorithm by drawing connections to the volume sampling distribution and active learning algorithms for linear regression. As consequences, this paper presents new analysis…

October 6, 2025
Learning Multi-Index Models with Hyper-Kernel Ridge Regression

Learning Multi-Index Models with Hyper-Kernel Ridge Regression arXiv:2510.02532v1 Announce Type: new Abstract: Deep neural networks excel in high-dimensional problems, outperforming models such as kernel methods, which suffer from the curse of dimensionality. However, the theoretical foundations of this success remain poorly understood. We follow the idea that the compositional structure of the learning task is…

October 6, 2025
Private Realizable-to-Agnostic Transformation with Near-Optimal Sample Complexity

Private Realizable-to-Agnostic Transformation with Near-Optimal Sample Complexity arXiv:2510.01291v1 Announce Type: new Abstract: The realizable-to-agnostic transformation (Beimel et al., 2015; Alon et al., 2020) provides a general mechanism to convert a private learner in the realizable setting (where the examples are labeled by some function in the concept class) to a private learner in the agnostic…

October 3, 2025
Continuously Augmented Discrete Diffusion model for Categorical Generative Modeling

Continuously Augmented Discrete Diffusion model for Categorical Generative Modeling arXiv:2510.01329v1 Announce Type: new Abstract: Standard discrete diffusion models treat all unobserved states identically by mapping them to an absorbing [MASK] token. This creates an ‘information void’ where semantic information that could be inferred from unmasked tokens is lost between denoising steps. We introduce Continuously Augmented…

October 3, 2025
Risk Phase Transitions in Spiked Regression: Alignment Driven Benign and Catastrophic Overfitting

Risk Phase Transitions in Spiked Regression: Alignment Driven Benign and Catastrophic Overfitting arXiv:2510.01414v1 Announce Type: new Abstract: This paper analyzes the generalization error of minimum-norm interpolating solutions in linear regression using spiked covariance data models. The paper characterizes how varying spike strengths and target-spike alignments can affect risk, especially in overparameterized settings. The study presents…

October 3, 2025
A reproducible comparative study of categorical kernels for Gaussian process regression, with new clustering-based nested kernels

A reproducible comparative study of categorical kernels for Gaussian process regression, with new clustering-based nested kernels arXiv:2510.01840v1 Announce Type: new Abstract: Designing categorical kernels is a major challenge for Gaussian process regression with continuous and categorical inputs. Despite previous studies, it is difficult to identify a preferred method, either because the evaluation metrics, the optimization…

October 3, 2025
AI Foundation Model for Time Series with Innovations Representation

AI Foundation Model for Time Series with Innovations Representation arXiv:2510.01560v1 Announce Type: new Abstract: This paper introduces an Artificial Intelligence (AI) foundation model for time series in engineering applications, where causal operations are required for real-time monitoring and control. Since engineering time series are governed by physical, rather than linguistic, laws, large-language-model-based AI foundation models…

October 3, 2025
Identifying All $\epsilon$-Best Arms in (Misspecified) Linear Bandits

Identifying All $\epsilon$-Best Arms in (Misspecified) Linear Bandits arXiv:2510.00073v1 Announce Type: new Abstract: Motivated by the need to efficiently identify multiple candidates in high trial-and-error cost tasks such as drug discovery, we propose a near-optimal algorithm to identify all $\epsilon$-best arms (i.e., those at most $\epsilon$ worse than the optimum). Specifically, we introduce LinFACT, an…

October 2, 2025
Private Learning of Littlestone Classes, Revisited

Private Learning of Littlestone Classes, Revisited arXiv:2510.00076v1 Announce Type: new Abstract: We consider online and PAC learning of Littlestone classes subject to the constraint of approximate differential privacy. Our main result is a private learner to online-learn a Littlestone class with a mistake bound of $\tilde{O}(d^{9.5}\cdot \log(T))$ in the realizable case, where $d$ denotes the…

October 2, 2025
CINDES: Classification induced neural density estimator and simulator

CINDES: Classification induced neural density estimator and simulator arXiv:2510.00367v1 Announce Type: new Abstract: Neural network-based methods for (un)conditional density estimation have recently gained substantial attention, as various neural density estimators have outperformed classical approaches in real-data experiments. Despite these empirical successes, implementation can be challenging due to the need to ensure non-negativity and unit-mass constraints,…

October 2, 2025
On the Adversarial Robustness of Learning-based Conformal Novelty Detection

On the Adversarial Robustness of Learning-based Conformal Novelty Detection arXiv:2510.00463v1 Announce Type: new Abstract: This paper studies the adversarial robustness of conformal novelty detection. In particular, we focus on AdaDetect, a powerful learning-based framework for novelty detection with finite-sample false discovery rate (FDR) control. While AdaDetect provides rigorous statistical guarantees under benign conditions, its behavior…

October 2, 2025
A universal compression theory: Lottery ticket hypothesis and superpolynomial scaling laws

A universal compression theory: Lottery ticket hypothesis and superpolynomial scaling laws arXiv:2510.00504v1 Announce Type: new Abstract: When training large-scale models, the performance typically scales with the number of parameters and the dataset size according to a slow power law. A fundamental theoretical and practical question is whether comparable performance can be achieved with significantly smaller…

October 2, 2025
Neural Optimal Transport Meets Multivariate Conformal Prediction

Neural Optimal Transport Meets Multivariate Conformal Prediction arXiv:2509.25444v1 Announce Type: new Abstract: We propose a framework for conditional vector quantile regression (CVQR) that combines neural optimal transport with amortized optimization, and apply it to multivariate conformal prediction. Classical quantile regression does not extend naturally to multivariate responses, while existing approaches often ignore the geometry of…

October 1, 2025
Fair Classification by Direct Intervention on Operating Characteristics

Fair Classification by Direct Intervention on Operating Characteristics arXiv:2509.25481v1 Announce Type: new Abstract: We develop new classifiers under group fairness in the attribute-aware setting for binary classification with multiple group fairness constraints (e.g., demographic parity (DP), equalized odds (EO), and predictive parity (PP)). We propose a novel approach, applicable to linear fractional constraints, based on…

October 1, 2025
Conservative Decisions with Risk Scores

Conservative Decisions with Risk Scores arXiv:2509.25588v1 Announce Type: new Abstract: In binary classification applications, conservative decision-making that allows for abstention can be advantageous. To this end, we introduce a novel approach that determines the optimal cutoff interval for risk scores, which can be directly available or derived from fitted models. Within this interval, the algorithm…

October 1, 2025
One-shot Conditional Sampling: MMD meets Nearest Neighbors

One-shot Conditional Sampling: MMD meets Nearest Neighbors arXiv:2509.25507v1 Announce Type: new Abstract: How can we generate samples from a conditional distribution that we never fully observe? This question arises across a broad range of applications in both modern machine learning and classical statistics, including image post-processing in computer vision, approximate posterior sampling in simulation-based inference,…

October 1, 2025
Coupling Generative Modeling and an Autoencoder with the Causal Bridge

Coupling Generative Modeling and an Autoencoder with the Causal Bridge arXiv:2509.25599v1 Announce Type: new Abstract: We consider inferring the causal effect of a treatment (intervention) on an outcome of interest in situations where there is potentially an unobserved confounder influencing both the treatment and the outcome. This is achievable by assuming access to two separate…

October 1, 2025
Variance-Bounded Evaluation without Ground Truth: VB-Score

Variance-Bounded Evaluation without Ground Truth: VB-Score arXiv:2509.22751v1 Announce Type: new Abstract: Reliable evaluation is a central challenge in machine learning when tasks lack ground truth labels or involve ambiguity and noise. Conventional frameworks, rooted in the Cranfield paradigm and label-based metrics, fail in such cases because they cannot assess how robustly a system performs under…

September 30, 2025
Concept activation vectors: a unifying view and adversarial attacks

Concept activation vectors: a unifying view and adversarial attacks arXiv:2509.22755v1 Announce Type: new Abstract: Concept Activation Vectors (CAVs) are a tool from explainable AI, offering a promising approach for understanding how human-understandable concepts are encoded in a model’s latent spaces. They are computed from hidden-layer activations of inputs belonging either to a concept class or…

September 30, 2025
Identifying Memory Effects in Epidemics via a Fractional SEIRD Model and Physics-Informed Neural Networks

Identifying Memory Effects in Epidemics via a Fractional SEIRD Model and Physics-Informed Neural Networks arXiv:2509.22760v1 Announce Type: new Abstract: We develop a physics-informed neural network (PINN) framework for parameter estimation in fractional-order SEIRD epidemic models. By embedding the Caputo fractional derivative into the network residuals via the L1 discretization scheme, our method simultaneously reconstructs epidemic…

September 30, 2025
A theoretical guarantee for SyncRank

A theoretical guarantee for SyncRank arXiv:2509.22766v1 Announce Type: new Abstract: We present a theoretical and empirical analysis of the SyncRank algorithm for recovering a global ranking from noisy pairwise comparisons. By adopting a complex-valued data model where the true ranking is encoded in the phases of a unit-modulus vector, we establish a sharp non-asymptotic recovery…

September 30, 2025
Differentially Private Two-Stage Gradient Descent for Instrumental Variable Regression

Differentially Private Two-Stage Gradient Descent for Instrumental Variable Regression arXiv:2509.22794v1 Announce Type: new Abstract: We study instrumental variable regression (IVaR) under differential privacy constraints. Classical IVaR methods (like two-stage least squares regression) rely on solving moment equations that directly use sensitive covariates and instruments, creating significant risks of privacy leakage and posing challenges in designing…

September 30, 2025
Near-Optimal Experiment Design in Linear non-Gaussian Cyclic Models

Near-Optimal Experiment Design in Linear non-Gaussian Cyclic Models arXiv:2509.21423v1 Announce Type: new Abstract: We study the problem of causal structure learning from a combination of observational and interventional data generated by a linear non-Gaussian structural equation model that might contain cycles. Recent results show that using mere observational data identifies the causal graph only up…

September 29, 2025
General Pruning Criteria for Fast SBL

General Pruning Criteria for Fast SBL arXiv:2509.21572v1 Announce Type: new Abstract: Sparse Bayesian learning (SBL) associates to each weight in the underlying linear model a hyperparameter by assuming that each weight is Gaussian distributed with zero mean and precision (inverse variance) equal to its associated hyperparameter. The method estimates the hyperparameters by marginalizing out the…

September 29, 2025
IndiSeek learns information-guided disentangled representations

IndiSeek learns information-guided disentangled representations arXiv:2509.21584v1 Announce Type: new Abstract: Learning disentangled representations is a fundamental task in multi-modal learning. In modern applications such as single-cell multi-omics, both shared and modality-specific features are critical for characterizing cell states and supporting downstream analyses. Ideally, modality-specific features should be independent of shared ones while also capturing all…

September 29, 2025
Effective continuous equations for adaptive SGD: a stochastic analysis view

Effective continuous equations for adaptive SGD: a stochastic analysis view arXiv:2509.21614v1 Announce Type: new Abstract: We present a theoretical analysis of some popular adaptive Stochastic Gradient Descent (SGD) methods in the small learning rate regime. Using the stochastic modified equations framework introduced by Li et al., we derive effective continuous stochastic dynamics for these methods.…

September 29, 2025
SADA: Safe and Adaptive Inference with Multiple Black-Box Predictions

SADA: Safe and Adaptive Inference with Multiple Black-Box Predictions arXiv:2509.21707v1 Announce Type: new Abstract: Real-world applications often face scarce labeled data due to the high cost and time requirements of gold-standard experiments, whereas unlabeled data are typically abundant. With the growing adoption of machine learning techniques, it has become increasingly feasible to generate multiple predicted…

September 29, 2025
Sample completion, structured correlation, and Netflix problems

Sample completion, structured correlation, and Netflix problems arXiv:2509.20404v1 Announce Type: new Abstract: We develop a new high-dimensional statistical learning model which can take advantage of structured correlation in data even in the presence of randomness. We completely characterize learnability in this model in terms of VCN${}_{k,k}$-dimension (essentially $k$-dependence from Shelah’s classification theory). This model suggests…

September 26, 2025
Fast Estimation of Wasserstein Distances via Regression on Sliced Wasserstein Distances

Fast Estimation of Wasserstein Distances via Regression on Sliced Wasserstein Distances arXiv:2509.20508v1 Announce Type: new Abstract: We address the problem of efficiently computing Wasserstein distances for multiple pairs of distributions drawn from a meta-distribution. To this end, we propose a fast estimation method based on regressing Wasserstein distance on sliced Wasserstein (SW) distances. Specifically, we…

September 26, 2025
Unsupervised Domain Adaptation with an Unobservable Source Subpopulation

Unsupervised Domain Adaptation with an Unobservable Source Subpopulation arXiv:2509.20587v1 Announce Type: new Abstract: We study an unsupervised domain adaptation problem where the source domain consists of subpopulations defined by the binary label $Y$ and a binary background (or environment) $A$. We focus on a challenging setting in which one such subpopulation in the source domain…

September 26, 2025
A Hierarchical Variational Graph Fused Lasso for Recovering Relative Rates in Spatial Compositional Data

A Hierarchical Variational Graph Fused Lasso for Recovering Relative Rates in Spatial Compositional Data arXiv:2509.20636v1 Announce Type: new Abstract: The analysis of spatial data from biological imaging technology, such as imaging mass spectrometry (IMS) or imaging mass cytometry (IMC), is challenging because of a competitive sampling process which convolves signals from molecules in a single…

September 26, 2025
A Gapped Scale-Sensitive Dimension and Lower Bounds for Offset Rademacher Complexity

A Gapped Scale-Sensitive Dimension and Lower Bounds for Offset Rademacher Complexity arXiv:2509.20618v1 Announce Type: new Abstract: We study gapped scale-sensitive dimensions of a function class in both sequential and non-sequential settings. We demonstrate that covering numbers for any uniformly bounded class are controlled above by these gapped dimensions, generalizing the results of \cite{anthony2000function,alon1997scale}. Moreover, we…

September 26, 2025
Stochastic Path Planning in Correlated Obstacle Fields

Stochastic Path Planning in Correlated Obstacle Fields arXiv:2509.19559v1 Announce Type: new Abstract: We introduce the Stochastic Correlated Obstacle Scene (SCOS) problem, a navigation setting with spatially correlated obstacles of uncertain blockage status, realistically constrained sensors that provide noisy readings and costly disambiguation. Modeling the spatial correlation with Gaussian Random Field (GRF), we develop Bayesian belief…

September 25, 2025
Anchored Langevin Algorithms

Anchored Langevin Algorithms arXiv:2509.19455v1 Announce Type: new Abstract: Standard first-order Langevin algorithms such as the unadjusted Langevin algorithm (ULA) are obtained by discretizing the Langevin diffusion and are widely used for sampling in machine learning because they scale to high dimensions and large datasets. However, they face two key limitations: (i) they require differentiable log-densities,…

September 25, 2025
MAGIC: Multi-task Gaussian process for joint imputation and classification in healthcare time series

MAGIC: Multi-task Gaussian process for joint imputation and classification in healthcare time series arXiv:2509.19577v1 Announce Type: new Abstract: Time series analysis has emerged as an important tool for improving patient diagnosis and management in healthcare applications. However, these applications commonly face two critical challenges: time misalignment and data sparsity. Traditional approaches address these issues through…

September 25, 2025
Diffusion and Flow-based Copulas: Forgetting and Remembering Dependencies

Diffusion and Flow-based Copulas: Forgetting and Remembering Dependencies arXiv:2509.19707v1 Announce Type: new Abstract: Copulas are a fundamental tool for modelling multivariate dependencies in data, forming the method of choice in diverse fields and applications. However, the adoption of existing models for multimodal and high-dimensional dependencies is hindered by restrictive assumptions and poor scaling. In this…

September 25, 2025
Convex Regression with a Penalty

Convex Regression with a Penalty arXiv:2509.19788v1 Announce Type: new Abstract: A common way to estimate an unknown convex regression function $f_0: \Omega \subset \mathbb{R}^d \rightarrow \mathbb{R}$ from a set of $n$ noisy observations is to fit a convex function that minimizes the sum of squared errors. However, this estimator is known for its tendency to…

September 25, 2025
Low-Rank Adaptation of Evolutionary Deep Neural Networks for Efficient Learning of Time-Dependent PDEs

Low-Rank Adaptation of Evolutionary Deep Neural Networks for Efficient Learning of Time-Dependent PDEs arXiv:2509.16395v1 Announce Type: new Abstract: We study the Evolutionary Deep Neural Network (EDNN) framework for accelerating numerical solvers of time-dependent partial differential equations (PDEs). We introduce a Low-Rank Evolutionary Deep Neural Network (LR-EDNN), which constrains parameter evolution to a low-rank subspace, thereby…

September 23, 2025
Conditional Multidimensional Scaling with Incomplete Conditioning Data

Conditional Multidimensional Scaling with Incomplete Conditioning Data arXiv:2509.16627v1 Announce Type: new Abstract: Conditional multidimensional scaling seeks a low-dimensional configuration from pairwise dissimilarities, in the presence of other known features. By taking advantage of available data of the known features, conditional multidimensional scaling improves the estimation quality of the low-dimensional configuration and simplifies knowledge discovery…

September 23, 2025
System-Level Uncertainty Quantification with Multiple Machine Learning Models: A Theoretical Framework

System-Level Uncertainty Quantification with Multiple Machine Learning Models: A Theoretical Framework arXiv:2509.16663v1 Announce Type: new Abstract: ML models have errors when used for predictions. The errors are unknown but can be quantified by model uncertainty. When multiple ML models are trained using the same training points, their model uncertainties may be statistically dependent. In reality,…

September 23, 2025
DoubleGen: Debiased Generative Modeling of Counterfactuals

DoubleGen: Debiased Generative Modeling of Counterfactuals arXiv:2509.16842v1 Announce Type: new Abstract: Generative models for counterfactual outcomes face two key sources of bias. Confounding bias arises when approaches fail to account for systematic differences between those who receive the intervention and those who do not. Misspecification bias arises when methods attempt to address confounding through estimation…

September 23, 2025
Risk Comparisons in Linear Regression: Implicit Regularization Dominates Explicit Regularization

Risk Comparisons in Linear Regression: Implicit Regularization Dominates Explicit Regularization arXiv:2509.17251v1 Announce Type: new Abstract: Existing theory suggests that for linear regression problems categorized by capacity and source conditions, gradient descent (GD) is always minimax optimal, while both ridge regression and online stochastic gradient descent (SGD) are polynomially suboptimal for certain categories of such problems.…

September 23, 2025
SETrLUSI: Stochastic Ensemble Multi-Source Transfer Learning Using Statistical Invariant

SETrLUSI: Stochastic Ensemble Multi-Source Transfer Learning Using Statistical Invariant arXiv:2509.15593v1 Announce Type: new Abstract: In transfer learning, a source domain often carries diverse knowledge, and different domains usually emphasize different types of knowledge. Unlike traditional transfer learning methods, which handle only a single type of knowledge from all domains, we introduce an ensemble learning…

September 22, 2025
Phase Transition for Stochastic Block Model with more than $\sqrt{n}$ Communities

Phase Transition for Stochastic Block Model with more than $\sqrt{n}$ Communities arXiv:2509.15822v1 Announce Type: new Abstract: Predictions from statistical physics postulate that recovery of the communities in Stochastic Block Model (SBM) is possible in polynomial time above, and only above, the Kesten-Stigum (KS) threshold. This conjecture has given rise to a rich literature, proving that…

September 22, 2025
Interpretable Network-assisted Random Forest+

Interpretable Network-assisted Random Forest+ arXiv:2509.15611v1 Announce Type: new Abstract: Machine learning algorithms often assume that training samples are independent. When data points are connected by a network, the induced dependency between samples is both a challenge, reducing effective sample size, and an opportunity to improve prediction by leveraging information from network neighbors. Multiple methods taking…

September 22, 2025
Model-free algorithms for fast node clustering in SBM type graphs and application to social role inference in animals

Model-free algorithms for fast node clustering in SBM type graphs and application to social role inference in animals arXiv:2509.15989v1 Announce Type: new Abstract: We propose a novel family of model-free algorithms for node clustering and parameter inference in graphs generated from the Stochastic Block Model (SBM), a fundamental framework in community detection. Drawing inspiration from…

September 22, 2025
What is a good matching of probability measures? A counterfactual lens on transport maps

What is a good matching of probability measures? A counterfactual lens on transport maps arXiv:2509.16027v1 Announce Type: new Abstract: Coupling probability measures lies at the core of many problems in statistics and machine learning, from domain adaptation to transfer learning and causal inference. Yet, even when restricted to deterministic transports, such couplings are not identifiable:…

September 22, 2025
Towards universal property prediction in Cartesian space: TACE is all you need

Towards universal property prediction in Cartesian space: TACE is all you need arXiv:2509.14961v1 Announce Type: new Abstract: Machine learning has revolutionized atomistic simulations and materials science, yet current approaches often depend on spherical-harmonic representations. Here we introduce the Tensor Atomic Cluster Expansion and Tensor Moment Potential, the first unified framework formulated entirely in Cartesian space…

September 19, 2025
Benefits of Online Tilted Empirical Risk Minimization: A Case Study of Outlier Detection and Robust Regression

Benefits of Online Tilted Empirical Risk Minimization: A Case Study of Outlier Detection and Robust Regression arXiv:2509.15141v1 Announce Type: new Abstract: Empirical Risk Minimization (ERM) is a foundational framework for supervised learning but primarily optimizes average-case performance, often neglecting fairness and robustness considerations. Tilted Empirical Risk Minimization (TERM) extends ERM by introducing an exponential tilt…

September 19, 2025
Learning Rate Should Scale Inversely with High-Order Data Moments in High-Dimensional Online Independent Component Analysis

Learning Rate Should Scale Inversely with High-Order Data Moments in High-Dimensional Online Independent Component Analysis arXiv:2509.15127v1 Announce Type: new Abstract: We investigate the impact of high-order moments on the learning dynamics of an online Independent Component Analysis (ICA) algorithm under a high-dimensional data model composed of a weighted sum of two non-Gaussian random variables. This…

September 19, 2025
Next-Depth Lookahead Tree

Next-Depth Lookahead Tree arXiv:2509.15143v1 Announce Type: new Abstract: This paper proposes the Next-Depth Lookahead Tree (NDLT), a single-tree model designed to improve performance by evaluating node splits not only at the node being optimized but also by evaluating the quality of the next depth level. Jaeho Lee, Kangjin Kim, Gyeong Taek Lee

September 19, 2025
Asymptotic Study of In-context Learning with Random Transformers through Equivalent Models

Asymptotic Study of In-context Learning with Random Transformers through Equivalent Models arXiv:2509.15152v1 Announce Type: new Abstract: We study the in-context learning (ICL) capabilities of pretrained Transformers in the setting of nonlinear regression. Specifically, we focus on a random Transformer with a nonlinear MLP head where the first layer is randomly initialized and fixed while the…

September 19, 2025
On the Rate of Gaussian Approximation for Linear Regression Problems

On the Rate of Gaussian Approximation for Linear Regression Problems arXiv:2509.14039v1 Announce Type: new Abstract: In this paper, we consider the problem of Gaussian approximation for the online linear regression task. We derive the corresponding rates for the setting of a constant learning rate and study the explicit dependence of the convergence rate upon the…

September 18, 2025
Field of View Enhanced Signal Dependent Binauralization with Mixture of Experts Framework for Continuous Source Motion

Field of View Enhanced Signal Dependent Binauralization with Mixture of Experts Framework for Continuous Source Motion arXiv:2509.13548v1 Announce Type: cross Abstract: We propose a novel mixture of experts framework for field-of-view enhancement in binaural signal matching. Our approach enables dynamic spatial audio rendering that adapts to continuous talker motion, allowing users to emphasize or suppress…

September 18, 2025
Imputation-Powered Inference

Imputation-Powered Inference arXiv:2509.13778v1 Announce Type: cross Abstract: Modern multi-modal and multi-site data frequently suffer from blockwise missingness, where subsets of features are missing for groups of individuals, creating complex patterns that challenge standard inference methods. Existing approaches have critical limitations: complete-case analysis discards informative data and is potentially biased; doubly robust estimators for non-monotone missingness, where…

September 18, 2025
Towards a Physics Foundation Model

Towards a Physics Foundation Model arXiv:2509.13805v1 Announce Type: cross Abstract: Foundation models have revolutionized natural language processing through a “train once, deploy anywhere” paradigm, where a single pre-trained model adapts to countless downstream tasks without retraining. Access to a Physics Foundation Model (PFM) would be transformative — democratizing access to high-fidelity simulations, accelerating scientific discovery,…

September 18, 2025
Holdout cross-validation for large non-Gaussian covariance matrix estimation using Weingarten calculus

Holdout cross-validation for large non-Gaussian covariance matrix estimation using Weingarten calculus arXiv:2509.13923v1 Announce Type: cross Abstract: Cross-validation is one of the most widely used methods for model selection and evaluation; its efficiency for large covariance matrix estimation appears robust in practice, but little is known about the theoretical behavior of its error. In this paper,…

September 18, 2025
PBPK-iPINNs: Inverse Physics-Informed Neural Networks for Physiologically Based Pharmacokinetic Brain Models

PBPK-iPINNs: Inverse Physics-Informed Neural Networks for Physiologically Based Pharmacokinetic Brain Models arXiv:2509.12666v1 Announce Type: new Abstract: Physics-Informed Neural Networks (PINNs) leverage machine learning with differential equations to solve direct and inverse problems, ensuring predictions follow physical laws. Physiologically based pharmacokinetic (PBPK) modeling advances beyond classical compartmental approaches by using a mechanistic, physiology-focused framework.…

September 17, 2025
SURGIN: SURrogate-guided Generative INversion for subsurface multiphase flow with quantified uncertainty

SURGIN: SURrogate-guided Generative INversion for subsurface multiphase flow with quantified uncertainty arXiv:2509.13189v1 Announce Type: new Abstract: We present a direct inverse modeling method named SURGIN, a SURrogate-guided Generative INversion framework tailored for subsurface multiphase flow data assimilation. Unlike existing inversion methods that require adaptation for each new observational configuration, SURGIN features a zero-shot conditional generation…

September 17, 2025
Jackknife Variance Estimation for Hájek-Dominated Generalized U-Statistics

Jackknife Variance Estimation for Hájek-Dominated Generalized U-Statistics arXiv:2509.12356v1 Announce Type: cross Abstract: We prove ratio-consistency of the jackknife variance estimator, and certain variants, for a broad class of generalized U-statistics whose variance is asymptotically dominated by their Hájek projection, with the classical fixed-order case recovered as a special instance. This Hájek projection dominance condition unifies…

September 17, 2025
Causal-Symbolic Meta-Learning (CSML): Inducing Causal World Models for Few-Shot Generalization

Causal-Symbolic Meta-Learning (CSML): Inducing Causal World Models for Few-Shot Generalization arXiv:2509.12387v1 Announce Type: cross Abstract: Modern deep learning models excel at pattern recognition but remain fundamentally limited by their reliance on spurious correlations, leading to poor generalization and a demand for massive datasets. We argue that a key ingredient for human-like intelligence (robust, sample-efficient learning) stems from…

September 17, 2025
Reduced Order Modeling of Energetic Materials Using Physics-Aware Recurrent Convolutional Neural Networks in a Latent Space (LatentPARC)

Reduced Order Modeling of Energetic Materials Using Physics-Aware Recurrent Convolutional Neural Networks in a Latent Space (LatentPARC) arXiv:2509.12401v1 Announce Type: cross Abstract: Physics-aware deep learning (PADL) has gained popularity for use in complex spatiotemporal dynamics (field evolution) simulations, such as those that arise frequently in computational modeling of energetic materials (EM). Here, we show that…

September 17, 2025
Variable Selection Using Relative Importance Rankings

Variable Selection Using Relative Importance Rankings arXiv:2509.10853v1 Announce Type: new Abstract: Although conceptually related, variable selection and relative importance (RI) analysis have been treated quite differently in the literature. While RI is typically used for post-hoc model explanation, this paper explores its potential for variable ranking and filter-based selection before model creation. Specifically, we anticipate…

September 16, 2025
Kernel-based Stochastic Approximation Framework for Nonlinear Operator Learning

Kernel-based Stochastic Approximation Framework for Nonlinear Operator Learning arXiv:2509.11070v1 Announce Type: new Abstract: We develop a stochastic approximation framework for learning nonlinear operators between infinite-dimensional spaces utilizing general Mercer operator-valued kernels. Our framework encompasses two key classes: (i) compact kernels, which admit discrete spectral decompositions, and (ii) diagonal kernels of the form $K(x,x')=k(x,x')T$, where $k$…

September 16, 2025
Maximum diversity, weighting and invariants of time series

Maximum diversity, weighting and invariants of time series arXiv:2509.11146v1 Announce Type: new Abstract: Magnitude, obtained as a special case of the Euler characteristic of an enriched category, represents a sense of the size of metric spaces and is related to classical notions such as cardinality, dimension, and volume. While the studies have explained the meaning of magnitude…

September 16, 2025
Predictable Compression Failures: Why Language Models Actually Hallucinate

Predictable Compression Failures: Why Language Models Actually Hallucinate arXiv:2509.11208v1 Announce Type: new Abstract: Large language models perform near-Bayesian inference yet violate permutation invariance on exchangeable data. We resolve this by showing transformers minimize expected conditional description length (cross-entropy) over orderings, $\mathbb{E}_\pi[\ell(Y \mid \Gamma_\pi(X))]$, which admits a Kolmogorov-complexity interpretation up to additive constants, rather than the…

September 16, 2025
Contrastive Network Representation Learning

Contrastive Network Representation Learning arXiv:2509.11316v1 Announce Type: new Abstract: Network representation learning seeks to embed networks into a low-dimensional space while preserving the structural and semantic properties, thereby facilitating downstream tasks such as classification, trait prediction, edge identification, and community detection. Motivated by challenges in brain connectivity data analysis that is characterized by subject-specific, high-dimensional,…

September 16, 2025
An Information-Theoretic Framework for Credit Risk Modeling: Unifying Industry Practice with Statistical Theory for Fair and Interpretable Scorecards

An Information-Theoretic Framework for Credit Risk Modeling: Unifying Industry Practice with Statistical Theory for Fair and Interpretable Scorecards arXiv:2509.09855v1 Announce Type: new Abstract: Credit risk modeling relies extensively on Weight of Evidence (WoE) and Information Value (IV) for feature engineering, and Population Stability Index (PSI) for drift monitoring, yet their theoretical foundations remain disconnected. We…

September 15, 2025
Repulsive Monte Carlo on the sphere for the sliced Wasserstein distance

Repulsive Monte Carlo on the sphere for the sliced Wasserstein distance arXiv:2509.10166v1 Announce Type: new Abstract: In this paper, we consider the problem of computing the integral of a function on the unit sphere, in any dimension, using Monte Carlo methods. Although the methods we present are general, our guiding thread is the sliced Wasserstein…

September 15, 2025
Why does your graph neural network fail on some graphs? Insights from exact generalisation error

Why does your graph neural network fail on some graphs? Insights from exact generalisation error arXiv:2509.10337v1 Announce Type: new Abstract: Graph Neural Networks (GNNs) are widely used in learning on graph-structured data, yet a principled understanding of why they succeed or fail remains elusive. While prior works have examined architectural limitations such as over-smoothing and…

September 15, 2025
Differentially Private Decentralized Dataset Synthesis Through Randomized Mixing with Correlated Noise

Differentially Private Decentralized Dataset Synthesis Through Randomized Mixing with Correlated Noise arXiv:2509.10385v1 Announce Type: new Abstract: In this work, we explore differentially private synthetic data generation in a decentralized-data setting by building on the recently proposed Differentially Private Class-Centric Data Aggregation (DP-CDA). DP-CDA synthesizes data in a centralized setting by mixing multiple randomly-selected samples from…

September 15, 2025
Sparse Polyak: an adaptive step size rule for high-dimensional M-estimation

Sparse Polyak: an adaptive step size rule for high-dimensional M-estimation arXiv:2509.09802v1 Announce Type: cross Abstract: We propose and study Sparse Polyak, a variant of Polyak’s adaptive step size, designed to solve high-dimensional statistical estimation problems where the problem dimension is allowed to grow much faster than the sample size. In such settings, the standard Polyak…

September 15, 2025
Global Optimization of Stochastic Black-Box Functions with Arbitrary Noise Distributions using Wilson Score Kernel Density Estimation

Global Optimization of Stochastic Black-Box Functions with Arbitrary Noise Distributions using Wilson Score Kernel Density Estimation arXiv:2509.09238v1 Announce Type: new Abstract: Many optimization problems in robotics involve the optimization of time-expensive black-box functions, such as those involving complex simulations or evaluation of real-world experiments. Furthermore, these functions are often stochastic as repeated experiments are subject…

September 12, 2025
Scalable extensions to given-data Sobol’ index estimators

Scalable extensions to given-data Sobol’ index estimators arXiv:2509.09078v1 Announce Type: new Abstract: Given-data methods for variance-based sensitivity analysis have significantly advanced the feasibility of Sobol’ index computation for computationally expensive models and models with many inputs. However, the limitations of existing methods still preclude their application to models with an extremely large number of inputs.…

September 12, 2025
Low-degree lower bounds via almost orthonormal bases

Low-degree lower bounds via almost orthonormal bases arXiv:2509.09353v1 Announce Type: new Abstract: Low-degree polynomials have emerged as a powerful paradigm for providing evidence of statistical-computational gaps across a variety of high-dimensional statistical models [Wein25]. For detection problems — where the goal is to test a planted distribution $\mathbb{P}'$ against a null distribution $\mathbb{P}$ with independent…

September 12, 2025
Uncertainty Estimation using Variance-Gated Distributions

Uncertainty Estimation using Variance-Gated Distributions arXiv:2509.08846v1 Announce Type: cross Abstract: Evaluation of per-sample uncertainty quantification from neural networks is essential for decision-making involving high-risk applications. A common approach is to use the predictive distribution from Bayesian or approximation models and decompose the corresponding predictive uncertainty into epistemic (model-related) and aleatoric (data-related) components. However, additive decomposition…

September 12, 2025
Instance-Optimal Matrix Multiplicative Weight Update and Its Quantum Applications

Instance-Optimal Matrix Multiplicative Weight Update and Its Quantum Applications arXiv:2509.08911v1 Announce Type: cross Abstract: The Matrix Multiplicative Weight Update (MMWU) is a seminal online learning algorithm with numerous applications. Applied to the matrix version of the Learning from Expert Advice (LEA) problem on the $d$-dimensional spectraplex, it is well known that MMWU achieves the minimax-optimal…

September 12, 2025