Category: stat.ML

An Exponential Averaging Process with Strong Convergence Properties

An Exponential Averaging Process with Strong Convergence Properties arXiv:2505.10605v1 Announce Type: new Abstract: Averaging, or smoothing, is a fundamental approach to obtain stable, de-noised estimates from noisy observations. In certain scenarios, observations made along trajectories of random dynamical systems are of particular interest. One popular smoothing technique for such a scenario is exponential moving averaging…

May 19, 2025
Minimax learning rates for estimating binary classifiers under margin conditions

Minimax learning rates for estimating binary classifiers under margin conditions arXiv:2505.10628v1 Announce Type: new Abstract: We study classification problems using binary estimators where the decision boundary is described by horizon functions and where the data distribution satisfies a geometric margin condition. We establish upper and lower bounds for the minimax learning rate over broad function…

May 19, 2025
Inexact Column Generation for Bayesian Network Structure Learning via Difference-of-Submodular Optimization

Inexact Column Generation for Bayesian Network Structure Learning via Difference-of-Submodular Optimization arXiv:2505.11089v1 Announce Type: new Abstract: In this paper, we consider a score-based Integer Programming (IP) approach for solving the Bayesian Network Structure Learning (BNSL) problem. State-of-the-art BNSL IP formulations suffer from the exponentially large number of variables and constraints. A standard approach in IP…

May 19, 2025
Supervised Models Can Generalize Also When Trained on Random Label

Supervised Models Can Generalize Also When Trained on Random Label arXiv:2505.11006v1 Announce Type: new Abstract: The success of unsupervised learning raises the question of whether also supervised models can be trained without using the information in the output $y$. In this paper, we demonstrate that this is indeed possible. The key step is to formulate…

May 19, 2025
Nash: Neural Adaptive Shrinkage for Structured High-Dimensional Regression

Nash: Neural Adaptive Shrinkage for Structured High-Dimensional Regression arXiv:2505.11143v1 Announce Type: new Abstract: Sparse linear regression is a fundamental tool in data analysis. However, traditional approaches often fall short when covariates exhibit structure or arise from heterogeneous sources. In biomedical applications, covariates may stem from distinct modalities or be structured according to an underlying graph.…

May 19, 2025
On Measuring Intrinsic Causal Attributions in Deep Neural Networks

On Measuring Intrinsic Causal Attributions in Deep Neural Networks arXiv:2505.09660v1 Announce Type: new Abstract: Quantifying the causal influence of input features within neural networks has become a topic of increasing interest. Existing approaches typically assess direct, indirect, and total causal effects. This work treats NNs as structural causal models (SCMs) and extends our focus to…

May 16, 2025
LatticeVision: Image to Image Networks for Modeling Non-Stationary Spatial Data

LatticeVision: Image to Image Networks for Modeling Non-Stationary Spatial Data arXiv:2505.09803v1 Announce Type: new Abstract: In many scientific and industrial applications, we are given a handful of instances (a ‘small ensemble’) of a spatially distributed quantity (a ‘field’) but would like to acquire many more. For example, a large ensemble of global temperature sensitivity fields…

May 16, 2025
Learning Multi-Attribute Differential Graphs with Non-Convex Penalties

Learning Multi-Attribute Differential Graphs with Non-Convex Penalties arXiv:2505.09748v1 Announce Type: new Abstract: We consider the problem of estimating differences in two multi-attribute Gaussian graphical models (GGMs) which are known to have similar structure, using a penalized D-trace loss function with non-convex penalties. The GGM structure is encoded in its precision (inverse covariance) matrix. Existing methods…

May 16, 2025
A Scalable Gradient-Based Optimization Framework for Sparse Minimum-Variance Portfolio Selection

A Scalable Gradient-Based Optimization Framework for Sparse Minimum-Variance Portfolio Selection arXiv:2505.10099v1 Announce Type: new Abstract: Portfolio optimization involves selecting asset weights to minimize a risk-reward objective, such as the portfolio variance in the classical minimum-variance framework. Sparse portfolio selection extends this by imposing a cardinality constraint: only $k$ assets from a universe of $p$ may…

May 16, 2025
Path Gradients after Flow Matching

Path Gradients after Flow Matching arXiv:2505.10139v1 Announce Type: new Abstract: Boltzmann Generators have emerged as a promising machine learning tool for generating samples from equilibrium distributions of molecular systems using Normalizing Flows and importance weighting. Recently, Flow Matching has helped speed up Continuous Normalizing Flows (CNFs), scale them to more complex molecular systems, and minimize…

May 16, 2025
Lower Bounds on the MMSE of Adversarially Inferring Sensitive Features

Lower Bounds on the MMSE of Adversarially Inferring Sensitive Features arXiv:2505.09004v1 Announce Type: new Abstract: We propose an adversarial evaluation framework for sensitive feature inference based on minimum mean-squared error (MMSE) estimation with a finite sample size and linear predictive models. Our approach establishes theoretical lower bounds on the true MMSE of inferring sensitive features…

May 15, 2025
Online Learning of Neural Networks

Online Learning of Neural Networks arXiv:2505.09167v1 Announce Type: new Abstract: We study online learning of feedforward neural networks with the sign activation function that implement functions from the unit ball in $\mathbb{R}^d$ to a finite label set $\{1, \ldots, Y\}$. First, we characterize a margin condition that is sufficient and in some cases necessary for…

May 15, 2025
Risk Bounds For Distributional Regression

Risk Bounds For Distributional Regression arXiv:2505.09075v1 Announce Type: new Abstract: This work examines risk bounds for nonparametric distributional regression estimators. For convex-constrained distributional regression, general upper bounds are established for the continuous ranked probability score (CRPS) and the worst-case mean squared error (MSE) across the domain. These theoretical results are applied to isotonic and trend…

May 15, 2025
Optimal Transport-Based Domain Adaptation for Rotated Linear Regression

Optimal Transport-Based Domain Adaptation for Rotated Linear Regression arXiv:2505.09229v1 Announce Type: new Abstract: Optimal Transport (OT) has proven effective for domain adaptation (DA) by aligning distributions across domains with differing statistical properties. Building on the approach of Courty et al. (2016), who mapped source data to the target domain for improved model transfer, we focus…

May 15, 2025
Fairness-aware Bayes optimal functional classification

Fairness-aware Bayes optimal functional classification arXiv:2505.09471v1 Announce Type: new Abstract: Algorithmic fairness has become a central topic in machine learning, and mitigating disparities across different subpopulations has emerged as a rapidly growing research area. In this paper, we systematically study the classification of functional data under fairness constraints, ensuring the disparity level of the classifier…

May 15, 2025
Wasserstein Distributionally Robust Nonparametric Regression

Wasserstein Distributionally Robust Nonparametric Regression arXiv:2505.07967v1 Announce Type: new Abstract: Distributionally robust optimization has become a powerful tool for prediction and decision-making under model uncertainty. By focusing on the local worst-case risk, it enhances robustness by identifying the most unfavorable distribution within a predefined ambiguity set. While extensive research has been conducted in parametric settings,…

May 14, 2025
Diffusion-based supervised learning of generative models for efficient sampling of multimodal distributions

Diffusion-based supervised learning of generative models for efficient sampling of multimodal distributions arXiv:2505.07825v1 Announce Type: new Abstract: We propose a hybrid generative model for efficient sampling of high-dimensional, multimodal probability distributions for Bayesian inference. Traditional Monte Carlo methods, such as the Metropolis-Hastings and Langevin Monte Carlo sampling methods, are effective for sampling from single-mode distributions…

May 14, 2025
Sharp Gaussian approximations for Decentralized Federated Learning

Sharp Gaussian approximations for Decentralized Federated Learning arXiv:2505.08125v1 Announce Type: new Abstract: Federated Learning has gained traction in privacy-sensitive collaborative environments, with local SGD emerging as a key optimization method in decentralized settings. While its convergence properties are well-studied, asymptotic statistical guarantees beyond convergence remain limited. In this paper, we present two generalized Gaussian approximation…

May 14, 2025
SIM-Shapley: A Stable and Computationally Efficient Approach to Shapley Value Approximation

SIM-Shapley: A Stable and Computationally Efficient Approach to Shapley Value Approximation arXiv:2505.08198v1 Announce Type: new Abstract: Explainable artificial intelligence (XAI) is essential for trustworthy machine learning (ML), particularly in high-stakes domains such as healthcare and finance. Shapley value (SV) methods provide a principled framework for feature attribution in complex models but incur high computational costs,…

May 14, 2025
Lie Group Symmetry Discovery and Enforcement Using Vector Fields

Lie Group Symmetry Discovery and Enforcement Using Vector Fields arXiv:2505.08219v1 Announce Type: new Abstract: Symmetry-informed machine learning can exhibit advantages over machine learning which fails to account for symmetry. Additionally, recent attention has been given to continuous symmetry discovery using vector fields which serve as infinitesimal generators for Lie group symmetries. In this paper, we…

May 14, 2025
Fair Representation Learning for Continuous Sensitive Attributes using Expectation of Integral Probability Metrics

Fair Representation Learning for Continuous Sensitive Attributes using Expectation of Integral Probability Metrics arXiv:2505.06435v1 Announce Type: new Abstract: AI fairness, also known as algorithmic fairness, aims to ensure that algorithms operate without bias or discrimination towards any individual or group. Among various AI algorithms, the Fair Representation Learning (FRL) approach has gained significant interest in…

May 13, 2025
High-Dimensional Importance-Weighted Information Criteria: Theory and Optimality

High-Dimensional Importance-Weighted Information Criteria: Theory and Optimality arXiv:2505.06531v1 Announce Type: new Abstract: Imori and Ing (2025) proposed the importance-weighted orthogonal greedy algorithm (IWOGA) for model selection in high-dimensional misspecified regression models under covariate shift. To determine the number of IWOGA iterations, they introduced the high-dimensional importance-weighted information criterion (HDIWIC). They argued that the combined use…

May 13, 2025
Optimal Transport for Machine Learners

Optimal Transport for Machine Learners arXiv:2505.06589v1 Announce Type: new Abstract: Optimal Transport is a foundational mathematical theory that connects optimization, partial differential equations, and probability. It offers a powerful framework for comparing probability distributions and has recently become an important tool in machine learning, especially for designing and evaluating generative models. These course notes cover…

May 13, 2025
Learning Guarantee of Reward Modeling Using Deep Neural Networks

Learning Guarantee of Reward Modeling Using Deep Neural Networks arXiv:2505.06601v1 Announce Type: new Abstract: In this work, we study the learning theory of reward modeling with pairwise comparison data using deep neural networks. We establish a novel non-asymptotic regret bound for deep reward estimators in a non-parametric setting, which depends explicitly on the network architecture.…

May 13, 2025
Feature Representation Transferring to Lightweight Models via Perception Coherence

Feature Representation Transferring to Lightweight Models via Perception Coherence arXiv:2505.06595v1 Announce Type: new Abstract: In this paper, we propose a method for transferring feature representation to lightweight student models from larger teacher models. We mathematically define a new notion called perception coherence. Based on this notion, we propose a loss function, which takes into account…

May 13, 2025
Optimal Regret of Bernoulli Bandits under Global Differential Privacy

Optimal Regret of Bernoulli Bandits under Global Differential Privacy arXiv:2505.05613v1 Announce Type: new Abstract: As sequential learning algorithms are increasingly applied to real life, ensuring data privacy while maintaining their utilities emerges as a timely question. In this context, regret minimisation in stochastic bandits under $\epsilon$-global Differential Privacy (DP) has been widely studied. Unlike bandits…

May 12, 2025
An Efficient Transport-Based Dissimilarity Measure for Time Series Classification under Warping Distortions

An Efficient Transport-Based Dissimilarity Measure for Time Series Classification under Warping Distortions arXiv:2505.05676v1 Announce Type: cross Abstract: Time Series Classification (TSC) is an important problem with numerous applications in science and technology. Dissimilarity-based approaches, such as Dynamic Time Warping (DTW), are classical methods for distinguishing time series when time deformations are confounding information. In this…

May 12, 2025
DaringFed: A Dynamic Bayesian Persuasion Pricing for Online Federated Learning under Two-sided Incomplete Information

DaringFed: A Dynamic Bayesian Persuasion Pricing for Online Federated Learning under Two-sided Incomplete Information arXiv:2505.05842v1 Announce Type: cross Abstract: Online Federated Learning (OFL) is a real-time learning paradigm that sequentially executes parameter aggregation immediately for each random arriving client. To motivate clients to participate in OFL, it is crucial to offer appropriate incentives to offset…

May 12, 2025
Safe-EF: Error Feedback for Nonsmooth Constrained Optimization

Safe-EF: Error Feedback for Nonsmooth Constrained Optimization arXiv:2505.06053v1 Announce Type: cross Abstract: Federated learning faces severe communication bottlenecks due to the high dimensionality of model updates. Communication compression with contractive compressors (e.g., Top-K) is often preferable in practice but can degrade performance without proper handling. Error feedback (EF) mitigates such issues but has been largely…

May 12, 2025
Mixed-Integer Optimization for Responsible Machine Learning

Mixed-Integer Optimization for Responsible Machine Learning arXiv:2505.05857v1 Announce Type: cross Abstract: In the last few decades, Machine Learning (ML) has achieved significant success across domains ranging from healthcare, sustainability, and the social sciences, to criminal justice and finance. But its deployment in increasingly sophisticated, critical, and sensitive areas affecting individuals, the groups they belong to,…

May 12, 2025
Generalization Analysis for Contrastive Representation Learning under Non-IID Settings

Generalization Analysis for Contrastive Representation Learning under Non-IID Settings arXiv:2505.04937v1 Announce Type: new Abstract: Contrastive Representation Learning (CRL) has achieved impressive success in various domains in recent years. Nevertheless, the theoretical understanding of the generalization behavior of CRL is limited. Moreover, to the best of our knowledge, the current literature only analyzes generalization bounds under…

May 9, 2025
Learning Linearized Models from Nonlinear Systems under Initialization Constraints with Finite Data

Learning Linearized Models from Nonlinear Systems under Initialization Constraints with Finite Data arXiv:2505.04954v1 Announce Type: new Abstract: The identification of a linear system model from data has wide applications in control theory. The existing work that provides finite sample guarantees for linear system identification typically uses data from a single long system trajectory under i.i.d.…

May 9, 2025
Conformal Prediction with Cellwise Outliers: A Detect-then-Impute Approach

Conformal Prediction with Cellwise Outliers: A Detect-then-Impute Approach arXiv:2505.04986v1 Announce Type: new Abstract: Conformal prediction is a powerful tool for constructing prediction intervals for black-box models, providing a finite sample coverage guarantee for exchangeable data. However, this exchangeability is compromised when some entries of the test feature are contaminated, such as in the case of…

May 9, 2025
A Two-Sample Test of Text Generation Similarity

A Two-Sample Test of Text Generation Similarity arXiv:2505.05269v1 Announce Type: new Abstract: The surge in digitized text data requires reliable inferential methods on observed textual patterns. This article proposes a novel two-sample text test for comparing similarity between two groups of documents. The hypothesis is whether the probabilistic mapping generating the textual data is identical…

May 9, 2025
Boosting Statistic Learning with Synthetic Data from Pretrained Large Models

Boosting Statistic Learning with Synthetic Data from Pretrained Large Models arXiv:2505.04992v1 Announce Type: new Abstract: The rapid advancement of generative models, such as Stable Diffusion, raises a key question: how can synthetic data from these models enhance predictive modeling? While they can generate vast amounts of datasets, only a subset meaningfully improves performance. We propose…

May 9, 2025
Categorical and geometric methods in statistical, manifold, and machine learning

Categorical and geometric methods in statistical, manifold, and machine learning arXiv:2505.03862v1 Announce Type: new Abstract: We present and discuss applications of the category of probabilistic morphisms, initially developed in \cite{Le2023}, as well as some geometric methods to several classes of problems in statistical, machine and manifold learning which shall be, along with many other topics,…

May 8, 2025
Cer-Eval: Certifiable and Cost-Efficient Evaluation Framework for LLMs

Cer-Eval: Certifiable and Cost-Efficient Evaluation Framework for LLMs arXiv:2505.03814v1 Announce Type: new Abstract: As foundation models continue to scale, the size of trained models grows exponentially, presenting significant challenges for their evaluation. Current evaluation practices involve curating increasingly large datasets to assess the performance of large language models (LLMs). However, there is a lack of…

May 8, 2025
Variational Formulation of the Particle Flow Particle Filter

Variational Formulation of the Particle Flow Particle Filter arXiv:2505.04007v1 Announce Type: new Abstract: This paper provides a formulation of the particle flow particle filter from the perspective of variational inference. We show that the transient density used to derive the particle flow particle filter follows a time-scaled trajectory of the Fisher-Rao gradient flow in the…

May 8, 2025
A Tutorial on Discriminative Clustering and Mutual Information

A Tutorial on Discriminative Clustering and Mutual Information arXiv:2505.04484v1 Announce Type: new Abstract: To cluster data is to separate samples into distinctive groups that should ideally have some cohesive properties. Today, numerous clustering algorithms exist, and their differences lie essentially in what can be perceived as “cohesive properties”. Therefore, hypotheses on the nature of clusters…

May 8, 2025
From Two Sample Testing to Singular Gaussian Discrimination

From Two Sample Testing to Singular Gaussian Discrimination arXiv:2505.04613v1 Announce Type: new Abstract: We establish that testing for the equality of two probability measures on a general separable and compact metric space is equivalent to testing for the singularity between two corresponding Gaussian measures on a suitable Reproducing Kernel Hilbert Space. The corresponding Gaussians are…

May 8, 2025
GeoERM: Geometry-Aware Multi-Task Representation Learning on Riemannian Manifolds

GeoERM: Geometry-Aware Multi-Task Representation Learning on Riemannian Manifolds arXiv:2505.02972v1 Announce Type: new Abstract: Multi-Task Learning (MTL) seeks to boost statistical power and learning efficiency by discovering structure shared across related tasks. State-of-the-art MTL representation methods, however, usually treat the latent representation matrix as a point in ordinary Euclidean space, ignoring its often non-Euclidean geometry, thus…

May 7, 2025
Modeling Spatial Extremes using Non-Gaussian Spatial Autoregressive Models via Convolutional Neural Networks

Modeling Spatial Extremes using Non-Gaussian Spatial Autoregressive Models via Convolutional Neural Networks arXiv:2505.03034v1 Announce Type: new Abstract: Data derived from remote sensing or numerical simulations often have a regular gridded structure and are large in volume, making it challenging to find accurate spatial models that can fill in missing grid cells or simulate the process…

May 7, 2025
A Symbolic and Statistical Learning Framework to Discover Bioprocessing Regulatory Mechanism: Cell Culture Example

A Symbolic and Statistical Learning Framework to Discover Bioprocessing Regulatory Mechanism: Cell Culture Example arXiv:2505.03177v1 Announce Type: new Abstract: Bioprocess mechanistic modeling is essential for advancing intelligent digital twin representation of biomanufacturing, yet challenges persist due to complex intracellular regulation, stochastic system behavior, and limited experimental data. This paper introduces a symbolic and statistical learning…

May 7, 2025
Weighted Average Gradients for Feature Attribution

Weighted Average Gradients for Feature Attribution arXiv:2505.03201v1 Announce Type: new Abstract: In explainable AI, Integrated Gradients (IG) is a widely adopted technique for assessing the significance of feature attributes of the input on model outputs by evaluating contributions from a baseline input to the current input. The choice of the baseline input significantly influences the…

May 7, 2025
Lower Bounds for Greedy Teaching Set Constructions

Lower Bounds for Greedy Teaching Set Constructions arXiv:2505.03223v1 Announce Type: new Abstract: A fundamental open problem in learning theory is to characterize the best-case teaching dimension $\operatorname{TS}_{\min}$ of a concept class $\mathcal{C}$ with finite VC dimension $d$. Resolving this problem will, in particular, settle the conjectured upper bound on Recursive Teaching Dimension posed by [Simon…

May 7, 2025
TV-SurvCaus: Dynamic Representation Balancing for Causal Survival Analysis

TV-SurvCaus: Dynamic Representation Balancing for Causal Survival Analysis arXiv:2505.01785v1 Announce Type: new Abstract: Estimating the causal effect of time-varying treatments on survival outcomes is a challenging task in many domains, particularly in medicine where treatment protocols adapt over time. While recent advances in representation learning have improved causal inference for static treatments, extending these methods…

May 6, 2025
Fast Likelihood-Free Parameter Estimation for Lévy Processes

Fast Likelihood-Free Parameter Estimation for Lévy Processes arXiv:2505.01639v1 Announce Type: new Abstract: Lévy processes are widely used in financial modeling due to their ability to capture discontinuities and heavy tails, which are common in high-frequency asset return data. However, parameter estimation remains a challenge when associated likelihoods are unavailable or costly to compute. We propose…

May 6, 2025
Bayesian learning of the optimal action-value function in a Markov decision process

Bayesian learning of the optimal action-value function in a Markov decision process arXiv:2505.01859v1 Announce Type: new Abstract: The Markov Decision Process (MDP) is a popular framework for sequential decision-making problems, and uncertainty quantification is an essential component of it to learn optimal decision-making strategies. In particular, a Bayesian framework is used to maintain beliefs about…

May 6, 2025
Extended Fiducial Inference for Individual Treatment Effects via Deep Neural Networks

Extended Fiducial Inference for Individual Treatment Effects via Deep Neural Networks arXiv:2505.01995v1 Announce Type: new Abstract: Individual treatment effect estimation has gained significant attention in recent data science literature. This work introduces the Double Neural Network (Double-NN) method to address this problem within the framework of extended fiducial inference (EFI). In the proposed method, deep…

May 6, 2025
Learning the Simplest Neural ODE

Learning the Simplest Neural ODE arXiv:2505.02019v1 Announce Type: new Abstract: Since the advent of the “Neural Ordinary Differential Equation (Neural ODE)” paper, learning ODEs with deep learning has been applied to system identification, time-series forecasting, and related areas. Exploiting the diffeomorphic nature of ODE solution maps, neural ODEs have also enabled their use in generative…

May 6, 2025
On the emergence of numerical instabilities in Next Generation Reservoir Computing

On the emergence of numerical instabilities in Next Generation Reservoir Computing arXiv:2505.00846v1 Announce Type: new Abstract: Next Generation Reservoir Computing (NGRC) is a low-cost machine learning method for forecasting chaotic time series from data. However, ensuring the dynamical stability of NGRC models during autonomous prediction remains a challenge. In this work, we uncover a key…

May 5, 2025
DOLCE: Decomposing Off-Policy Evaluation/Learning into Lagged and Current Effects

DOLCE: Decomposing Off-Policy Evaluation/Learning into Lagged and Current Effects arXiv:2505.00961v1 Announce Type: new Abstract: Off-policy evaluation (OPE) and off-policy learning (OPL) for contextual bandit policies leverage historical data to evaluate and optimize a target policy. Most existing OPE/OPL methods, based on importance weighting or imputation, assume common support between the target and logging policies. When this assumption…

May 5, 2025
Gaussian Differential Private Bootstrap by Subsampling

Gaussian Differential Private Bootstrap by Subsampling arXiv:2505.01197v1 Announce Type: new Abstract: Bootstrap is a common tool for quantifying uncertainty in data analysis. However, besides additional computational costs in the application of the bootstrap on massive data, a challenging problem in bootstrap based inference under Differential Privacy consists in the fact that it requires repeated access…

May 5, 2025
Characterization and Learning of Causal Graphs from Hard Interventions

Characterization and Learning of Causal Graphs from Hard Interventions arXiv:2505.01037v1 Announce Type: new Abstract: A fundamental challenge in the empirical sciences involves uncovering causal structure through observation and experimentation. Causal discovery entails linking the conditional independence (CI) invariances in observational data to their corresponding graphical constraints via d-separation. In this paper, we consider a general…

May 5, 2025
Provable Efficiency of Guidance in Diffusion Models for General Data Distribution

Provable Efficiency of Guidance in Diffusion Models for General Data Distribution arXiv:2505.01382v1 Announce Type: new Abstract: Diffusion models have emerged as a powerful framework for generative modeling, with guidance techniques playing a crucial role in enhancing sample quality. Despite their empirical success, a comprehensive theoretical understanding of the guidance effect remains limited. Existing studies only…

May 5, 2025
Inference for max-linear Bayesian networks with noise

Inference for max-linear Bayesian networks with noise arXiv:2505.00229v1 Announce Type: new Abstract: Max-Linear Bayesian Networks (MLBNs) provide a powerful framework for causal inference in extreme-value settings; we consider MLBNs with noise parameters with a given topology in terms of the max-plus algebra by taking its logarithm. Then, we show that an estimator of a parameter…

May 2, 2025
On the expressivity of deep Heaviside networks

On the expressivity of deep Heaviside networks arXiv:2505.00110v1 Announce Type: new Abstract: We show that deep Heaviside networks (DHNs) have limited expressiveness but that this can be overcome by including either skip connections or neurons with linear activation. We provide lower and upper bounds for the Vapnik-Chervonenkis (VC) dimensions and approximation rates of these network…

May 2, 2025
Reinforcement Learning with Continuous Actions Under Unmeasured Confounding

Reinforcement Learning with Continuous Actions Under Unmeasured Confounding arXiv:2505.00304v1 Announce Type: new Abstract: This paper addresses the challenge of offline policy learning in reinforcement learning with continuous action spaces when unmeasured confounders are present. While most existing research focuses on policy evaluation within partially observable Markov decision processes (POMDPs) and assumes discrete action spaces, we…

May 2, 2025
Statistical Learning for Heterogeneous Treatment Effects: Pretraining, Prognosis, and Prediction

Statistical Learning for Heterogeneous Treatment Effects: Pretraining, Prognosis, and Prediction arXiv:2505.00310v1 Announce Type: new Abstract: Robust estimation of heterogeneous treatment effects is a fundamental challenge for optimal decision-making in domains ranging from personalized medicine to educational policy. In recent years, predictive machine learning has emerged as a valuable toolbox for causal estimation, enabling more flexible…

May 2, 2025
Hypothesis-free discovery from epidemiological data by automatic detection and local inference for tree-based nonlinearities and interactions

Hypothesis-free discovery from epidemiological data by automatic detection and local inference for tree-based nonlinearities and interactions arXiv:2505.00571v1 Announce Type: new Abstract: In epidemiological settings, Machine Learning (ML) is gaining popularity for hypothesis-free discovery of risk (or protective) factors. Although ML is strong at discovering non-linearities and interactions, this power is currently compromised by a lack…

May 2, 2025
Kernel Density Machines

Kernel Density Machines arXiv:2504.21419v1 Announce Type: new Abstract: We introduce kernel density machines (KDM), a novel density ratio estimator in a reproducing kernel Hilbert space setting. KDM applies to general probability measures on countably generated measurable spaces without restrictive assumptions on continuity, or the existence of a Lebesgue density. For computational efficiency, we incorporate a…

May 1, 2025
Generate-then-Verify: Reconstructing Data from Limited Published Statistics

Generate-then-Verify: Reconstructing Data from Limited Published Statistics arXiv:2504.21199v1 Announce Type: new Abstract: We study the problem of reconstructing tabular data from aggregate statistics, in which the attacker aims to identify interesting claims about the sensitive data that can be verified with 100% certainty given the aggregates. Successful attempts in prior work have conducted studies in…

May 1, 2025
Wasserstein-Aitchison GAN for angular measures of multivariate extremes

Wasserstein-Aitchison GAN for angular measures of multivariate extremes arXiv:2504.21438v1 Announce Type: new Abstract: Economically responsible mitigation of multivariate extreme risks — extreme rainfall in a large area, huge variations of many stock prices, widespread breakdowns in transportation systems — requires estimates of the probabilities that such risks will materialize in the future. This paper develops…

May 1, 2025
A comparison of generative deep learning methods for multivariate angular simulation

A comparison of generative deep learning methods for multivariate angular simulation arXiv:2504.21505v1 Announce Type: new Abstract: With the recent development of new geometric and angular-radial frameworks for multivariate extremes, reliably simulating from angular variables in moderate-to-high dimensions is of increasing importance. Empirical approaches have the benefit of simplicity, and work reasonably well in low dimensions,…

May 1, 2025
Balancing Interpretability and Flexibility in Modeling Diagnostic Trajectories with an Embedded Neural Hawkes Process Model

Balancing Interpretability and Flexibility in Modeling Diagnostic Trajectories with an Embedded Neural Hawkes Process Model arXiv:2504.21795v1 Announce Type: new Abstract: The Hawkes process (HP) is commonly used to model event sequences with self-reinforcing dynamics, including electronic health records (EHRs). Traditional HPs capture self-reinforcement via parametric impact functions that can be inspected to understand how each…

May 1, 2025
Coreset selection for the Sinkhorn divergence and generic smooth divergences

Coreset selection for the Sinkhorn divergence and generic smooth divergences arXiv:2504.20194v1 Announce Type: new Abstract: We introduce CO2, an efficient algorithm to produce convexly-weighted coresets with respect to generic smooth divergences. By employing a functional Taylor expansion, we show a local equivalence between sufficiently regular losses and their second order approximations, reducing the coreset selection…

April 30, 2025
Learning and Generalization with Mixture Data

Learning and Generalization with Mixture Data arXiv:2504.20651v1 Announce Type: new Abstract: In many, if not most, machine learning applications the training data is naturally heterogeneous (e.g. federated learning, adversarial attacks and domain adaptation in neural net training). Data heterogeneity is identified as one of the major challenges in modern day large-scale learning. A classical way…

April 30, 2025
Sobolev norm inconsistency of kernel interpolation

Sobolev norm inconsistency of kernel interpolation arXiv:2504.20617v1 Announce Type: new Abstract: We study the consistency of minimum-norm interpolation in reproducing kernel Hilbert spaces corresponding to bounded kernels. Our main result gives lower bounds for the generalization error of the kernel interpolation measured in a continuous scale of norms that interpolate between $L^2$ and the hypothesis…

April 30, 2025
Preference-centric Bandits: Optimality of Mixtures and Regret-efficient Algorithms

Preference-centric Bandits: Optimality of Mixtures and Regret-efficient Algorithms arXiv:2504.20877v1 Announce Type: new Abstract: The objective of canonical multi-armed bandits is to identify and repeatedly select an arm with the largest reward, often in the form of the expected value of the arm’s probability distribution. Such a utilitarian perspective and focus on the probability models’ first…

April 30, 2025
Decoding Latent Spaces: Assessing the Interpretability of Time Series Foundation Models for Visual Analytics

Decoding Latent Spaces: Assessing the Interpretability of Time Series Foundation Models for Visual Analytics arXiv:2504.20099v1 Announce Type: cross Abstract: The present study explores the interpretability of latent spaces produced by time series foundation models, focusing on their potential for visual analysis tasks. Specifically, we evaluate the MOMENT family of models, a set of transformer-based, pre-trained…

April 30, 2025
Statistical Inference for Clustering-based Anomaly Detection

Statistical Inference for Clustering-based Anomaly Detection arXiv:2504.18633v1 Announce Type: new Abstract: Unsupervised anomaly detection (AD) is a fundamental problem in machine learning and statistics. A popular approach to unsupervised AD is clustering-based detection. However, this method lacks the ability to guarantee the reliability of the detected anomalies. In this paper, we propose SI-CLAD (Statistical Inference…

April 29, 2025
Local Polynomial Lp-norm Regression

Local Polynomial Lp-norm Regression arXiv:2504.18695v1 Announce Type: new Abstract: The local least squares estimator for a regression curve cannot provide optimal results when non-Gaussian noise is present. Both theoretical and empirical evidence suggests that residuals often exhibit distributional properties different from those of a normal distribution, making it worthwhile to consider estimation based on other…

April 29, 2025
Foundations of Safe Online Reinforcement Learning in the Linear Quadratic Regulator: $\sqrt{T}$-Regret

Foundations of Safe Online Reinforcement Learning in the Linear Quadratic Regulator: $\sqrt{T}$-Regret arXiv:2504.18657v1 Announce Type: new Abstract: Understanding how to efficiently learn while adhering to safety constraints is essential for using online reinforcement learning in practical applications. However, proving rigorous regret bounds for safety-constrained reinforcement learning is difficult due to the complex interaction between safety,…

April 29, 2025
A Dictionary of Closed-Form Kernel Mean Embeddings

A Dictionary of Closed-Form Kernel Mean Embeddings arXiv:2504.18830v1 Announce Type: new Abstract: Kernel mean embeddings — integrals of a kernel with respect to a probability distribution — are essential in Bayesian quadrature, but also widely used in other computational tools for numerical integration or for statistical inference based on the maximum mean discrepancy. These methods…

April 29, 2025
ReLU integral probability metric and its applications

ReLU integral probability metric and its applications arXiv:2504.18897v1 Announce Type: new Abstract: We propose a parametric integral probability metric (IPM) to measure the discrepancy between two probability measures. The proposed IPM leverages a specific parametric family of discriminators, such as single-node neural networks with ReLU activation, to effectively distinguish between distributions, making it applicable in…

April 29, 2025
Learning Operators by Regularized Stochastic Gradient Descent with Operator-valued Kernels

Learning Operators by Regularized Stochastic Gradient Descent with Operator-valued Kernels arXiv:2504.18184v1 Announce Type: new Abstract: This paper investigates regularized stochastic gradient descent (SGD) algorithms for estimating nonlinear operators from a Polish space to a separable Hilbert space. We assume that the regression operator lies in a vector-valued reproducing kernel Hilbert space induced by an operator-valued…

April 28, 2025
Learning Enhanced Ensemble Filters

Learning Enhanced Ensemble Filters arXiv:2504.17836v1 Announce Type: new Abstract: The filtering distribution in hidden Markov models evolves according to the law of a mean-field model in state–observation space. The ensemble Kalman filter (EnKF) approximates this mean-field model with an ensemble of interacting particles, employing a Gaussian ansatz for the joint distribution of the state and…

April 28, 2025
Post-Transfer Learning Statistical Inference in High-Dimensional Regression

Post-Transfer Learning Statistical Inference in High-Dimensional Regression arXiv:2504.18212v1 Announce Type: new Abstract: Transfer learning (TL) for high-dimensional regression (HDR) is an important problem in machine learning, particularly when dealing with limited sample size in the target task. However, there is currently no method to quantify the statistical significance of the relationship between features and the…

April 28, 2025
Generalization Guarantees for Multi-View Representation Learning and Application to Regularization via Gaussian Product Mixture Prior

Generalization Guarantees for Multi-View Representation Learning and Application to Regularization via Gaussian Product Mixture Prior arXiv:2504.18455v1 Announce Type: new Abstract: We study the problem of distributed multi-view representation learning. In this problem, each of $K$ agents observes a distinct, possibly statistically correlated, view and independently extracts from it a suitable representation in a manner that a…

April 28, 2025
Enhancing Visual Interpretability and Explainability in Functional Survival Trees and Forests

Enhancing Visual Interpretability and Explainability in Functional Survival Trees and Forests arXiv:2504.18498v1 Announce Type: new Abstract: Functional survival models are key tools for analyzing time-to-event data with complex predictors, such as functional or high-dimensional inputs. Despite their predictive strength, these models often lack interpretability, which limits their value in practical decision-making and risk analysis. This…

April 28, 2025
Physics-informed features in supervised machine learning

Physics-informed features in supervised machine learning arXiv:2504.17112v1 Announce Type: new Abstract: Supervised machine learning involves approximating an unknown functional relationship from a limited dataset of features and corresponding labels. The classical approach to feature-based machine learning typically relies on applying linear regression to standardized features, without considering their physical meaning. This may limit model explainability,…

April 25, 2025
Causal rule ensemble approach for multi-arm data

Causal rule ensemble approach for multi-arm data arXiv:2504.17166v1 Announce Type: new Abstract: Heterogeneous treatment effect (HTE) estimation is critical in medical research. It provides insights into how treatment effects vary among individuals, which can provide statistical evidence for precision medicine. While most existing methods focus on binary treatment situations, real-world applications often involve multiple interventions.…

April 25, 2025
Likelihood-Free Variational Autoencoders

Likelihood-Free Variational Autoencoders arXiv:2504.17622v1 Announce Type: new Abstract: Variational Autoencoders (VAEs) typically rely on a probabilistic decoder with a predefined likelihood, most commonly an isotropic Gaussian, to model the data conditional on latent variables. While convenient for optimization, this choice often leads to likelihood misspecification, resulting in blurry reconstructions and poor data fidelity, especially for…

April 25, 2025
Evaluating Uncertainty in Deep Gaussian Processes

Evaluating Uncertainty in Deep Gaussian Processes arXiv:2504.17719v1 Announce Type: new Abstract: Reliable uncertainty estimates are crucial in modern machine learning. Deep Gaussian Processes (DGPs) and Deep Sigma Point Processes (DSPPs) extend GPs hierarchically, offering promising methods for uncertainty quantification grounded in Bayesian principles. However, their empirical calibration and robustness under distribution shift relative to baselines…

April 25, 2025
(Im)possibility of Automated Hallucination Detection in Large Language Models

(Im)possibility of Automated Hallucination Detection in Large Language Models arXiv:2504.17004v1 Announce Type: cross Abstract: Is automated hallucination detection possible? In this work, we introduce a theoretical framework to analyze the feasibility of automatically detecting hallucinations produced by large language models (LLMs). Inspired by the classical Gold-Angluin framework for language identification and its recent adaptation to…

April 25, 2025
Covariate-dependent Graphical Model Estimation via Neural Networks with Statistical Guarantees

Covariate-dependent Graphical Model Estimation via Neural Networks with Statistical Guarantees arXiv:2504.16356v1 Announce Type: new Abstract: Graphical models are widely used in diverse application domains to model the conditional dependencies amongst a collection of random variables. In this paper, we consider settings where the graph structure is covariate-dependent, and investigate a deep neural network-based approach to…

April 24, 2025
Behavior of prediction performance metrics with rare events

Behavior of prediction performance metrics with rare events arXiv:2504.16185v1 Announce Type: new Abstract: Area under the receiver operating characteristic curve (AUC) is commonly reported alongside binary prediction models. However, there are concerns that AUC might be a misleading measure of prediction performance in the rare event setting. This setting is common since many events of…

April 24, 2025
Towards Accurate Forecasting of Renewable Energy: Building Datasets and Benchmarking Machine Learning Models for Solar and Wind Power in France

Towards Accurate Forecasting of Renewable Energy: Building Datasets and Benchmarking Machine Learning Models for Solar and Wind Power in France arXiv:2504.16100v1 Announce Type: cross Abstract: Accurate prediction of non-dispatchable renewable energy sources is essential for grid stability and price prediction. Regional power supply forecasts are usually indirect through a bottom-up approach of plant-level forecasts,…

April 24, 2025
Physics-Informed Inference Time Scaling via Simulation-Calibrated Scientific Machine Learning

Physics-Informed Inference Time Scaling via Simulation-Calibrated Scientific Machine Learning arXiv:2504.16172v1 Announce Type: cross Abstract: High-dimensional partial differential equations (PDEs) pose significant computational challenges across fields ranging from quantum chemistry to economics and finance. Although scientific machine learning (SciML) techniques offer approximate solutions, they often suffer from bias and neglect crucial physical insights. Inspired by inference-time…

April 24, 2025
Probabilistic Emulation of the Community Radiative Transfer Model Using Machine Learning

Probabilistic Emulation of the Community Radiative Transfer Model Using Machine Learning arXiv:2504.16192v1 Announce Type: cross Abstract: The continuous improvement in weather forecast skill over the past several decades is largely due to the increasing quantity of available satellite observations and their assimilation into operational forecast systems. Assimilating these observations requires observation operators in the form…

April 24, 2025
Transfer Learning for High-dimensional Reduced Rank Time Series Models

Transfer Learning for High-dimensional Reduced Rank Time Series Models arXiv:2504.15691v1 Announce Type: new Abstract: The objective of transfer learning is to enhance estimation and inference in a target dataset by leveraging knowledge gained from additional sources. Recent studies have explored transfer learning for independent observations in complex, high-dimensional models assuming sparsity, yet research on time…

April 23, 2025
From predictions to confidence intervals: an empirical study of conformal prediction methods for in-context learning

From predictions to confidence intervals: an empirical study of conformal prediction methods for in-context learning arXiv:2504.15722v1 Announce Type: new Abstract: Transformers have become a standard architecture in machine learning, demonstrating strong in-context learning (ICL) abilities that allow them to learn from the prompt at inference time. However, uncertainty quantification for ICL remains an open challenge,…

April 23, 2025
How Private is Your Attention? Bridging Privacy with In-Context Learning

How Private is Your Attention? Bridging Privacy with In-Context Learning arXiv:2504.16000v1 Announce Type: new Abstract: In-context learning (ICL), the ability of transformer-based models to perform new tasks from examples provided at inference time, has emerged as a hallmark of modern language models. While recent works have investigated the mechanisms underlying ICL, its feasibility under formal privacy constraints…

April 23, 2025
Explainable Unsupervised Anomaly Detection with Random Forest

Explainable Unsupervised Anomaly Detection with Random Forest arXiv:2504.16075v1 Announce Type: new Abstract: We describe the use of an unsupervised Random Forest for similarity learning and improved unsupervised anomaly detection. By training a Random Forest to discriminate between real data and synthetic data sampled from a uniform distribution over the real data bounds, a distance measure…

April 23, 2025
Significativity Indices for Agreement Values

Significativity Indices for Agreement Values arXiv:2504.15325v1 Announce Type: cross Abstract: Agreement measures, such as Cohen’s kappa or intraclass correlation, gauge the matching between two or more classifiers. They are used in a wide range of contexts from medicine, where they evaluate the effectiveness of medical treatments and clinical trials, to artificial intelligence, where they can…

April 23, 2025
Learning over von Mises-Fisher Distributions via a Wasserstein-like Geometry

Learning over von Mises-Fisher Distributions via a Wasserstein-like Geometry arXiv:2504.14164v1 Announce Type: new Abstract: We introduce a novel, geometry-aware distance metric for the family of von Mises-Fisher (vMF) distributions, which are fundamental models for directional data on the unit hypersphere. Although the vMF distribution is widely employed in a variety of probabilistic learning tasks involving…

April 22, 2025
Optimal Scheduling of Dynamic Transport

Optimal Scheduling of Dynamic Transport arXiv:2504.14425v1 Announce Type: new Abstract: Flow-based methods for sampling and generative modeling use continuous-time dynamical systems to represent a transport map that pushes forward a source measure to a target measure. The introduction of a time axis provides considerable design freedom, and a central question is how to exploit this…

April 22, 2025
Expected Free Energy-based Planning as Variational Inference

Expected Free Energy-based Planning as Variational Inference arXiv:2504.14898v1 Announce Type: new Abstract: We address the problem of planning under uncertainty, where an agent must choose actions that not only achieve desired outcomes but also reduce uncertainty. Traditional methods often treat exploration and exploitation as separate objectives, lacking a unified inferential foundation. Active inference, grounded in…

April 22, 2025
On the Tunability of Random Survival Forests Model for Predictive Maintenance

On the Tunability of Random Survival Forests Model for Predictive Maintenance arXiv:2504.14744v1 Announce Type: new Abstract: This paper investigates the tunability of the Random Survival Forest (RSF) model in predictive maintenance, where accurate time-to-failure estimation is crucial. Although RSF is widely used due to its flexibility and ability to handle censored data, its performance is…

April 22, 2025
Advanced posterior analyses of hidden Markov models: finite Markov chain imbedding and hybrid decoding

Advanced posterior analyses of hidden Markov models: finite Markov chain imbedding and hybrid decoding arXiv:2504.15156v1 Announce Type: new Abstract: Two major tasks in applications of hidden Markov models are to (i) compute distributions of summary statistics of the hidden state sequence, and (ii) decode the hidden state sequence. We describe finite Markov chain imbedding (FMCI)…

April 22, 2025