Category: math.PR

Bayesian Modeling of Collatz Stopping Times: A Probabilistic Machine Learning Perspective

Bayesian Modeling of Collatz Stopping Times: A Probabilistic Machine Learning Perspective arXiv:2603.04479v1 Announce Type: new Abstract: We study the Collatz total stopping time $\tau(n)$ over $n\le 10^7$ from a probabilistic machine learning viewpoint. Empirically, $\tau(n)$ is a skewed and heavily overdispersed count with pronounced arithmetic heterogeneity. We develop two complementary models. First, a Bayesian hierarchical…

March 6, 2026
The Partition Principle Revisited: Non-Equal Volume Designs Achieve Minimal Expected Star Discrepancy

The Partition Principle Revisited: Non-Equal Volume Designs Achieve Minimal Expected Star Discrepancy arXiv:2603.00202v1 Announce Type: new Abstract: We study the expected star discrepancy under a newly designed class of non-equal volume partitions. The main contributions are twofold. First, we establish a strong partition principle for the star discrepancy, showing that our newly designed non-equal volume…

March 3, 2026
Deep Neural Networks as Iterated Function Systems and a Generalization Bound

Deep Neural Networks as Iterated Function Systems and a Generalization Bound arXiv:2601.19958v1 Announce Type: new Abstract: Deep neural networks (DNNs) achieve remarkable performance on a wide range of tasks, yet their mathematical analysis remains fragmented: stability and generalization are typically studied in disparate frameworks and on a case-by-case basis. Architecturally, DNNs rely on the recursive…

January 29, 2026
Distributional Computational Graphs: Error Bounds

Distributional Computational Graphs: Error Bounds arXiv:2601.16250v1 Announce Type: new Abstract: We study a general framework of distributional computational graphs: computational graphs whose inputs are probability distributions rather than point values. We analyze the discretization error that arises when these graphs are evaluated using finite approximations of continuous probability distributions. Such an approximation might be the…

January 26, 2026
Parametric RDT approach to computational gap of symmetric binary perceptron

Parametric RDT approach to computational gap of symmetric binary perceptron arXiv:2601.10628v1 Announce Type: new Abstract: We study potential presence of statistical-computational gaps (SCG) in symmetric binary perceptrons (SBP) via a parametric utilization of \emph{fully lifted random duality theory} (fl-RDT) [96]. A structural change from decreasingly to arbitrarily ordered $c$-sequence (a key fl-RDT parametric component) is…

January 16, 2026
Tail-Sensitive KL and Rényi Convergence of Unadjusted Hamiltonian Monte Carlo via One-Shot Couplings

Tail-Sensitive KL and Rényi Convergence of Unadjusted Hamiltonian Monte Carlo via One-Shot Couplings arXiv:2601.09019v1 Announce Type: new Abstract: Hamiltonian Monte Carlo (HMC) algorithms are among the most widely used sampling methods in high dimensional settings, yet their convergence properties are poorly understood in divergences that quantify relative density mismatch, such as Kullback-Leibler (KL) and Rényi…

January 15, 2026
SCaLE: Switching Cost aware Learning and Exploration

SCaLE: Switching Cost aware Learning and Exploration arXiv:2601.09042v1 Announce Type: cross Abstract: This work addresses the fundamental problem of unbounded metric movement costs in bandit online convex optimization, by considering high-dimensional dynamic quadratic hitting costs and $\ell_2$-norm switching costs in a noisy bandit feedback model. For a general class of stochastic environments, we provide the…

January 15, 2026
Constrained Density Estimation via Optimal Transport

Constrained Density Estimation via Optimal Transport arXiv:2601.06830v1 Announce Type: new Abstract: A novel framework for density estimation under expectation constraints is proposed. The framework minimizes the Wasserstein distance between the estimated density and a prior, subject to the constraints that the expected value of a set of functions adopts or exceeds given values. The framework…

January 13, 2026
Detecting Stochasticity in Discrete Signals via Nonparametric Excursion Theorem

Detecting Stochasticity in Discrete Signals via Nonparametric Excursion Theorem arXiv:2601.06009v1 Announce Type: new Abstract: We develop a practical framework for distinguishing diffusive stochastic processes from deterministic signals using only a single discrete time series. Our approach is based on classical excursion and crossing theorems for continuous semimartingales, which correlates number $N_\varepsilon$ of excursions of magnitude…

January 12, 2026
Constructive Approximation of Random Process via Stochastic Interpolation Neural Network Operators

Constructive Approximation of Random Process via Stochastic Interpolation Neural Network Operators arXiv:2512.24106v1 Announce Type: new Abstract: In this paper, we construct a class of stochastic interpolation neural network operators (SINNOs) with random coefficients activated by sigmoidal functions. We establish their boundedness, interpolation accuracy, and approximation capabilities in the mean square sense, in probability, as well…

January 1, 2026
Learning from Neighbors with PHIBP: Predicting Infectious Disease Dynamics in Data-Sparse Environments

Learning from Neighbors with PHIBP: Predicting Infectious Disease Dynamics in Data-Sparse Environments arXiv:2512.21005v1 Announce Type: new Abstract: Modeling sparse count data, which arise across numerous scientific fields, presents significant statistical challenges. This chapter addresses these challenges in the context of infectious disease prediction, with a focus on predicting outbreaks in geographic regions that have historically…

December 25, 2025
Sampling from multimodal distributions with warm starts: Non-asymptotic bounds for the Reweighted Annealed Leap-Point Sampler

Sampling from multimodal distributions with warm starts: Non-asymptotic bounds for the Reweighted Annealed Leap-Point Sampler arXiv:2512.17977v1 Announce Type: new Abstract: Sampling from multimodal distributions is a central challenge in Bayesian inference and machine learning. In light of hardness results for sampling — classical MCMC methods, even with tempering, can suffer from exponential mixing times —…

December 23, 2025
On The Hidden Biases of Flow Matching Samplers

On The Hidden Biases of Flow Matching Samplers arXiv:2512.16768v1 Announce Type: new Abstract: We study the implicit bias of flow matching (FM) samplers via the lens of empirical flow matching. Although population FM may produce gradient-field velocities resembling optimal transport (OT), we show that the empirical FM minimizer is almost never a gradient field, even…

December 19, 2025
Error Analysis of Generalized Langevin Equations with Approximated Memory Kernels

Error Analysis of Generalized Langevin Equations with Approximated Memory Kernels arXiv:2512.10256v1 Announce Type: new Abstract: We analyze prediction error in stochastic dynamical systems with memory, focusing on generalized Langevin equations (GLEs) formulated as stochastic Volterra equations. We establish that, under a strongly convex potential, trajectory discrepancies decay at a rate determined by the decay of…

December 12, 2025
Provable Diffusion Posterior Sampling for Bayesian Inversion

Provable Diffusion Posterior Sampling for Bayesian Inversion arXiv:2512.08022v1 Announce Type: new Abstract: This paper proposes a novel diffusion-based posterior sampling method within a plug-and-play (PnP) framework. Our approach constructs a probability transport from an easy-to-sample terminal distribution to the target posterior, using a warm-start strategy to initialize the particles. To approximate the posterior score, we…

December 10, 2025
How to Tame Your LLM: Semantic Collapse in Continuous Systems

How to Tame Your LLM: Semantic Collapse in Continuous Systems arXiv:2512.05162v1 Announce Type: new Abstract: We develop a general theory of semantic dynamics for large language models by formalizing them as Continuous State Machines (CSMs): smooth dynamical systems whose latent manifolds evolve under probabilistic transition operators. The associated transfer operator $P: L^2(M,\mu) \to L^2(M,\mu)$ encodes…

December 8, 2025
Novelty detection on path space

Novelty detection on path space arXiv:2512.03243v1 Announce Type: new Abstract: We frame novelty detection on path space as a hypothesis testing problem with signature-based test statistics. Using transportation-cost inequalities of Gasteratos and Jacquier (2023), we obtain tail bounds for false positive rates that extend beyond Gaussian measures to laws of RDE solutions with smooth bounded…

December 4, 2025
Algorithms and Scientific Software for Quasi-Monte Carlo, Fast Gaussian Process Regression, and Scientific Machine Learning

Algorithms and Scientific Software for Quasi-Monte Carlo, Fast Gaussian Process Regression, and Scientific Machine Learning arXiv:2511.21915v1 Announce Type: new Abstract: Most scientific domains elicit the development of efficient algorithms and accessible scientific software. This thesis unifies our developments in three broad domains: Quasi-Monte Carlo (QMC) methods for efficient high-dimensional integration, Gaussian process (GP) regression for…

December 1, 2025
Precise asymptotic analysis of Sobolev training for random feature models

Precise asymptotic analysis of Sobolev training for random feature models arXiv:2511.03050v1 Announce Type: new Abstract: Gradient information is widely useful and available in applications, and is therefore natural to include in the training of neural networks. Yet little is known theoretically about the impact of Sobolev training — regression with both function and gradient data…

November 6, 2025
Limit Theorems for Stochastic Gradient Descent in High-Dimensional Single-Layer Networks

Limit Theorems for Stochastic Gradient Descent in High-Dimensional Single-Layer Networks arXiv:2511.02258v1 Announce Type: new Abstract: This paper studies the high-dimensional scaling limits of online stochastic gradient descent (SGD) for single-layer networks. Building on the seminal work of Saad and Solla, which analyzed the deterministic (ballistic) scaling limits of SGD corresponding to the gradient flow of…

November 5, 2025
Accuracy estimation of neural networks by extreme value theory

Accuracy estimation of neural networks by extreme value theory arXiv:2511.00490v1 Announce Type: new Abstract: Neural networks are able to approximate any continuous function on a compact set. However, it is not obvious how to quantify the error of the neural network, i.e., the remaining bias between the function and the neural network. Here, we propose…

November 4, 2025
Exponential Convergence Guarantees for Iterative Markovian Fitting

Exponential Convergence Guarantees for Iterative Markovian Fitting arXiv:2510.20871v1 Announce Type: new Abstract: The Schrödinger Bridge (SB) problem has become a fundamental tool in computational optimal transport and generative modeling. To address this problem, ideal methods such as Iterative Proportional Fitting and Iterative Markovian Fitting (IMF) have been proposed, alongside practical approximations like Diffusion Schrödinger Bridge and…

October 27, 2025
Exact Dynamics of Multi-class Stochastic Gradient Descent

Exact Dynamics of Multi-class Stochastic Gradient Descent arXiv:2510.14074v1 Announce Type: new Abstract: We develop a framework for analyzing the training and learning rate dynamics on a variety of high-dimensional optimization problems trained using one-pass stochastic gradient descent (SGD) with data generated from multiple anisotropic classes. We give exact expressions for a large class of…

October 17, 2025
Dimension-Free Minimax Rates for Learning Pairwise Interactions in Attention-Style Models

Dimension-Free Minimax Rates for Learning Pairwise Interactions in Attention-Style Models arXiv:2510.11789v1 Announce Type: new Abstract: We study the convergence rate of learning pairwise interactions in single-layer attention-style models, where tokens interact through a weight matrix and a non-linear activation function. We prove that the minimax rate is $M^{-\frac{2\beta}{2\beta+1}}$ with $M$ being the sample size, depending…

October 15, 2025
Distributionally robust approximation property of neural networks

Distributionally robust approximation property of neural networks arXiv:2510.09177v1 Announce Type: new Abstract: The universal approximation property uniformly with respect to weakly compact families of measures is established for several classes of neural networks. To that end, we prove that these neural networks are dense in Orlicz spaces, thereby extending classical universal approximation theorems even beyond…

October 13, 2025
Gaussian Equivalence for Self-Attention: Asymptotic Spectral Analysis of Attention Matrix

Gaussian Equivalence for Self-Attention: Asymptotic Spectral Analysis of Attention Matrix arXiv:2510.06685v1 Announce Type: new Abstract: Self-attention layers have become fundamental building blocks of modern deep neural networks, yet their theoretical understanding remains limited, particularly from the perspective of random matrix theory. In this work, we provide a rigorous analysis of the singular value spectrum of…

October 9, 2025
Minima and Critical Points of the Bethe Free Energy Are Invariant Under Deformation Retractions of Factor Graphs

Minima and Critical Points of the Bethe Free Energy Are Invariant Under Deformation Retractions of Factor Graphs arXiv:2510.05380v1 Announce Type: new Abstract: In graphical models, factor graphs, and more generally energy-based models, the interactions between variables are encoded by a graph, a hypergraph, or, in the most general case, a partially ordered set (poset). Inference…

October 8, 2025
Concept activation vectors: a unifying view and adversarial attacks

Concept activation vectors: a unifying view and adversarial attacks arXiv:2509.22755v1 Announce Type: new Abstract: Concept Activation Vectors (CAVs) are a tool from explainable AI, offering a promising approach for understanding how human-understandable concepts are encoded in a model’s latent spaces. They are computed from hidden-layer activations of inputs belonging either to a concept class or…

September 30, 2025
Effective continuous equations for adaptive SGD: a stochastic analysis view

Effective continuous equations for adaptive SGD: a stochastic analysis view arXiv:2509.21614v1 Announce Type: new Abstract: We present a theoretical analysis of some popular adaptive Stochastic Gradient Descent (SGD) methods in the small learning rate regime. Using the stochastic modified equations framework introduced by Li et al., we derive effective continuous stochastic dynamics for these methods.…

September 29, 2025
Anchored Langevin Algorithms

Anchored Langevin Algorithms arXiv:2509.19455v1 Announce Type: new Abstract: Standard first-order Langevin algorithms such as the unadjusted Langevin algorithm (ULA) are obtained by discretizing the Langevin diffusion and are widely used for sampling in machine learning because they scale to high dimensions and large datasets. However, they face two key limitations: (i) they require differentiable log-densities,…

September 25, 2025
Phase Transition for Stochastic Block Model with more than $\sqrt{n}$ Communities

Phase Transition for Stochastic Block Model with more than $\sqrt{n}$ Communities arXiv:2509.15822v1 Announce Type: new Abstract: Predictions from statistical physics postulate that recovery of the communities in Stochastic Block Model (SBM) is possible in polynomial time above, and only above, the Kesten-Stigum (KS) threshold. This conjecture has given rise to a rich literature, proving that…

September 22, 2025
A hierarchical entropy method for the delocalization of bias in high-dimensional Langevin Monte Carlo

A hierarchical entropy method for the delocalization of bias in high-dimensional Langevin Monte Carlo arXiv:2509.08619v1 Announce Type: new Abstract: The unadjusted Langevin algorithm is widely used for sampling from complex high-dimensional distributions. It is well known to be biased, with the bias typically scaling linearly with the dimension when measured in squared Wasserstein distance. However,…

September 11, 2025
An invertible generative model for forward and inverse problems

An invertible generative model for forward and inverse problems arXiv:2509.03910v1 Announce Type: new Abstract: We formulate the inverse problem in a Bayesian framework and aim to train a generative model that allows us to simulate (i.e., sample from the likelihood) and do inference (i.e., sample from the posterior). We review the use of triangular normalizing…

September 5, 2025
Scale-Adaptive Generative Flows for Multiscale Scientific Data

Scale-Adaptive Generative Flows for Multiscale Scientific Data arXiv:2509.02971v1 Announce Type: new Abstract: Flow-based generative models can face significant challenges when modeling scientific data with multiscale Fourier spectra, often producing large errors in fine-scale features. We address this problem within the framework of stochastic interpolants, via principled design of noise distributions and interpolation schedules. The key…

September 4, 2025
Assessing One-Dimensional Cluster Stability by Extreme-Point Trimming

Assessing One-Dimensional Cluster Stability by Extreme-Point Trimming arXiv:2509.00258v1 Announce Type: new Abstract: We develop a probabilistic method for assessing the tail behavior and geometric stability of one-dimensional n i.i.d. samples by tracking how their span contracts when the most extreme points are trimmed. Central to our approach is the diameter-shrinkage ratio, that quantifies the relative…

September 3, 2025
Underdamped Langevin MCMC with third order convergence

Underdamped Langevin MCMC with third order convergence arXiv:2508.16485v1 Announce Type: new Abstract: In this paper, we propose a new numerical method for the underdamped Langevin diffusion (ULD) and present a non-asymptotic analysis of its sampling error in the 2-Wasserstein distance when the $d$-dimensional target distribution $p(x)\propto e^{-f(x)}$ is strongly log-concave and has varying degrees of…

August 25, 2025
Optimal Subspace Embeddings: Resolving Nelson-Nguyen Conjecture Up to Sub-Polylogarithmic Factors

Optimal Subspace Embeddings: Resolving Nelson-Nguyen Conjecture Up to Sub-Polylogarithmic Factors arXiv:2508.14234v1 Announce Type: cross Abstract: We give a proof of the conjecture of Nelson and Nguyen [FOCS 2013] on the optimal dimension and sparsity of oblivious subspace embeddings, up to sub-polylogarithmic factors: For any $n\geq d$ and $\epsilon\geq d^{-O(1)}$, there is a random $\tilde O(d/\epsilon^2)\times…

August 21, 2025
Nonparametric learning of stochastic differential equations from sparse and noisy data

Nonparametric learning of stochastic differential equations from sparse and noisy data arXiv:2508.11597v1 Announce Type: new Abstract: The paper proposes a systematic framework for building data-driven stochastic differential equation (SDE) models from sparse, noisy observations. Unlike traditional parametric approaches, which assume a known functional form for the drift, our goal here is to learn the entire…

August 18, 2025
Dimension-Free Bounds for Generalized First-Order Methods via Gaussian Coupling

Dimension-Free Bounds for Generalized First-Order Methods via Gaussian Coupling arXiv:2508.10782v1 Announce Type: new Abstract: We establish non-asymptotic bounds on the finite-sample behavior of generalized first-order iterative algorithms — including gradient-based optimization methods and approximate message passing (AMP) — with Gaussian data matrices and full-memory, non-separable nonlinearities. The central result constructs an explicit coupling between the…

August 15, 2025
On Experiments

On Experiments arXiv:2508.08288v1 Announce Type: new Abstract: The scientific process is a means for turning the results of experiments into knowledge about the world in which we live. Much research effort has been directed toward automating this process. To do this, one needs to formulate the scientific process in a precise mathematical language. This paper…

August 13, 2025
Inequalities for Optimization of Classification Algorithms: A Perspective Motivated by Diagnostic Testing

Inequalities for Optimization of Classification Algorithms: A Perspective Motivated by Diagnostic Testing arXiv:2508.01065v1 Announce Type: new Abstract: Motivated by canonical problems in medical diagnostics, we propose and study properties of an objective function that uniformly bounds uncertainties in quantities of interest extracted from classifiers and related data analysis tools. We begin by adopting a set-theoretic…

August 5, 2025
Regime-Aware Conditional Neural Processes with Multi-Criteria Decision Support for Operational Electricity Price Forecasting

Regime-Aware Conditional Neural Processes with Multi-Criteria Decision Support for Operational Electricity Price Forecasting arXiv:2508.00040v1 Announce Type: cross Abstract: This work integrates Bayesian regime detection with conditional neural processes for 24-hour electricity price prediction in the German market. Our methodology integrates regime detection using a disentangled sticky hierarchical Dirichlet process hidden Markov model (DS-HDP-HMM) applied to…

August 4, 2025
Simulating Posterior Bayesian Neural Networks with Dependent Weights

Simulating Posterior Bayesian Neural Networks with Dependent Weights arXiv:2507.22095v1 Announce Type: new Abstract: In this paper we consider posterior Bayesian fully connected and feedforward deep neural networks with dependent weights. Particularly, if the likelihood is Gaussian, we identify the distribution of the wide width limit and provide an algorithm to sample from the network. In…

July 31, 2025
Central limit theorems for the eigenvalues of graph Laplacians on data clouds

Central limit theorems for the eigenvalues of graph Laplacians on data clouds arXiv:2507.18803v1 Announce Type: new Abstract: Given i.i.d. samples $X_n = \{ x_1, \dots, x_n \}$ from a distribution supported on a low dimensional manifold ${M}$ embedded in Euclidean space, we consider the graph Laplacian operator $\Delta_n$ associated to an $\varepsilon$-proximity graph over $X_n$ and…

July 28, 2025
Finite-Dimensional Gaussian Approximation for Deep Neural Networks: Universality in Random Weights

Finite-Dimensional Gaussian Approximation for Deep Neural Networks: Universality in Random Weights arXiv:2507.12686v1 Announce Type: new Abstract: We study the Finite-Dimensional Distributions (FDDs) of deep neural networks with randomly initialized weights that have finite-order moments. Specifically, we establish Gaussian approximation bounds in the Wasserstein-$1$ norm between the FDDs and their Gaussian limit assuming a Lipschitz activation…

July 18, 2025
Mallows Model with Learned Distance Metrics: Sampling and Maximum Likelihood Estimation

Mallows Model with Learned Distance Metrics: Sampling and Maximum Likelihood Estimation arXiv:2507.08108v1 Announce Type: new Abstract: \textit{Mallows model} is a widely-used probabilistic framework for learning from ranking data, with applications ranging from recommendation systems and voting to aligning language models with human preferences~\cite{chen2024mallows, kleinberg2021algorithmic, rafailov2024direct}. Under this model, observed rankings are noisy perturbations of a…

July 14, 2025
A Malliavin calculus approach to score functions in diffusion generative models

A Malliavin calculus approach to score functions in diffusion generative models arXiv:2507.05550v1 Announce Type: new Abstract: Score-based diffusion generative models have recently emerged as a powerful tool for modelling complex data distributions. These models aim at learning the score function, which defines a map from a known probability distribution to the target data distribution via…

July 9, 2025
Asymptotic convexity of wide and shallow neural networks

Asymptotic convexity of wide and shallow neural networks arXiv:2507.01044v1 Announce Type: new Abstract: For a simple model of shallow and wide neural networks, we show that the epigraph of its input-output map as a function of the network parameters approximates the epigraph of a convex function in a precise sense. This leads to a plausible explanation…

July 3, 2025
Strategic A/B testing via Maximum Probability-driven Two-armed Bandit

Strategic A/B testing via Maximum Probability-driven Two-armed Bandit arXiv:2506.22536v1 Announce Type: new Abstract: Detecting a minor average treatment effect is a major challenge in large-scale applications, where even minimal improvements can have a significant economic impact. Traditional methods, reliant on normal distribution-based or expanded statistics, often fail to identify such minor effects because of their…

July 1, 2025
Data-Driven Dynamic Factor Modeling via Manifold Learning

Data-Driven Dynamic Factor Modeling via Manifold Learning arXiv:2506.19945v1 Announce Type: new Abstract: We propose a data-driven dynamic factor framework where a response variable depends on a high-dimensional set of covariates, without imposing any parametric model on the joint dynamics. Leveraging Anisotropic Diffusion Maps, a nonlinear manifold learning technique introduced by Singer and Coifman, our framework…

June 26, 2025
Near-optimal estimates for the $\ell^p$-Lipschitz constants of deep random ReLU neural networks

Near-optimal estimates for the $\ell^p$-Lipschitz constants of deep random ReLU neural networks arXiv:2506.19695v1 Announce Type: new Abstract: This paper studies the $\ell^p$-Lipschitz constants of ReLU neural networks $\Phi: \mathbb{R}^d \to \mathbb{R}$ with random parameters for $p \in [1,\infty]$. The distribution of the weights follows a variant of the He initialization and the biases are drawn…

June 25, 2025
Gaussian Processes and Reproducing Kernels: Connections and Equivalences

Gaussian Processes and Reproducing Kernels: Connections and Equivalences arXiv:2506.17366v1 Announce Type: new Abstract: This monograph studies the relations between two approaches using positive definite kernels: probabilistic methods using Gaussian processes, and non-probabilistic methods using reproducing kernel Hilbert spaces (RKHS). They are widely studied and used in machine learning, statistics, and numerical analysis. Connections and equivalences…

June 24, 2025
Scalable Machine Learning Algorithms using Path Signatures

Scalable Machine Learning Algorithms using Path Signatures arXiv:2506.17634v1 Announce Type: new Abstract: The interface between stochastic analysis and machine learning is a rapidly evolving field, with path signatures – iterated integrals that provide faithful, hierarchical representations of paths – offering a principled and universal feature map for sequential and structured data. Rooted in rough path…

June 24, 2025
Sampling conditioned diffusions via Pathspace Projected Monte Carlo

Sampling conditioned diffusions via Pathspace Projected Monte Carlo arXiv:2506.15743v1 Announce Type: new Abstract: We present an algorithm to sample stochastic differential equations conditioned on rather general constraints, including integral constraints, endpoint constraints, and stochastic integral constraints. The algorithm is a pathspace Metropolis-adjusted manifold sampling scheme, which samples stochastic paths on the submanifold of realizations that…

June 23, 2025
Rademacher learning rates for iterated random functions

Rademacher learning rates for iterated random functions arXiv:2506.13946v1 Announce Type: new Abstract: Most existing literature on supervised machine learning assumes that the training dataset is drawn from an i.i.d. sample. However, many real-world problems exhibit temporal dependence and strong correlations between the marginal distributions of the data-generating process, suggesting that the i.i.d. assumption is often…

June 18, 2025
Enabling Probabilistic Learning on Manifolds through Double Diffusion Maps

Enabling Probabilistic Learning on Manifolds through Double Diffusion Maps arXiv:2506.02254v1 Announce Type: new Abstract: We present a generative learning framework for probabilistic sampling based on an extension of the Probabilistic Learning on Manifolds (PLoM) approach, which is designed to generate statistically consistent realizations of a random vector in a finite-dimensional Euclidean space, informed by a…

June 4, 2025
A General-Purpose Theorem for High-Probability Bounds of Stochastic Approximation with Polyak Averaging

A General-Purpose Theorem for High-Probability Bounds of Stochastic Approximation with Polyak Averaging arXiv:2505.21796v1 Announce Type: new Abstract: Polyak-Ruppert averaging is a widely used technique to achieve the optimal asymptotic variance of stochastic approximation (SA) algorithms, yet its high-probability performance guarantees remain underexplored in general settings. In this paper, we present a general framework for establishing…

May 29, 2025
Liouville PDE-based sliced-Wasserstein flow for fair regression

Liouville PDE-based sliced-Wasserstein flow for fair regression arXiv:2505.17204v1 Announce Type: new Abstract: The sliced Wasserstein flow (SWF), a nonparametric and implicit generative gradient flow, is applied to fair regression. We have improved the SWF in a few aspects. First, the stochastic diffusive term from the Fokker-Planck equation-based Monte Carlo is transformed to Liouville partial differential…

May 26, 2025
An Exponential Averaging Process with Strong Convergence Properties

An Exponential Averaging Process with Strong Convergence Properties arXiv:2505.10605v1 Announce Type: new Abstract: Averaging, or smoothing, is a fundamental approach to obtain stable, de-noised estimates from noisy observations. In certain scenarios, observations made along trajectories of random dynamical systems are of particular interest. One popular smoothing technique for such a scenario is exponential moving averaging…

May 19, 2025
Minimax learning rates for estimating binary classifiers under margin conditions

Minimax learning rates for estimating binary classifiers under margin conditions arXiv:2505.10628v1 Announce Type: new Abstract: We study classification problems using binary estimators where the decision boundary is described by horizon functions and where the data distribution satisfies a geometric margin condition. We establish upper and lower bounds for the minimax learning rate over broad function…

May 19, 2025
Optimal Transport-Based Domain Adaptation for Rotated Linear Regression

Optimal Transport-Based Domain Adaptation for Rotated Linear Regression arXiv:2505.09229v1 Announce Type: new Abstract: Optimal Transport (OT) has proven effective for domain adaptation (DA) by aligning distributions across domains with differing statistical properties. Building on the approach of Courty et al. (2016), who mapped source data to the target domain for improved model transfer, we focus…

May 15, 2025
Diffusion-based supervised learning of generative models for efficient sampling of multimodal distributions

Diffusion-based supervised learning of generative models for efficient sampling of multimodal distributions arXiv:2505.07825v1 Announce Type: new Abstract: We propose a hybrid generative model for efficient sampling of high-dimensional, multimodal probability distributions for Bayesian inference. Traditional Monte Carlo methods, such as the Metropolis-Hastings and Langevin Monte Carlo sampling methods, are effective for sampling from single-mode distributions…

May 14, 2025
Feature Representation Transferring to Lightweight Models via Perception Coherence

Feature Representation Transferring to Lightweight Models via Perception Coherence arXiv:2505.06595v1 Announce Type: new Abstract: In this paper, we propose a method for transferring feature representation to lightweight student models from larger teacher models. We mathematically define a new notion called \textit{perception coherence}. Based on this notion, we propose a loss function, which takes into account…

May 13, 2025
Physics-Informed Inference Time Scaling via Simulation-Calibrated Scientific Machine Learning

Physics-Informed Inference Time Scaling via Simulation-Calibrated Scientific Machine Learning arXiv:2504.16172v1 Announce Type: cross Abstract: High-dimensional partial differential equations (PDEs) pose significant computational challenges across fields ranging from quantum chemistry to economics and finance. Although scientific machine learning (SciML) techniques offer approximate solutions, they often suffer from bias and neglect crucial physical insights. Inspired by inference-time…

April 24, 2025
Throughput-Optimal Scheduling Algorithms for LLM Inference and AI Agents

Throughput-Optimal Scheduling Algorithms for LLM Inference and AI Agents arXiv:2504.07347v1 Announce Type: new Abstract: As demand for Large Language Models (LLMs) and AI agents rapidly grows, optimizing systems for efficient LLM inference becomes critical. While significant efforts have targeted system-level engineering, little is explored through a mathematical modeling and queuing perspective. In this paper, we…

April 11, 2025
Performance of Rank-One Tensor Approximation on Incomplete Data

Performance of Rank-One Tensor Approximation on Incomplete Data arXiv:2504.07818v1 Announce Type: new Abstract: We are interested in the estimation of a rank-one tensor signal when only a portion $\varepsilon$ of its noisy observation is available. We show that the study of this problem can be reduced to that of a random matrix model whose spectral…

April 11, 2025
Smoothed Distance Kernels for MMDs and Applications in Wasserstein Gradient Flows

Smoothed Distance Kernels for MMDs and Applications in Wasserstein Gradient Flows arXiv:2504.07820v1 Announce Type: new Abstract: Negative distance kernels $K(x,y) := -|x-y|$ were used in the definition of maximum mean discrepancies (MMDs) in statistics and lead to favorable numerical results in various applications. In particular, so-called slicing techniques for handling high-dimensional kernel summations profit…

April 11, 2025
High-dimensional ridge regression with random features for non-identically distributed data with a variance profile

High-dimensional ridge regression with random features for non-identically distributed data with a variance profile arXiv:2504.03035v1 Announce Type: new Abstract: The behavior of the random feature model in the high-dimensional regression framework has become a popular issue of interest in the machine learning literature. This model is generally considered for feature vectors $x_i = \Sigma^{1/2} x_i'$,…

April 7, 2025
A computational transition for detecting multivariate shuffled linear regression by low-degree polynomials

A computational transition for detecting multivariate shuffled linear regression by low-degree polynomials arXiv:2504.03097v1 Announce Type: new Abstract: In this paper, we study the problem of multivariate shuffled linear regression, where the correspondence between predictors and responses in a linear model is obfuscated by a latent permutation. Specifically, we investigate the model $Y=\tfrac{1}{\sqrt{1+\sigma^2}}(\Pi_* X Q_* +…

April 7, 2025
Denoising guarantees for optimized sampling schemes in compressed sensing

Denoising guarantees for optimized sampling schemes in compressed sensing arXiv:2504.01046v1 Announce Type: new Abstract: Compressed sensing with subsampled unitary matrices benefits from \emph{optimized} sampling schemes, which feature improved theoretical guarantees and empirical performance relative to uniform subsampling. We provide, in a first of its kind in compressed sensing, theoretical guarantees showing that the error caused…

April 3, 2025
Feature learning from non-Gaussian inputs: the case of Independent Component Analysis in high dimensions

Feature learning from non-Gaussian inputs: the case of Independent Component Analysis in high dimensions arXiv:2503.23896v1 Announce Type: new Abstract: Deep neural networks learn structured features from complex, non-Gaussian inputs, but the mechanisms behind this process remain poorly understood. Our work is motivated by the observation that the first-layer filters learnt by deep convolutional neural networks…

April 1, 2025
A stochastic gradient descent algorithm with random search directions

A stochastic gradient descent algorithm with random search directions arXiv:2503.19942v1 Announce Type: new Abstract: Stochastic coordinate descent algorithms are efficient methods in which each iterate is obtained by fixing most coordinates at their values from the current iteration, and approximately minimizing the objective with respect to the remaining coordinates. However, this approach is usually restricted…

March 27, 2025
Procrustes Wasserstein Metric: A Modified Benamou-Brenier Approach with Applications to Latent Gaussian Distributions

Procrustes Wasserstein Metric: A Modified Benamou-Brenier Approach with Applications to Latent Gaussian Distributions arXiv:2503.16580v1 Announce Type: new Abstract: We introduce a modified Benamou-Brenier type approach leading to a Wasserstein type distance that allows global invariance, specifically, isometries, and we show that the problem can be summarized to orthogonal transformations. This distance is defined by penalizing…

March 24, 2025
Optimal Nonlinear Online Learning under Sequential Price Competition via s-Concavity

Optimal Nonlinear Online Learning under Sequential Price Competition via s-Concavity arXiv:2503.16737v1 Announce Type: new Abstract: We consider price competition among multiple sellers over a selling horizon of $T$ periods. In each period, sellers simultaneously offer their prices and subsequently observe their respective demand that is unobservable to competitors. The demand function for each seller depends…

March 24, 2025
Nonlinear Bayesian Update via Ensemble Kernel Regression with Clustering and Subsampling

Nonlinear Bayesian Update via Ensemble Kernel Regression with Clustering and Subsampling arXiv:2503.15160v1 Announce Type: new Abstract: Nonlinear Bayesian update for a prior ensemble is proposed to extend traditional ensemble Kalman filtering to settings characterized by non-Gaussian priors and nonlinear measurement operators. In this framework, the observed component is first denoised via a standard Kalman update,…

March 20, 2025
On Statistical Estimation of Edge-Reinforced Random Walks

On Statistical Estimation of Edge-Reinforced Random Walks arXiv:2503.06115v1 Announce Type: new Abstract: Reinforced random walks (RRWs), including vertex-reinforced random walks (VRRWs) and edge-reinforced random walks (ERRWs), model random walks where the transition probabilities evolve based on prior visitation history~\cite{mgr, fmk, tarres, volkov}. These models have found applications in various areas, such as network representation learning~\cite{xzzs},…

March 11, 2025
A characterization of sample adaptivity in UCB data

A characterization of sample adaptivity in UCB data arXiv:2503.04855v1 Announce Type: new Abstract: We characterize a joint CLT of the number of pulls and the sample mean reward of the arms in a stochastic two-armed bandit environment under UCB algorithms. Several implications of this result are in place: (1) a nonstandard CLT of the number…

March 10, 2025
Applications of Entropy in Data Analysis and Machine Learning: A Review

Applications of Entropy in Data Analysis and Machine Learning: A Review arXiv:2503.02921v1 Announce Type: new Abstract: Since its origin in the thermodynamics of the 19th century, the concept of entropy has also permeated other fields of physics and mathematics, such as Classical and Quantum Statistical Mechanics, Information Theory, Probability Theory, Ergodic Theory and the Theory…

March 6, 2025
Efficient Risk-sensitive Planning via Entropic Risk Measures

Efficient Risk-sensitive Planning via Entropic Risk Measures arXiv:2502.20423v1 Announce Type: new Abstract: Risk-sensitive planning aims to identify policies maximizing some tail-focused metrics in Markov Decision Processes (MDPs). Such an optimization task can be very costly for the most widely used and interpretable metrics such as threshold probabilities or (Conditional) Values at Risk. Indeed, previous work…

March 3, 2025
Algorithmic contiguity from low-degree conjecture and applications in correlated random graphs

Algorithmic contiguity from low-degree conjecture and applications in correlated random graphs arXiv:2502.09832v1 Announce Type: new Abstract: In this paper, assuming a natural strengthening of the low-degree conjecture, we provide evidence of computational hardness for two problems: (1) the (partial) matching recovery problem in the sparse correlated Erdős-Rényi graphs $\mathcal G(n,q;\rho)$ when the edge-density $q=n^{-1+o(1)}$ and…

February 17, 2025
Non-asymptotic Analysis of Diffusion Annealed Langevin Monte Carlo for Generative Modelling

Non-asymptotic Analysis of Diffusion Annealed Langevin Monte Carlo for Generative Modelling arXiv:2502.09306v1 Announce Type: new Abstract: We investigate the theoretical properties of general diffusion (interpolation) paths and their Langevin Monte Carlo implementation, referred to as diffusion annealed Langevin Monte Carlo (DALMC), under weak conditions on the data distribution. Specifically, we analyse and provide non-asymptotic error…

February 14, 2025
Poisson Hierarchical Indian Buffet Processes for Within and Across Group Sharing of Latent Features-With Indications for Microbiome Species Sampling Models

Poisson Hierarchical Indian Buffet Processes for Within and Across Group Sharing of Latent Features-With Indications for Microbiome Species Sampling Models arXiv:2502.01919v1 Announce Type: new Abstract: In this work, we present a comprehensive Bayesian posterior analysis of what we term Poisson Hierarchical Indian Buffet Processes, designed for complex random sparse count species sampling models that allow…

February 5, 2025
Statistical Verification of Linear Classifiers

Statistical Verification of Linear Classifiers arXiv:2501.14430v1 Announce Type: new Abstract: We propose a homogeneity test closely related to the concept of linear separability between two samples. Using the test one can answer the question whether a linear classifier is merely “random” or effectively captures differences between two classes. We focus on establishing upper bounds for…

January 27, 2025
Simulation of Random LR Fuzzy Intervals

Simulation of Random LR Fuzzy Intervals arXiv:2501.10482v1 Announce Type: new Abstract: Random fuzzy variables join the modeling of the impreciseness (due to their “fuzzy part”) and randomness. Statistical samples of such objects are widely used, and their direct, numerically effective generation is therefore necessary. Usually, these samples consist of triangular or trapezoidal fuzzy numbers. In…

January 22, 2025
Generative Models with ELBOs Converging to Entropy Sums

Generative Models with ELBOs Converging to Entropy Sums arXiv:2501.09022v1 Announce Type: new Abstract: The evidence lower bound (ELBO) is one of the most central objectives for probabilistic unsupervised learning. For the ELBOs of several generative models and model classes, we here prove convergence to entropy sums. As one result, we provide a list of generative…

January 17, 2025
Avoiding subtraction and division of stochastic signals using normalizing flows: NFdeconvolve

Avoiding subtraction and division of stochastic signals using normalizing flows: NFdeconvolve arXiv:2501.08288v1 Announce Type: new Abstract: Across the scientific realm, we find ourselves subtracting or dividing stochastic signals. For instance, consider a stochastic realization, $x$, generated from the addition or multiplication of two stochastic signals $a$ and $b$, namely $x=a+b$ or $x = ab$. For…

January 15, 2025
Robust random graph matching in dense graphs via vector approximate message passing

Robust random graph matching in dense graphs via vector approximate message passing arXiv:2412.16457v1 Announce Type: new Abstract: In this paper, we focus on the matching recovery problem between a pair of correlated Gaussian Wigner matrices with a latent vertex correspondence. We are particularly interested in a robust version of this problem such that our observation…

December 24, 2024
Generative Modeling with Diffusion

Generative Modeling with Diffusion arXiv:2412.10948v1 Announce Type: new Abstract: We introduce the diffusion model as a method to generate new samples. Generative models have been recently adopted for tasks such as art generation (Stable Diffusion, Dall-E) and text generation (ChatGPT). Diffusion models in particular apply noise to sample data and then “reverse” this noising process…

December 17, 2024
Nonparametric Filtering, Estimation and Classification using Neural Jump ODEs

Nonparametric Filtering, Estimation and Classification using Neural Jump ODEs arXiv:2412.03271v1 Announce Type: new Abstract: Neural Jump ODEs model the conditional expectation between observations by neural ODEs and jump at arrival of new observations. They have demonstrated effectiveness for fully data-driven online forecasting in settings with irregular and partial observations, operating under weak regularity assumptions. This…

December 5, 2024
Selective Reviews of Bandit Problems in AI via a Statistical View

Selective Reviews of Bandit Problems in AI via a Statistical View arXiv:2412.02251v1 Announce Type: new Abstract: Reinforcement Learning (RL) is a widely researched area in artificial intelligence that focuses on teaching agents decision-making through interactions with their environment. A key subset includes stochastic multi-armed bandit (MAB) and continuum-armed bandit (SCAB) problems, which model sequential decision-making…

December 4, 2024