Category: stat.ML
-
Friction on Demand: A Generative Framework for the Inverse Design of Metainterfaces
Friction on Demand: A Generative Framework for the Inverse Design of Metainterfaces arXiv:2511.03735v1 Announce Type: new Abstract: Designing frictional interfaces to exhibit prescribed macroscopic behavior is a challenging inverse problem, made difficult by the non-uniqueness of solutions and the computational cost of contact simulations. Traditional approaches rely on heuristic search over low-dimensional parameterizations, which limits…
-
Bifidelity Karhunen-Loève Expansion Surrogate with Active Learning for Random Fields
Bifidelity Karhunen-Loève Expansion Surrogate with Active Learning for Random Fields arXiv:2511.03756v1 Announce Type: new Abstract: We present a bifidelity Karhunen-Loève expansion (KLE) surrogate model for field-valued quantities of interest (QoIs) under uncertain inputs. The approach combines the spectral efficiency of the KLE with polynomial chaos expansions (PCEs) to preserve an explicit mapping between input uncertainties…
-
Learning Paths for Dynamic Measure Transport: A Control Perspective
Learning Paths for Dynamic Measure Transport: A Control Perspective arXiv:2511.03797v1 Announce Type: new Abstract: We bring a control perspective to the problem of identifying paths of measures for sampling via dynamic measure transport (DMT). We highlight the fact that commonly used paths may be poor choices for DMT and connect existing methods for learning alternate…
-
A general technique for approximating high-dimensional empirical kernel matrices
A general technique for approximating high-dimensional empirical kernel matrices arXiv:2511.03892v1 Announce Type: new Abstract: We present simple, user-friendly bounds for the expected operator norm of a random kernel matrix under general conditions on the kernel function $k(\cdot,\cdot)$. Our approach uses decoupling results for U-statistics and the non-commutative Khintchine inequality to obtain upper and lower bounds…
-
High-dimensional limit theorems for SGD: Momentum and Adaptive Step-sizes
High-dimensional limit theorems for SGD: Momentum and Adaptive Step-sizes arXiv:2511.03952v1 Announce Type: new Abstract: We develop a high-dimensional scaling limit for Stochastic Gradient Descent with Polyak Momentum (SGD-M) and adaptive step-sizes. This provides a framework to rigorously compare online SGD with some of its popular variants. We show that the scaling limits of SGD-M coincide…
-
Scalable Single-Cell Gene Expression Generation with Latent Diffusion Models
Scalable Single-Cell Gene Expression Generation with Latent Diffusion Models arXiv:2511.02986v1 Announce Type: new Abstract: Computational modeling of single-cell gene expression is crucial for understanding cellular processes, but generating realistic expression profiles remains a major challenge. This difficulty arises from the count nature of gene expression data and complex latent dependencies among genes. Existing generative models…
-
Unifying Information-Theoretic and Pair-Counting Clustering Similarity
Unifying Information-Theoretic and Pair-Counting Clustering Similarity arXiv:2511.03000v1 Announce Type: new Abstract: Comparing clusterings is central to evaluating unsupervised models, yet the many existing similarity measures can produce widely divergent, sometimes contradictory, evaluations. Clustering similarity measures are typically organized into two principal families, pair-counting and information-theoretic, reflecting whether they quantify agreement through element pairs or aggregate…
-
Precise asymptotic analysis of Sobolev training for random feature models
Precise asymptotic analysis of Sobolev training for random feature models arXiv:2511.03050v1 Announce Type: new Abstract: Gradient information is widely useful and available in applications, and is therefore natural to include in the training of neural networks. Yet little is known theoretically about the impact of Sobolev training — regression with both function and gradient data…
-
Provable Separations between Memorization and Generalization in Diffusion Models
Provable Separations between Memorization and Generalization in Diffusion Models arXiv:2511.03202v1 Announce Type: new Abstract: Diffusion models have achieved remarkable success across diverse domains, but they remain vulnerable to memorization — reproducing training data rather than generating novel outputs. This not only limits their creative potential but also raises concerns about privacy and safety. While empirical…
-
Provable Accelerated Bayesian Optimization with Knowledge Transfer
Provable Accelerated Bayesian Optimization with Knowledge Transfer arXiv:2511.03125v1 Announce Type: new Abstract: We study how Bayesian optimization (BO) can be accelerated on a target task with historical knowledge transferred from related source tasks. Existing works on BO with knowledge transfer either do not have theoretical guarantees or achieve the same regret as BO in the…
-
Data-driven Learning of Interaction Laws in Multispecies Particle Systems with Gaussian Processes: Convergence Theory and Applications
Data-driven Learning of Interaction Laws in Multispecies Particle Systems with Gaussian Processes: Convergence Theory and Applications arXiv:2511.02053v1 Announce Type: new Abstract: We develop a Gaussian process framework for learning interaction kernels in multi-species interacting particle systems from trajectory data. Such systems provide a canonical setting for multiscale modeling, where simple microscopic interaction rules generate complex…
-
DoFlow: Causal Generative Flows for Interventional and Counterfactual Time-Series Prediction
DoFlow: Causal Generative Flows for Interventional and Counterfactual Time-Series Prediction arXiv:2511.02137v1 Announce Type: new Abstract: Time-series forecasting increasingly demands not only accurate observational predictions but also causal forecasting under interventional and counterfactual queries in multivariate systems. We present DoFlow, a flow-based generative model defined over a causal DAG that delivers coherent observational and interventional…
-
Limit Theorems for Stochastic Gradient Descent in High-Dimensional Single-Layer Networks
Limit Theorems for Stochastic Gradient Descent in High-Dimensional Single-Layer Networks arXiv:2511.02258v1 Announce Type: new Abstract: This paper studies the high-dimensional scaling limits of online stochastic gradient descent (SGD) for single-layer networks. Building on the seminal work of Saad and Solla, which analyzed the deterministic (ballistic) scaling limits of SGD corresponding to the gradient flow of…
-
An Adaptive Sampling Framework for Detecting Localized Concept Drift under Label Scarcity
An Adaptive Sampling Framework for Detecting Localized Concept Drift under Label Scarcity arXiv:2511.02452v1 Announce Type: new Abstract: Concept drift and label scarcity are two critical challenges limiting the robustness of predictive models in dynamic industrial environments. Existing drift detection methods often assume global shifts and rely on dense supervision, making them ill-suited for regression tasks…
-
A new class of Markov random fields enabling lightweight sampling
A new class of Markov random fields enabling lightweight sampling arXiv:2511.02373v1 Announce Type: new Abstract: This work addresses the problem of efficient sampling of Markov random fields (MRF). The sampling of Potts or Ising MRF is most often based on Gibbs sampling, and is thus computationally expensive. We consider in this work how to circumvent…
-
Gradient Boosted Mixed Models: Flexible Joint Estimation of Mean and Variance Components for Clustered Data
Gradient Boosted Mixed Models: Flexible Joint Estimation of Mean and Variance Components for Clustered Data arXiv:2511.00217v1 Announce Type: new Abstract: Linear mixed models are widely used for clustered data, but their reliance on parametric forms limits flexibility in complex and high-dimensional settings. In contrast, gradient boosting methods achieve high predictive accuracy through nonparametric estimation, but…
-
A Streaming Sparse Cholesky Method for Derivative-Informed Gaussian Process Surrogates Within Digital Twin Applications
A Streaming Sparse Cholesky Method for Derivative-Informed Gaussian Process Surrogates Within Digital Twin Applications arXiv:2511.00366v1 Announce Type: new Abstract: Digital twins are developed to model the behavior of a specific physical asset (or twin), and they can consist of high-fidelity physics-based models or surrogates. A highly accurate surrogate is often preferred over multi-physics models as…
-
Accuracy estimation of neural networks by extreme value theory
Accuracy estimation of neural networks by extreme value theory arXiv:2511.00490v1 Announce Type: new Abstract: Neural networks are able to approximate any continuous function on a compact set. However, it is not obvious how to quantify the error of the neural network, i.e., the remaining bias between the function and the neural network. Here, we propose…
-
Perturbations in the Orthogonal Complement Subspace for Efficient Out-of-Distribution Detection
Perturbations in the Orthogonal Complement Subspace for Efficient Out-of-Distribution Detection arXiv:2511.00849v1 Announce Type: new Abstract: Out-of-distribution (OOD) detection is essential for deploying deep learning models in open-world environments. Existing approaches, such as energy-based scoring and gradient-projection methods, typically rely on high-dimensional representations to separate in-distribution (ID) and OOD samples. We introduce P-OCS (Perturbations in the…
-
SOCRATES: Simulation Optimization with Correlated Replicas and Adaptive Trajectory Evaluations
SOCRATES: Simulation Optimization with Correlated Replicas and Adaptive Trajectory Evaluations arXiv:2511.00685v1 Announce Type: new Abstract: The field of simulation optimization (SO) encompasses various methods developed to optimize complex, expensive-to-sample stochastic systems. Established methods include, but are not limited to, ranking-and-selection for finite alternatives and surrogate-based methods for continuous domains, with broad applications in engineering and…
-
Overspecified Mixture Discriminant Analysis: Exponential Convergence, Statistical Guarantees, and Remote Sensing Applications
Overspecified Mixture Discriminant Analysis: Exponential Convergence, Statistical Guarantees, and Remote Sensing Applications arXiv:2510.27056v1 Announce Type: new Abstract: This study explores the classification error of Mixture Discriminant Analysis (MDA) in scenarios where the number of mixture components exceeds those present in the actual data distribution, a condition known as overspecification. We use a two-component Gaussian mixture…
-
Decreasing Entropic Regularization Averaged Gradient for Semi-Discrete Optimal Transport
Decreasing Entropic Regularization Averaged Gradient for Semi-Discrete Optimal Transport arXiv:2510.27340v1 Announce Type: new Abstract: Adding entropic regularization to Optimal Transport (OT) problems has become a standard approach for designing efficient and scalable solvers. However, regularization introduces a bias from the true solution. To mitigate this bias while still benefiting from the acceleration provided by regularization,…
-
On the Equivalence of Optimal Transport Problem and Action Matching with Optimal Vector Fields
On the Equivalence of Optimal Transport Problem and Action Matching with Optimal Vector Fields arXiv:2510.27385v1 Announce Type: new Abstract: Flow Matching (FM) method in generative modeling maps arbitrary probability distributions by constructing an interpolation between them and then learning the vector field that defines ODE for this interpolation. Recently, it was shown that FM can…
-
Minimax-Optimal Two-Sample Test with Sliced Wasserstein
Minimax-Optimal Two-Sample Test with Sliced Wasserstein arXiv:2510.27498v1 Announce Type: new Abstract: We study the problem of nonparametric two-sample testing using the sliced Wasserstein (SW) distance. While prior theoretical and empirical work indicates that the SW distance offers a promising balance between strong statistical guarantees and computational efficiency, its theoretical foundations for hypothesis testing remain limited.…
-
Interpretable Model-Aware Counterfactual Explanations for Random Forest
Interpretable Model-Aware Counterfactual Explanations for Random Forest arXiv:2510.27397v1 Announce Type: new Abstract: Despite their enormous predictive power, machine learning models are often unsuitable for applications in regulated industries such as finance, due to their limited capacity to provide explanations. While model-agnostic frameworks such as Shapley values have proved to be convenient and popular, they rarely…
-
Multimodal Bandits: Regret Lower Bounds and Optimal Algorithms
Multimodal Bandits: Regret Lower Bounds and Optimal Algorithms arXiv:2510.25811v1 Announce Type: new Abstract: We consider a stochastic multi-armed bandit problem with i.i.d. rewards where the expected reward function is multimodal with at most m modes. We propose the first known computationally tractable algorithm for computing the solution to the Graves-Lai optimization problem, which in turn…
-
$L_1$-norm Regularized Indefinite Kernel Logistic Regression
$L_1$-norm Regularized Indefinite Kernel Logistic Regression arXiv:2510.26043v1 Announce Type: new Abstract: Kernel logistic regression (KLR) is a powerful classification method widely applied across diverse domains. In many real-world scenarios, indefinite kernels capture more domain-specific structural information than positive definite kernels. This paper proposes a novel $L_1$-norm regularized indefinite kernel logistic regression (RIKLR) model, which extends…
-
Conformal Prediction Beyond the Horizon: Distribution-Free Inference for Policy Evaluation
Conformal Prediction Beyond the Horizon: Distribution-Free Inference for Policy Evaluation arXiv:2510.26026v1 Announce Type: new Abstract: Reliable uncertainty quantification is crucial for reinforcement learning (RL) in high-stakes settings. We propose a unified conformal prediction framework for infinite-horizon policy evaluation that constructs distribution-free prediction intervals for returns in both on-policy and off-policy settings. Our method integrates distributional…
-
Bias-Corrected Data Synthesis for Imbalanced Learning
Bias-Corrected Data Synthesis for Imbalanced Learning arXiv:2510.26046v1 Announce Type: new Abstract: Imbalanced data, where the positive samples represent only a small proportion compared to the negative samples, makes it challenging for classification problems to balance the false positive and false negative rates. A common approach to addressing the challenge involves generating synthetic data for the…
-
Data-driven Projection Generation for Efficiently Solving Heterogeneous Quadratic Programming Problems
Data-driven Projection Generation for Efficiently Solving Heterogeneous Quadratic Programming Problems arXiv:2510.26061v1 Announce Type: new Abstract: We propose a data-driven framework for efficiently solving quadratic programming (QP) problems by reducing the number of variables in high-dimensional QPs using instance-specific projection. A graph neural network-based model is designed to generate projections tailored to each QP instance, enabling…
-
Certainty in Uncertainty: Reasoning over Uncertain Knowledge Graphs with Statistical Guarantees
Certainty in Uncertainty: Reasoning over Uncertain Knowledge Graphs with Statistical Guarantees arXiv:2510.24754v1 Announce Type: new Abstract: Uncertain knowledge graph embedding (UnKGE) methods learn vector representations that capture both structural and uncertainty information to predict scores of unseen triples. However, existing methods produce only point estimates, without quantifying predictive uncertainty, limiting their reliability in high-stakes applications where…
-
Tree Ensemble Explainability through the Hoeffding Functional Decomposition and TreeHFD Algorithm
Tree Ensemble Explainability through the Hoeffding Functional Decomposition and TreeHFD Algorithm arXiv:2510.24815v1 Announce Type: new Abstract: Tree ensembles have demonstrated state-of-the-art predictive performance across a wide range of problems involving tabular data. Nevertheless, the black-box nature of tree ensembles is a strong limitation, especially for applications with critical decisions at stake. The Hoeffding or ANOVA…
-
Generative Bayesian Optimization: Generative Models as Acquisition Functions
Generative Bayesian Optimization: Generative Models as Acquisition Functions arXiv:2510.25240v1 Announce Type: new Abstract: We present a general strategy for turning generative models into candidate solution samplers for batch Bayesian optimization (BO). The use of generative models for BO enables large batch scaling as generative sampling, optimization of non-continuous design spaces, and high-dimensional and combinatorial design.…
-
Convergence of off-policy TD(0) with linear function approximation for reversible Markov chains
Convergence of off-policy TD(0) with linear function approximation for reversible Markov chains arXiv:2510.25514v1 Announce Type: new Abstract: We study the convergence of off-policy TD(0) with linear function approximation when used to approximate the expected discounted reward in a Markov chain. It is well known that the combination of off-policy learning and function approximation can lead…
-
Using latent representations to link disjoint longitudinal data for mixed-effects regression
Using latent representations to link disjoint longitudinal data for mixed-effects regression arXiv:2510.25531v1 Announce Type: new Abstract: Many rare diseases offer limited established treatment options, leading patients to switch therapies when new medications emerge. To analyze the impact of such treatment switches within the low sample size limitations of rare disease trials, it is important to…
-
Beyond Normality: Reliable A/B Testing with Non-Gaussian Data
Beyond Normality: Reliable A/B Testing with Non-Gaussian Data arXiv:2510.23666v1 Announce Type: new Abstract: A/B testing has become the cornerstone of decision-making in online markets, guiding how platforms launch new features, optimize pricing strategies, and improve user experience. In practice, we typically employ the pairwise $t$-test to compare outcomes between the treatment and control groups, thereby…
-
VIKING: Deep variational inference with stochastic projections
VIKING: Deep variational inference with stochastic projections arXiv:2510.23684v1 Announce Type: new Abstract: Variational mean field approximations tend to struggle with contemporary overparametrized deep neural networks. Where a Bayesian treatment is usually associated with high-quality predictions and uncertainties, the practical reality has been the opposite, with unstable training, poor predictive power, and subpar calibration. Building upon…
-
Understanding Fairness and Prediction Error through Subspace Decomposition and Influence Analysis
Understanding Fairness and Prediction Error through Subspace Decomposition and Influence Analysis arXiv:2510.23935v1 Announce Type: new Abstract: Machine learning models have achieved widespread success but often inherit and amplify historical biases, resulting in unfair outcomes. Traditional fairness methods typically impose constraints at the prediction level, without addressing underlying biases in data representations. In this work, we…
-
Bayesian neural networks with interpretable priors from Mercer kernels
Bayesian neural networks with interpretable priors from Mercer kernels arXiv:2510.23745v1 Announce Type: new Abstract: Quantifying the uncertainty in the output of a neural network is essential for deployment in scientific or engineering applications where decisions must be made under limited or noisy data. Bayesian neural networks (BNNs) provide a framework for this purpose by constructing…
-
Score-based constrained generative modeling via Langevin diffusions with boundary conditions
Score-based constrained generative modeling via Langevin diffusions with boundary conditions arXiv:2510.23985v1 Announce Type: new Abstract: Score-based generative models based on stochastic differential equations (SDEs) achieve impressive performance in sampling from unknown distributions, but often fail to satisfy underlying constraints. We propose a constrained generative model using kinetic (underdamped) Langevin dynamics with specular reflection of velocity…
-
Input Adaptive Bayesian Model Averaging
Input Adaptive Bayesian Model Averaging arXiv:2510.22054v1 Announce Type: new Abstract: This paper studies prediction with multiple candidate models, where the goal is to combine their outputs. This task is especially challenging in heterogeneous settings, where different models may be better suited to different inputs. We propose input adaptive Bayesian Model Averaging (IA-BMA), a Bayesian method…
-
Bridging Prediction and Attribution: Identifying Forward and Backward Causal Influence Ranges Using Assimilative Causal Inference
Bridging Prediction and Attribution: Identifying Forward and Backward Causal Influence Ranges Using Assimilative Causal Inference arXiv:2510.21889v1 Announce Type: new Abstract: Causal inference identifies cause-and-effect relationships between variables. While traditional approaches rely on data to reveal causal links, a recently developed method, assimilative causal inference (ACI), integrates observations with dynamical models. It utilizes Bayesian data assimilation…
-
Differentially Private High-dimensional Variable Selection via Integer Programming
Differentially Private High-dimensional Variable Selection via Integer Programming arXiv:2510.22062v1 Announce Type: new Abstract: Sparse variable selection improves interpretability and generalization in high-dimensional learning by selecting a small subset of informative features. Recent advances in Mixed Integer Programming (MIP) have enabled solving large-scale non-private sparse regression – known as Best Subset Selection (BSS) – with millions…
-
Frequentist Validity of Epistemic Uncertainty Estimators
Frequentist Validity of Epistemic Uncertainty Estimators arXiv:2510.22063v1 Announce Type: new Abstract: Decomposing prediction uncertainty into its aleatoric (irreducible) and epistemic (reducible) components is critical for the development and deployment of machine learning systems. A popular, principled measure for epistemic uncertainty is the mutual information between the response variable and model parameters. However, evaluating this measure…
-
Exponential Convergence Guarantees for Iterative Markovian Fitting
Exponential Convergence Guarantees for Iterative Markovian Fitting arXiv:2510.20871v1 Announce Type: new Abstract: The Schrödinger Bridge (SB) problem has become a fundamental tool in computational optimal transport and generative modeling. To address this problem, ideal methods such as Iterative Proportional Fitting and Iterative Markovian Fitting (IMF) have been proposed, alongside practical approximations like Diffusion Schrödinger Bridge and…
-
Kernel Learning with Adversarial Features: Numerical Efficiency and Adaptive Regularization
Kernel Learning with Adversarial Features: Numerical Efficiency and Adaptive Regularization arXiv:2510.20883v1 Announce Type: new Abstract: Adversarial training has emerged as a key technique to enhance model robustness against adversarial input perturbations. Many of the existing methods rely on computationally expensive min-max problems that limit their application in practice. We propose a novel formulation of adversarial…
-
A Short Note on Upper Bounds for Graph Neural Operator Convergence Rate
A Short Note on Upper Bounds for Graph Neural Operator Convergence Rate arXiv:2510.20954v1 Announce Type: new Abstract: Graphons, as limits of graph sequences, provide a framework for analyzing the asymptotic behavior of graph neural operators. Spectral convergence of sampled graphs to graphons yields operator-level convergence rates, enabling transferability analyses of GNNs. This note summarizes known…
-
Enforcing Calibration in Multi-Output Probabilistic Regression with Pre-rank Regularization
Enforcing Calibration in Multi-Output Probabilistic Regression with Pre-rank Regularization arXiv:2510.21273v1 Announce Type: new Abstract: Probabilistic models must be well calibrated to support reliable decision-making. While calibration in single-output regression is well studied, defining and achieving multivariate calibration in multi-output regression remains considerably more challenging. The existing literature on multivariate calibration primarily focuses on diagnostic tools…
-
Doubly-Regressing Approach for Subgroup Fairness
Doubly-Regressing Approach for Subgroup Fairness arXiv:2510.21091v1 Announce Type: new Abstract: Algorithmic fairness is a socially crucial topic in real-world applications of AI. Among many notions of fairness, subgroup fairness is widely studied when multiple sensitive attributes (e.g., gender, race, age) are present. However, as the number of sensitive attributes grows, the number of subgroups increases…
-
Compositional Generation for Long-Horizon Coupled PDEs
Compositional Generation for Long-Horizon Coupled PDEs arXiv:2510.20141v1 Announce Type: new Abstract: Simulating coupled PDE systems is computationally intensive, and prior efforts have largely focused on training surrogates on the joint (coupled) data, which requires a large amount of data. In the paper, we study compositional diffusion approaches where diffusion models are only trained on the…
-
Enhanced Cyclic Coordinate Descent Methods for Elastic Net Penalized Linear Models
Enhanced Cyclic Coordinate Descent Methods for Elastic Net Penalized Linear Models arXiv:2510.19999v1 Announce Type: new Abstract: We present a novel enhanced cyclic coordinate descent (ECCD) framework for solving generalized linear models with elastic net constraints that reduces training time in comparison to existing state-of-the-art methods. We redesign the CD method by performing a Taylor expansion…
-
Neural Networks for Censored Expectile Regression Based on Data Augmentation
Neural Networks for Censored Expectile Regression Based on Data Augmentation arXiv:2510.20344v1 Announce Type: new Abstract: Expectile regression neural networks (ERNNs) are powerful tools for capturing heterogeneity and complex nonlinear structures in data. However, most existing research has primarily focused on fully observed data, with limited attention paid to scenarios involving censored observations. In this paper,…
-
Testing Most Influential Sets
Testing Most Influential Sets arXiv:2510.20372v1 Announce Type: new Abstract: Small subsets of data with disproportionate influence on model outcomes can have dramatic impacts on conclusions, with a few data points sometimes overturning key findings. While recent work has developed methods to identify these most influential sets, no formal theory exists to determine when their influence…
-
Learning Decentralized Routing Policies via Graph Attention-based Multi-Agent Reinforcement Learning in Lunar Delay-Tolerant Networks
Learning Decentralized Routing Policies via Graph Attention-based Multi-Agent Reinforcement Learning in Lunar Delay-Tolerant Networks arXiv:2510.20436v1 Announce Type: new Abstract: We present a fully decentralized routing framework for multi-robot exploration missions operating under the constraints of a Lunar Delay-Tolerant Network (LDTN). In this setting, autonomous rovers must relay collected data to a lander under intermittent connectivity…
-
Signature Kernel Scoring Rule as Spatio-Temporal Diagnostic for Probabilistic Forecasting
Signature Kernel Scoring Rule as Spatio-Temporal Diagnostic for Probabilistic Forecasting arXiv:2510.19110v1 Announce Type: new Abstract: Modern weather forecasting has increasingly transitioned from numerical weather prediction (NWP) to data-driven machine learning forecasting techniques. While these new models produce probabilistic forecasts to quantify uncertainty, their training and evaluation may remain hindered by conventional scoring rules, primarily MSE,…
-
Calibrated Principal Component Regression
Calibrated Principal Component Regression arXiv:2510.19020v1 Announce Type: new Abstract: We propose a new method for statistical inference in generalized linear models. In the overparameterized regime, Principal Component Regression (PCR) reduces variance by projecting high-dimensional data to a low-dimensional principal subspace before fitting. However, PCR incurs truncation bias whenever the true regression vector has mass outside…
-
Extreme Event Aware ($\eta$-) Learning
Extreme Event Aware ($\eta$-) Learning arXiv:2510.19161v1 Announce Type: new Abstract: Quantifying and predicting rare and extreme events persists as a crucial yet challenging task in understanding complex dynamical systems. Many practical challenges arise from the infrequency and severity of these events, including the considerable variance of simple sampling methods and the substantial computational cost of…
-
Topology of Currencies: Persistent Homology for FX Co-movements: A Comparative Clustering Study
Topology of Currencies: Persistent Homology for FX Co-movements: A Comparative Clustering Study arXiv:2510.19306v1 Announce Type: new Abstract: This study investigates whether Topological Data Analysis (TDA) can provide additional insights beyond traditional statistical methods in clustering currency behaviours. We focus on the foreign exchange (FX) market, which is a complex system often exhibiting non-linear and high-dimensional…
-
Graphical model for tensor factorization by sparse sampling
Graphical model for tensor factorization by sparse sampling arXiv:2510.17886v1 Announce Type: new Abstract: We consider tensor factorizations based on sparse measurements of the tensor components. The measurements are designed in a way that the underlying graph of interactions is a random graph. The setup will be useful in cases where a substantial amount of data…
-
Learning Time-Varying Graphs from Incomplete Graph Signals
Learning Time-Varying Graphs from Incomplete Graph Signals arXiv:2510.17903v1 Announce Type: new Abstract: This paper tackles the challenging problem of jointly inferring time-varying network topologies and imputing missing data from partially observed graph signals. We propose a unified non-convex optimization framework to simultaneously recover a sequence of graph Laplacian matrices while reconstructing the unobserved signal entries.…
-
Generalization Below the Edge of Stability: The Role of Data Geometry
Generalization Below the Edge of Stability: The Role of Data Geometry arXiv:2510.18120v1 Announce Type: new Abstract: Understanding generalization in overparameterized neural networks hinges on the interplay between the data geometry, neural architecture, and training dynamics. In this paper, we theoretically explore how data geometry controls this implicit bias. This paper presents theoretical results for overparameterized…
-
Arbitrated Indirect Treatment Comparisons
Arbitrated Indirect Treatment Comparisons arXiv:2510.18071v1 Announce Type: new Abstract: Matching-adjusted indirect comparison (MAIC) has been increasingly employed in health technology assessments (HTA). By reweighting subjects from a trial with individual participant data (IPD) to match the covariate summary statistics of another trial with only aggregate data (AgD), MAIC facilitates the estimation of a treatment effect…
-
Beating the Winner’s Curse via Inference-Aware Policy Optimization
Beating the Winner’s Curse via Inference-Aware Policy Optimization arXiv:2510.18161v1 Announce Type: new Abstract: There has been a surge of recent interest in automatically learning policies to target treatment decisions based on rich individual covariates. A common approach is to train a machine learning model to predict counterfactual outcomes, and then select the policy that optimizes…
-
Learning density ratios in causal inference using Bregman-Riesz regression
Learning density ratios in causal inference using Bregman-Riesz regression arXiv:2510.16127v1 Announce Type: new Abstract: The ratio of two probability density functions is a fundamental quantity that appears in many areas of statistics and machine learning, including causal inference, reinforcement learning, covariate shift, outlier detection, independence testing, importance sampling, and diffusion modeling. Naively estimating the numerator…
-
A Relative Error-Based Evaluation Framework of Heterogeneous Treatment Effect Estimators
A Relative Error-Based Evaluation Framework of Heterogeneous Treatment Effect Estimators arXiv:2510.16419v1 Announce Type: new Abstract: While significant progress has been made in heterogeneous treatment effect (HTE) estimation, the evaluation of HTE estimators remains underdeveloped. In this article, we propose a robust evaluation framework based on relative error, which quantifies performance differences between two HTE estimators.…
-
Personalized Collaborative Learning with Affinity-Based Variance Reduction
Personalized Collaborative Learning with Affinity-Based Variance Reduction arXiv:2510.16232v1 Announce Type: new Abstract: Multi-agent learning faces a fundamental tension: leveraging distributed collaboration without sacrificing the personalization needed for diverse agents. This tension intensifies when aiming for full personalization while adapting to unknown heterogeneity levels — gaining collaborative speedup when agents are similar, without performance degradation when…
-
A Bayesian Framework for Symmetry Inference in Chaotic Attractors
A Bayesian Framework for Symmetry Inference in Chaotic Attractors arXiv:2510.16509v1 Announce Type: new Abstract: Detecting symmetry from data is a fundamental problem in signal analysis, providing insight into underlying structure and constraints. When data emerge as trajectories of dynamical systems, symmetries encode structural properties of the dynamics that enable model reduction, principled comparison across conditions,…
-
From Reviews to Actionable Insights: An LLM-Based Approach for Attribute and Feature Extraction
From Reviews to Actionable Insights: An LLM-Based Approach for Attribute and Feature Extraction arXiv:2510.16551v1 Announce Type: new Abstract: This research proposes a systematic, large language model (LLM) approach for extracting product and service attributes, features, and associated sentiments from customer reviews. Grounded in marketing theory, the framework distinguishes perceptual attributes from actionable features, producing interpretable…
-
From Universal Approximation Theorem to Tropical Geometry of Multi-Layer Perceptrons
From Universal Approximation Theorem to Tropical Geometry of Multi-Layer Perceptrons arXiv:2510.15012v1 Announce Type: new Abstract: We revisit the Universal Approximation Theorem (UAT) through the lens of the tropical geometry of neural networks and introduce a constructive, geometry-aware initialization for sigmoidal multi-layer perceptrons (MLPs). Tropical geometry shows that Rectified Linear Unit (ReLU) networks admit decision functions with…
-
Reliable data clustering with Bayesian community detection
Reliable data clustering with Bayesian community detection arXiv:2510.15013v1 Announce Type: new Abstract: From neuroscience and genomics to systems biology and ecology, researchers rely on clustering similarity data to uncover modular structure. Yet widely used clustering methods, such as hierarchical clustering, k-means, and WGCNA, lack principled model selection, leaving them susceptible to noise. A common workaround…
-
The Coverage Principle: How Pre-training Enables Post-Training
The Coverage Principle: How Pre-training Enables Post-Training arXiv:2510.15020v1 Announce Type: new Abstract: Language models demonstrate remarkable abilities when pre-trained on large text corpora and fine-tuned for specific tasks, but how and why pre-training shapes the success of the final model remains poorly understood. Notably, although pre-training success is often quantified by cross-entropy loss, cross-entropy…
-
The Tree-SNE Tree Exists
The Tree-SNE Tree Exists arXiv:2510.15014v1 Announce Type: new Abstract: The clustering and visualisation of high-dimensional data is a ubiquitous task in modern data science. Popular techniques include nonlinear dimensionality reduction methods like t-SNE or UMAP. These methods face the ‘scale-problem’ of clustering: when dealing with the MNIST dataset, do we want to distinguish different digits…
-
The Minimax Lower Bound of Kernel Stein Discrepancy Estimation
The Minimax Lower Bound of Kernel Stein Discrepancy Estimation arXiv:2510.15058v1 Announce Type: new Abstract: Kernel Stein discrepancies (KSDs) have emerged as a powerful tool for quantifying goodness-of-fit over the last decade, featuring numerous successful applications. To the best of our knowledge, all existing KSD estimators with known rate achieve $\sqrt{n}$-convergence. In this work, we…
-
Exact Dynamics of Multi-class Stochastic Gradient Descent
Exact Dynamics of Multi-class Stochastic Gradient Descent arXiv:2510.14074v1 Announce Type: new Abstract: We develop a framework for analyzing the training and learning rate dynamics on a variety of high-dimensional optimization problems trained using one-pass stochastic gradient descent (SGD) with data generated from multiple anisotropic classes. We give exact expressions for a large class of…
-
deFOREST: Fusing Optical and Radar satellite data for Enhanced Sensing of Tree-loss
deFOREST: Fusing Optical and Radar satellite data for Enhanced Sensing of Tree-loss arXiv:2510.14092v1 Announce Type: new Abstract: In this paper we develop a deforestation detection pipeline that incorporates optical and Synthetic Aperture Radar (SAR) data. A crucial component of the pipeline is the construction of anomaly maps of the optical data, which is done using…
-
High-Dimensional BWDM: A Robust Nonparametric Clustering Validation Index for Large-Scale Data
High-Dimensional BWDM: A Robust Nonparametric Clustering Validation Index for Large-Scale Data arXiv:2510.14145v1 Announce Type: new Abstract: Determining the appropriate number of clusters in unsupervised learning is a central problem in statistics and data science. Traditional validity indices such as Calinski-Harabasz, Silhouette, and Davies-Bouldin depend on centroid-based distances and therefore degrade in high-dimensional or contaminated data. This…
-
Personalized federated learning, Row-wise fusion regularization, Multivariate modeling, Sparse estimation
Personalized federated learning, Row-wise fusion regularization, Multivariate modeling, Sparse estimation arXiv:2510.14413v1 Announce Type: new Abstract: We study personalized federated learning for multivariate responses where client models are heterogeneous yet share variable-level structure. Existing entry-wise penalties ignore cross-response dependence, while matrix-wise fusion over-couples clients. We propose a Sparse Row-wise Fusion (SROF) regularizer that clusters row vectors…
-
A novel Information-Driven Strategy for Optimal Regression Assessment
A novel Information-Driven Strategy for Optimal Regression Assessment arXiv:2510.14222v1 Announce Type: new Abstract: In Machine Learning (ML), a regression algorithm aims to minimize a loss function based on data. An assessment method in this context seeks to quantify the discrepancy between the optimal response for an input-output system and the estimate produced by a learned…
-
Efficient Inference for Coupled Hidden Markov Models in Continuous Time and Discrete Space
Efficient Inference for Coupled Hidden Markov Models in Continuous Time and Discrete Space arXiv:2510.12916v1 Announce Type: new Abstract: Systems of interacting continuous-time Markov chains are a powerful model class, but inference is typically intractable in high-dimensional settings. Auxiliary information, such as noisy observations, is typically only available at discrete times, and incorporating it via…
-
Simplicial Gaussian Models: Representation and Inference
Simplicial Gaussian Models: Representation and Inference arXiv:2510.12983v1 Announce Type: new Abstract: Probabilistic graphical models (PGMs) are powerful tools for representing statistical dependencies through graphs in high-dimensional systems. However, they are limited to pairwise interactions. In this work, we propose the simplicial Gaussian model (SGM), which extends Gaussian PGM to simplicial complexes. SGM jointly models random…
-
Conformal Inference for Open-Set and Imbalanced Classification
Conformal Inference for Open-Set and Imbalanced Classification arXiv:2510.13037v1 Announce Type: new Abstract: This paper presents a conformal prediction method for classification in highly imbalanced and open-set settings, where there are many possible classes and not all may be represented in the data. Existing approaches require a finite, known label space and typically involve random sample…
-
A Multi-dimensional Semantic Surprise Framework Based on Low-Entropy Semantic Manifolds for Fine-Grained Out-of-Distribution Detection
A Multi-dimensional Semantic Surprise Framework Based on Low-Entropy Semantic Manifolds for Fine-Grained Out-of-Distribution Detection arXiv:2510.13093v1 Announce Type: new Abstract: Out-of-Distribution (OOD) detection is a cornerstone for the safe deployment of AI systems in the open world. However, existing methods treat OOD detection as a binary classification problem, a cognitive flattening that fails to distinguish between…
-
Gaussian Certified Unlearning in High Dimensions: A Hypothesis Testing Approach
Gaussian Certified Unlearning in High Dimensions: A Hypothesis Testing Approach arXiv:2510.13094v1 Announce Type: new Abstract: Machine unlearning seeks to efficiently remove the influence of selected data while preserving generalization. Significant progress has been made in low dimensions $(p \ll n)$, but high dimensions pose serious theoretical challenges as standard optimization assumptions of $\Omega(1)$ strong convexity…
-
Dimension-Free Minimax Rates for Learning Pairwise Interactions in Attention-Style Models
Dimension-Free Minimax Rates for Learning Pairwise Interactions in Attention-Style Models arXiv:2510.11789v1 Announce Type: new Abstract: We study the convergence rate of learning pairwise interactions in single-layer attention-style models, where tokens interact through a weight matrix and a non-linear activation function. We prove that the minimax rate is $M^{-\frac{2\beta}{2\beta+1}}$ with $M$ being the sample size, depending…
-
On Thompson Sampling and Bilateral Uncertainty in Additive Bayesian Optimization
On Thompson Sampling and Bilateral Uncertainty in Additive Bayesian Optimization arXiv:2510.11792v1 Announce Type: new Abstract: In Bayesian Optimization (BO), additive assumptions can mitigate the twin difficulties of modeling and searching a complex function in high dimension. However, common acquisition functions, like the Additive Lower Confidence Bound, ignore pairwise covariances between dimensions, which we’ll call \textit{bilateral…
-
Active Subspaces in Infinite Dimension
Active Subspaces in Infinite Dimension arXiv:2510.11871v1 Announce Type: new Abstract: Active subspace analysis uses the leading eigenspace of the gradient’s second moment to conduct supervised dimension reduction. In this article, we extend this methodology to real-valued functionals on Hilbert space. We define an operator which coincides with the active subspace matrix when applied to a…
-
High-Probability Bounds For Heterogeneous Local Differential Privacy
High-Probability Bounds For Heterogeneous Local Differential Privacy arXiv:2510.11895v1 Announce Type: new Abstract: We study statistical estimation under local differential privacy (LDP) when users may hold heterogeneous privacy levels and accuracy must be guaranteed with high probability. Departing from the common in-expectation analyses, and for one-dimensional and multi-dimensional mean estimation problems, we develop finite sample upper…
-
Simplifying Optimal Transport through Schatten-$p$ Regularization
Simplifying Optimal Transport through Schatten-$p$ Regularization arXiv:2510.11910v1 Announce Type: new Abstract: We propose a new general framework for recovering low-rank structure in optimal transport using Schatten-$p$ norm regularization. Our approach extends existing methods that promote sparse and interpretable transport maps or plans, while providing a unified and principled family of convex programs that encourage low-dimensional…
-
Learning with Incomplete Context: Linear Contextual Bandits with Pretrained Imputation
Learning with Incomplete Context: Linear Contextual Bandits with Pretrained Imputation arXiv:2510.09908v1 Announce Type: new Abstract: The rise of large-scale pretrained models has made it feasible to generate predictive or synthetic features at low cost, raising the question of how to incorporate such surrogate predictions into downstream decision-making. We study this problem in the setting of…
-
Calibrating Generative Models
Calibrating Generative Models arXiv:2510.10020v1 Announce Type: new Abstract: Generative models frequently suffer miscalibration, wherein class probabilities and other statistics of the sampling distribution deviate from desired values. We frame calibration as a constrained optimization problem and seek the closest model in Kullback-Leibler divergence satisfying calibration constraints. To address the intractability of imposing these constraints exactly,…
-
Kernel Treatment Effects with Adaptively Collected Data
Kernel Treatment Effects with Adaptively Collected Data arXiv:2510.10245v1 Announce Type: new Abstract: Adaptive experiments improve efficiency by adjusting treatment assignments based on past outcomes, but this adaptivity breaks the i.i.d. assumptions that underpin classical asymptotics. At the same time, many questions of interest are distributional, extending beyond average effects. Kernel treatment effects (KTE) provide a…
-
Neural variational inference for cutting feedback during uncertainty propagation
Neural variational inference for cutting feedback during uncertainty propagation arXiv:2510.10268v1 Announce Type: new Abstract: In many scientific applications, uncertainty of estimates from an earlier (upstream) analysis needs to be propagated in subsequent (downstream) Bayesian analysis, without feedback. Cutting feedback methods, also termed cut-Bayes, achieve this by constructing a cut-posterior distribution that prevents backward information flow.…
-
On some practical challenges of conformal prediction
On some practical challenges of conformal prediction arXiv:2510.10324v1 Announce Type: new Abstract: Conformal prediction is a model-free machine learning method for creating prediction regions with a guaranteed coverage probability level. However, a data scientist often faces three challenges in practice: (i) the determination of a conformal prediction region is only approximate, jeopardizing the finite-sample validity…
-
A Representer Theorem for Hawkes Processes via Penalized Least Squares Minimization
A Representer Theorem for Hawkes Processes via Penalized Least Squares Minimization arXiv:2510.08916v1 Announce Type: new Abstract: The representer theorem is a cornerstone of kernel methods, which aim to estimate latent functions in reproducing kernel Hilbert spaces (RKHSs) in a nonparametric manner. Its significance lies in converting inherently infinite-dimensional optimization problems into finite-dimensional ones over dual…
-
Gradient-Guided Furthest Point Sampling for Robust Training Set Selection
Gradient-Guided Furthest Point Sampling for Robust Training Set Selection arXiv:2510.08906v1 Announce Type: new Abstract: Smart training set selection procedures enable the reduction of data needs and improve predictive robustness in machine learning problems relevant to chemistry. We introduce Gradient Guided Furthest Point Sampling (GGFPS), a simple extension of Furthest Point Sampling (FPS) that leverages molecular…
-
Mirror Flow Matching with Heavy-Tailed Priors for Generative Modeling on Convex Domains
Mirror Flow Matching with Heavy-Tailed Priors for Generative Modeling on Convex Domains arXiv:2510.08929v1 Announce Type: new Abstract: We study generative modeling on convex domains using flow matching and mirror maps, and identify two fundamental challenges. First, standard log-barrier mirror maps induce heavy-tailed dual distributions, leading to ill-posed dynamics. Second, coupling with Gaussian priors performs poorly…
-
Distributionally robust approximation property of neural networks
Distributionally robust approximation property of neural networks arXiv:2510.09177v1 Announce Type: new Abstract: The universal approximation property uniformly with respect to weakly compact families of measures is established for several classes of neural networks. To that end, we prove that these neural networks are dense in Orlicz spaces, thereby extending classical universal approximation theorems even beyond…
-
A unified Bayesian framework for adversarial robustness
A unified Bayesian framework for adversarial robustness arXiv:2510.09288v1 Announce Type: new Abstract: The vulnerability of machine learning models to adversarial attacks remains a critical security challenge. Traditional defenses, such as adversarial training, typically robustify models by minimizing a worst-case loss. However, these deterministic approaches do not account for uncertainty in the adversary’s attack. While stochastic…