Category: stat.ML

  • kNNSampler: Stochastic Imputations for Recovering Missing Value Distributions

    kNNSampler: Stochastic Imputations for Recovering Missing Value Distributions arXiv:2509.08366v1 Announce Type: new Abstract: We study a missing-value imputation method, termed kNNSampler, that imputes a given unit’s missing response by randomly sampling from the observed responses of the $k$ most similar units to the given unit in terms of the observed covariates. This method can sample…
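
    The sampling step described in this abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the function name `knn_sampler_impute`, the use of Euclidean distance on the covariates, and the encoding of missing responses as NaN are all assumptions made for the example.

```python
import numpy as np

def knn_sampler_impute(X, y, k=5, rng=None):
    """Impute missing entries of y (NaN) by sampling one observed response
    uniformly at random from the k nearest observed units in covariate space.

    Hypothetical helper for illustration; Euclidean distance is an assumption.
    """
    rng = np.random.default_rng(rng)
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)

    observed = ~np.isnan(y)
    X_obs, y_obs = X[observed], y[observed]
    if y_obs.size == 0:
        raise ValueError("at least one observed response is required")

    k_eff = min(k, y_obs.size)  # cannot use more neighbors than observed units
    y_imputed = y.copy()
    for i in np.flatnonzero(~observed):
        # Distances from unit i to every unit with an observed response.
        dists = np.linalg.norm(X_obs - X[i], axis=1)
        neighbors = np.argsort(dists)[:k_eff]
        # Draw one neighbor at random and copy its response.
        y_imputed[i] = y_obs[rng.choice(neighbors)]
    return y_imputed
```

    Because each imputed value is a random draw rather than a neighborhood average, repeated calls with different seeds give different imputations, which is what allows the imputed values to reflect a distribution rather than a single point estimate.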

  • Gaussian Process Regression — Neural Network Hybrid with Optimized Redundant Coordinates

    Gaussian Process Regression — Neural Network Hybrid with Optimized Redundant Coordinates arXiv:2509.08457v1 Announce Type: new Abstract: Recently, a Gaussian Process Regression – neural network (GPRNN) hybrid machine learning method was proposed, which is based on additive-kernel GPR in redundant coordinates constructed by rules [J. Phys. Chem. A 127 (2023) 7823]. The method combined the expressive…

  • PEHRT: A Common Pipeline for Harmonizing Electronic Health Record data for Translational Research

    PEHRT: A Common Pipeline for Harmonizing Electronic Health Record data for Translational Research arXiv:2509.08553v1 Announce Type: new Abstract: Integrative analysis of multi-institutional Electronic Health Record (EHR) data enhances the reliability and generalizability of translational research by leveraging larger, more diverse patient cohorts and incorporating multiple data modalities. However, harmonizing EHR data across institutions poses major…

  • Machine Learning with Multitype Protected Attributes: Intersectional Fairness through Regularisation

    Machine Learning with Multitype Protected Attributes: Intersectional Fairness through Regularisation arXiv:2509.08163v1 Announce Type: cross Abstract: Ensuring equitable treatment (fairness) across protected attributes (such as gender or ethnicity) is a critical issue in machine learning. Most existing literature focuses on binary classification, but achieving fairness in regression tasks, such as insurance pricing or hiring score assessments, is equally…

  • A hierarchical entropy method for the delocalization of bias in high-dimensional Langevin Monte Carlo

    A hierarchical entropy method for the delocalization of bias in high-dimensional Langevin Monte Carlo arXiv:2509.08619v1 Announce Type: new Abstract: The unadjusted Langevin algorithm is widely used for sampling from complex high-dimensional distributions. It is well known to be biased, with the bias typically scaling linearly with the dimension when measured in squared Wasserstein distance. However,…
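
    For readers unfamiliar with the unadjusted Langevin algorithm mentioned here, a minimal sketch follows. It shows only the standard update x ← x − h∇U(x) + √(2h)·ξ for a target proportional to exp(−U), not the paper's analysis; the function name `ula` and the standard Gaussian test target are assumptions chosen for illustration.

```python
import numpy as np

def ula(grad_U, x0, step, n_steps, rng=None):
    """Run the unadjusted Langevin algorithm for a target density ∝ exp(-U).

    x0 may hold many independent chains, one per row. There is no
    Metropolis accept/reject step, which is why the output is biased.
    """
    rng = np.random.default_rng(rng)
    x = np.array(x0, dtype=float)
    for _ in range(n_steps):
        noise = rng.standard_normal(x.shape)
        x = x - step * grad_U(x) + np.sqrt(2.0 * step) * noise
    return x

# Illustrative target: standard Gaussian, U(x) = |x|^2 / 2, so grad U(x) = x.
samples = ula(lambda x: x, x0=np.zeros((20000, 2)), step=0.5, n_steps=200, rng=0)
empirical_var = samples.var()
```

    For this Gaussian target the update is linear, so the stationary variance of each coordinate can be worked out by hand as 1/(1 − h/2) rather than 1. Every coordinate carries the same error, so the squared 2-Wasserstein distance to the target grows in proportion to the dimension, consistent with the linear scaling the abstract describes.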

  • NestGNN: A Graph Neural Network Framework Generalizing the Nested Logit Model for Travel Mode Choice

    NestGNN: A Graph Neural Network Framework Generalizing the Nested Logit Model for Travel Mode Choice arXiv:2509.07123v1 Announce Type: new Abstract: Nested logit (NL) has been commonly used for discrete choice analysis, including a wide range of applications such as travel mode choice, automobile ownership, or location decisions. However, the classical NL models are restricted by…

  • ADHAM: Additive Deep Hazard Analysis Mixtures for Interpretable Survival Regression

    ADHAM: Additive Deep Hazard Analysis Mixtures for Interpretable Survival Regression arXiv:2509.07108v1 Announce Type: new Abstract: Survival analysis is a fundamental tool for modeling time-to-event outcomes in healthcare. Recent advances have introduced flexible neural network approaches for improved predictive performance. However, most of these models do not provide interpretable insights into the association between exposures and…

  • Kernel VICReg for Self-Supervised Learning in Reproducing Kernel Hilbert Space

    Kernel VICReg for Self-Supervised Learning in Reproducing Kernel Hilbert Space arXiv:2509.07289v1 Announce Type: new Abstract: Self-supervised learning (SSL) has emerged as a powerful paradigm for representation learning by optimizing geometric objectives (such as invariance to augmentations, variance preservation, and feature decorrelation) without requiring labels. However, most existing methods operate in Euclidean space, limiting their ability to capture…

  • Identifying Neural Signatures from fMRI using Hybrid Principal Components Regression

    Identifying Neural Signatures from fMRI using Hybrid Principal Components Regression arXiv:2509.07300v1 Announce Type: new Abstract: Recent advances in neuroimaging analysis have enabled accurate decoding of mental state from brain activation patterns during functional magnetic resonance imaging scans. A commonly applied tool for this purpose is principal components regression regularized with the least absolute shrinkage and…

  • Asynchronous Gossip Algorithms for Rank-Based Statistical Methods

    Asynchronous Gossip Algorithms for Rank-Based Statistical Methods arXiv:2509.07543v1 Announce Type: new Abstract: As decentralized AI and edge intelligence become increasingly prevalent, ensuring robustness and trustworthiness in such distributed settings has become a critical issue, especially in the presence of corrupted or adversarial data. Traditional decentralized algorithms are vulnerable to data contamination as they typically rely on…

  • Cryo-EM as a Stochastic Inverse Problem

    Cryo-EM as a Stochastic Inverse Problem arXiv:2509.05541v1 Announce Type: new Abstract: Cryo-electron microscopy (Cryo-EM) enables high-resolution imaging of biomolecules, but structural heterogeneity remains a major challenge in 3D reconstruction. Traditional methods assume a discrete set of conformations, limiting their ability to recover continuous structural variability. In this work, we formulate cryo-EM reconstruction as a stochastic…

  • Robust variational neural posterior estimation for simulation-based inference

    Robust variational neural posterior estimation for simulation-based inference arXiv:2509.05724v1 Announce Type: new Abstract: Recent advances in neural density estimation have enabled powerful simulation-based inference (SBI) methods that can flexibly approximate Bayesian inference for intractable stochastic models. Although these methods have demonstrated reliable posterior estimation when the simulator accurately represents the underlying data generative process (GDP),…

  • Risk-averse Fair Multi-class Classification

    Risk-averse Fair Multi-class Classification arXiv:2509.05771v1 Announce Type: new Abstract: We develop a new classification framework based on the theory of coherent risk measures and systemic risk. The proposed approach is suitable for multi-class problems when the data is noisy, scarce (relative to the dimension of the problem), and the labeling might be unreliable. In the…

  • Fisher Random Walk: Automatic Debiasing Contextual Preference Inference for Large Language Model Evaluation

    Fisher Random Walk: Automatic Debiasing Contextual Preference Inference for Large Language Model Evaluation arXiv:2509.05852v1 Announce Type: new Abstract: Motivated by the need for rigorous and scalable evaluation of large language models, we study contextual preference inference for pairwise comparison functionals of context-dependent preference score functions across domains. Focusing on the contextual Bradley-Terry-Luce model, we develop…

  • Causal Clustering for Conditional Average Treatment Effects Estimation and Subgroup Discovery

    Causal Clustering for Conditional Average Treatment Effects Estimation and Subgroup Discovery arXiv:2509.05775v1 Announce Type: new Abstract: Estimating heterogeneous treatment effects is critical in domains such as personalized medicine, resource allocation, and policy evaluation. A central challenge lies in identifying subpopulations that respond differently to interventions, thereby enabling more targeted and effective decision-making. While clustering methods…

  • Any-Step Density Ratio Estimation via Interval-Annealed Secant Alignment

    Any-Step Density Ratio Estimation via Interval-Annealed Secant Alignment arXiv:2509.04852v1 Announce Type: new Abstract: Estimating density ratios is a fundamental problem in machine learning, but existing methods often trade off accuracy for efficiency. We propose Interval-annealed Secant Alignment Density Ratio Estimation (ISA-DRE), a framework that enables accurate, any-step estimation without numerical integration. Instead of modeling infinitesimal…

  • Optimal Variance and Covariance Estimation under Differential Privacy in the Add-Remove Model and Beyond

    Optimal Variance and Covariance Estimation under Differential Privacy in the Add-Remove Model and Beyond arXiv:2509.04919v1 Announce Type: new Abstract: In this paper, we study the problem of estimating the variance and covariance of datasets under differential privacy in the add-remove model. While estimation in the swap model has been extensively studied in the literature, the…

  • Probabilistic operator learning: generative modeling and uncertainty quantification for foundation models of differential equations

    Probabilistic operator learning: generative modeling and uncertainty quantification for foundation models of differential equations arXiv:2509.05186v1 Announce Type: new Abstract: In-context operator networks (ICON) are a class of operator learning methods based on the novel architectures of foundation models. Trained on a diverse set of datasets of initial and boundary conditions paired with corresponding solutions to…

  • Spectral Algorithms in Misspecified Regression: Convergence under Covariate Shift

    Spectral Algorithms in Misspecified Regression: Convergence under Covariate Shift arXiv:2509.05106v1 Announce Type: new Abstract: This paper investigates the convergence properties of spectral algorithms — a class of regularization methods originating from inverse problems — under covariate shift. In this setting, the marginal distributions of inputs differ between source and target domains, while the conditional distribution…

  • Fundamental bounds on efficiency-confidence trade-off for transductive conformal prediction

    Fundamental bounds on efficiency-confidence trade-off for transductive conformal prediction arXiv:2509.04631v1 Announce Type: cross Abstract: Transductive conformal prediction addresses the simultaneous prediction for multiple data points. Given a desired confidence level, the objective is to construct a prediction set that includes the true outcomes with the prescribed confidence. We demonstrate a fundamental trade-off between confidence and…

  • Energy-Weighted Flow Matching: Unlocking Continuous Normalizing Flows for Efficient and Scalable Boltzmann Sampling

    Energy-Weighted Flow Matching: Unlocking Continuous Normalizing Flows for Efficient and Scalable Boltzmann Sampling arXiv:2509.03726v1 Announce Type: new Abstract: Sampling from unnormalized target distributions, e.g. Boltzmann distributions $\mu_{\text{target}}(x) \propto \exp(-E(x)/T)$, is fundamental to many scientific applications yet computationally challenging due to complex, high-dimensional energy landscapes. Existing approaches applying modern generative models to Boltzmann distributions either require…

  • Testing for correlation between network structure and high-dimensional node covariates

    Testing for correlation between network structure and high-dimensional node covariates arXiv:2509.03772v1 Announce Type: new Abstract: In many application domains, networks are observed with node-level features. In such settings, a common problem is to assess whether or not nodal covariates are correlated with the network structure itself. Here, we present four novel methods for addressing this…

  • Diffusion Generative Models Meet Compressed Sensing, with Applications to Image Data and Financial Time Series

    Diffusion Generative Models Meet Compressed Sensing, with Applications to Image Data and Financial Time Series arXiv:2509.03898v1 Announce Type: new Abstract: This paper develops dimension reduction techniques for accelerating diffusion model inference in the context of synthetic data generation. The idea is to integrate compressed sensing into diffusion models: (i) compress the data into a latent…

  • Batched Stochastic Matching Bandits

    Batched Stochastic Matching Bandits arXiv:2509.04194v1 Announce Type: new Abstract: In this study, we introduce a novel bandit framework for stochastic matching based on the Multinomial Logit (MNL) choice model. In our setting, $N$ agents on one side are assigned to $K$ arms on the other side, where each arm stochastically selects an agent from its…

  • An invertible generative model for forward and inverse problems

    An invertible generative model for forward and inverse problems arXiv:2509.03910v1 Announce Type: new Abstract: We formulate the inverse problem in a Bayesian framework and aim to train a generative model that allows us to simulate (i.e., sample from the likelihood) and do inference (i.e., sample from the posterior). We review the use of triangular normalizing…

  • Fast kernel methods: Sobolev, physics-informed, and additive models

    Fast kernel methods: Sobolev, physics-informed, and additive models arXiv:2509.02649v1 Announce Type: new Abstract: Kernel methods are powerful tools in statistical learning, but their cubic complexity in the sample size n limits their use on large-scale datasets. In this work, we introduce a scalable framework for kernel regression with O(n log n) complexity, fully leveraging GPU…

  • Gaussian process surrogate with physical law-corrected prior for multi-coupled PDEs defined on irregular geometry

    Gaussian process surrogate with physical law-corrected prior for multi-coupled PDEs defined on irregular geometry arXiv:2509.02617v1 Announce Type: new Abstract: Parametric partial differential equations (PDEs) are fundamental mathematical tools for modeling complex physical systems, yet their numerical evaluation across parameter spaces remains computationally intensive when using conventional high-fidelity solvers. To address this challenge, we propose a…

  • Scale-Adaptive Generative Flows for Multiscale Scientific Data

    Scale-Adaptive Generative Flows for Multiscale Scientific Data arXiv:2509.02971v1 Announce Type: new Abstract: Flow-based generative models can face significant challenges when modeling scientific data with multiscale Fourier spectra, often producing large errors in fine-scale features. We address this problem within the framework of stochastic interpolants, via principled design of noise distributions and interpolation schedules. The key…

  • Bayesian Additive Regression Trees for functional ANOVA model

    Bayesian Additive Regression Trees for functional ANOVA model arXiv:2509.03317v1 Announce Type: new Abstract: Bayesian Additive Regression Trees (BART) is a powerful statistical model that leverages the strengths of Bayesian inference and regression trees. It has received significant attention for capturing complex non-linear relationships and interactions among predictors. However, the accuracy of BART often comes at…

  • Understanding and Improving the Shampoo Optimizer via Kullback-Leibler Minimization

    Understanding and Improving the Shampoo Optimizer via Kullback-Leibler Minimization arXiv:2509.03378v1 Announce Type: new Abstract: As an adaptive method, Shampoo employs a structured second-moment estimation, and its effectiveness has attracted growing attention. Prior work has primarily analyzed its estimation scheme through the Frobenius norm. Motivated by the natural connection between the second moment and a covariance…

  • Simulation-based inference of yeast centromeres

    Simulation-based inference of yeast centromeres arXiv:2509.00200v1 Announce Type: new Abstract: Chromatin folding and the spatial arrangement of chromosomes in the cell play a crucial role in DNA replication and gene expression. Improper chromatin folding could lead to malfunctions and, over time, diseases. For eukaryotes, centromeres are essential for proper chromosome segregation and folding.…

  • Assessing One-Dimensional Cluster Stability by Extreme-Point Trimming

    Assessing One-Dimensional Cluster Stability by Extreme-Point Trimming arXiv:2509.00258v1 Announce Type: new Abstract: We develop a probabilistic method for assessing the tail behavior and geometric stability of n i.i.d. one-dimensional samples by tracking how their span contracts when the most extreme points are trimmed. Central to our approach is the diameter-shrinkage ratio, which quantifies the relative…

  • Probit Monotone BART

    Probit Monotone BART arXiv:2509.00263v1 Announce Type: new Abstract: Bayesian Additive Regression Trees (BART) of Chipman et al. (2010) has proven to be a powerful tool for nonparametric modeling and prediction. Monotone BART (Chipman et al., 2022) is a recent development that allows BART to be more precise in estimating monotonic functions. We further these developments…

  • The Nondecreasing Rank

    The Nondecreasing Rank arXiv:2509.00265v1 Announce Type: new Abstract: In this article the notion of the nondecreasing (ND) rank of a matrix or tensor is introduced. A tensor has an ND rank of r if it can be represented as a sum of r outer products of vectors, with each vector satisfying a monotonicity constraint. It…

  • Partial Functional Dynamic Backdoor Diffusion-based Causal Model

    Partial Functional Dynamic Backdoor Diffusion-based Causal Model arXiv:2509.00472v1 Announce Type: new Abstract: We introduce a Partial Functional Dynamic Backdoor Diffusion-based Causal Model (PFD-BDCM), specifically designed for causal inference in the presence of unmeasured confounders with spatial heterogeneity and temporal dependency. The proposed PFD-BDCM framework addresses the restrictions of the existing approaches by uniquely integrating models…

  • Quantum-inspired probability metrics define a complete, universal space for statistical learning

    Quantum-inspired probability metrics define a complete, universal space for statistical learning arXiv:2508.21086v1 Announce Type: new Abstract: Comparing probability distributions is a core challenge across the natural, social, and computational sciences. Existing methods, such as Maximum Mean Discrepancy (MMD), struggle in high-dimensional and non-compact domains. Here we introduce quantum probability metrics (QPMs), derived by embedding probability…

  • Weighted Support Points from Random Measures: An Interpretable Alternative for Generative Modeling

    Weighted Support Points from Random Measures: An Interpretable Alternative for Generative Modeling arXiv:2508.21255v1 Announce Type: new Abstract: Support points summarize a large dataset through a smaller set of representative points that can be used for data operations, such as Monte Carlo integration, without requiring access to the full dataset. In this sense, support points offer…

  • Adaptive generative moment matching networks for improved learning of dependence structures

    Adaptive generative moment matching networks for improved learning of dependence structures arXiv:2508.21531v1 Announce Type: new Abstract: An adaptive bandwidth selection procedure for the mixture kernel in the maximum mean discrepancy (MMD) for fitting generative moment matching networks (GMMNs) is introduced, and its ability to improve the learning of copula random number generators is demonstrated. Based…

  • Privacy Auditing Synthetic Data Release through Local Likelihood Attacks

    Privacy Auditing Synthetic Data Release through Local Likelihood Attacks arXiv:2508.21146v1 Announce Type: cross Abstract: Auditing the privacy leakage of synthetic data is an important but unresolved problem. Most existing privacy auditing frameworks for synthetic data rely on heuristics and unreasonable assumptions to attack the failure modes of generative models, exhibiting limited capability to describe and…

  • BED-LLM: Intelligent Information Gathering with LLMs and Bayesian Experimental Design

    BED-LLM: Intelligent Information Gathering with LLMs and Bayesian Experimental Design arXiv:2508.21184v1 Announce Type: cross Abstract: We propose a general-purpose approach for improving the ability of Large Language Models (LLMs) to intelligently and adaptively gather information from a user or other external source using the framework of sequential Bayesian experimental design (BED). This enables LLMs to…

  • Stochastic Gradients under Nuisances

    Stochastic Gradients under Nuisances arXiv:2508.20326v1 Announce Type: new Abstract: Stochastic gradient optimization is the dominant learning paradigm for a variety of scenarios, from classical supervised learning to modern self-supervised learning. We consider stochastic gradient algorithms for learning problems whose objectives rely on unknown nuisance parameters, and establish non-asymptotic convergence guarantees. Our results show that, while…

  • Towards Trustworthy Amortized Bayesian Model Comparison

    Towards Trustworthy Amortized Bayesian Model Comparison arXiv:2508.20614v1 Announce Type: new Abstract: Amortized Bayesian model comparison (BMC) enables fast probabilistic ranking of models via simulation-based training of neural surrogates. However, the reliability of neural surrogates deteriorates when simulation models are misspecified – the very case where model comparison is most needed. Thus, we supplement simulation-based training…

  • Polynomial Chaos Expansion for Operator Learning

    Polynomial Chaos Expansion for Operator Learning arXiv:2508.20886v1 Announce Type: new Abstract: Operator learning (OL) has emerged as a powerful tool in scientific machine learning (SciML) for approximating mappings between infinite-dimensional functional spaces. One of its main applications is learning the solution operator of partial differential equations (PDEs). While much of the progress in this area…

  • Transfer Learning for Classification under Decision Rule Drift with Application to Optimal Individualized Treatment Rule Estimation

    Transfer Learning for Classification under Decision Rule Drift with Application to Optimal Individualized Treatment Rule Estimation arXiv:2508.20942v1 Announce Type: new Abstract: In this paper, we extend the transfer learning classification framework from regression function-based methods to decision rules. We propose a novel methodology for modeling posterior drift through Bayes decision rules. By exploiting the geometric…

  • Discovering equations from data: symbolic regression in dynamical systems

    Discovering equations from data: symbolic regression in dynamical systems arXiv:2508.20257v1 Announce Type: cross Abstract: The process of discovering equations from data lies at the heart of physics and in many other areas of research, including mathematical ecology and epidemiology. Recently, machine learning methods known as symbolic regression have automated this process. As several methods are…

  • Fractal Flow: Hierarchical and Interpretable Normalizing Flow via Topic Modeling and Recursive Strategy

    Fractal Flow: Hierarchical and Interpretable Normalizing Flow via Topic Modeling and Recursive Strategy arXiv:2508.19750v1 Announce Type: new Abstract: Normalizing Flows provide a principled framework for high-dimensional density estimation and generative modeling by constructing invertible transformations with tractable Jacobian determinants. We propose Fractal Flow, a novel normalizing flow architecture that enhances both expressiveness and interpretability through…

  • Conditional Normalizing Flow Surrogate for Monte Carlo Prediction of Radiative Properties in Nanoparticle-Embedded Layers

    Conditional Normalizing Flow Surrogate for Monte Carlo Prediction of Radiative Properties in Nanoparticle-Embedded Layers arXiv:2508.19841v1 Announce Type: new Abstract: We present a probabilistic, data-driven surrogate model for predicting the radiative properties of nanoparticle embedded scattering media. The model uses conditional normalizing flows, which learn the conditional distribution of optical outputs, including reflectance, absorbance, and transmittance,…

  • The Information Dynamics of Generative Diffusion

    The Information Dynamics of Generative Diffusion arXiv:2508.19897v1 Announce Type: new Abstract: Generative diffusion models have emerged as a powerful class of models in machine learning, yet a unified theoretical understanding of their operation is still developing. This perspective paper provides an integrated perspective on generative diffusion by connecting their dynamic, information-theoretic, and thermodynamic properties under…

  • Track Component Failure Detection Using Data Analytics over existing STDS Track Circuit data

    Track Component Failure Detection Using Data Analytics over existing STDS Track Circuit data arXiv:2508.11693v1 Announce Type: cross Abstract: Track Circuits (TC) are the main signalling devices used to detect the presence of a train on a rail track. They have been used since the 19th century, and nowadays there are many types depending on the…

  • Physics-Informed Regression: Parameter Estimation in Parameter-Linear Nonlinear Dynamic Models

    Physics-Informed Regression: Parameter Estimation in Parameter-Linear Nonlinear Dynamic Models arXiv:2508.19249v1 Announce Type: cross Abstract: We present a new efficient hybrid parameter estimation method based on the idea that if nonlinear dynamic models are stated in terms of a system of equations that is linear in terms of the parameters, then regularized ordinary least squares can…

  • Deterministic Coreset Construction via Adaptive Sensitivity Trimming

    Deterministic Coreset Construction via Adaptive Sensitivity Trimming arXiv:2508.18340v1 Announce Type: new Abstract: We develop a rigorous framework for deterministic coreset construction in empirical risk minimization (ERM). Our central contribution is the Adaptive Deterministic Uniform-Weight Trimming (ADUWT) algorithm, which constructs a coreset by excising points with the lowest sensitivity bounds and applying a data-dependent uniform weight…

  • Revisiting Follow-the-Perturbed-Leader with Unbounded Perturbations in Bandit Problems

    Revisiting Follow-the-Perturbed-Leader with Unbounded Perturbations in Bandit Problems arXiv:2508.18604v1 Announce Type: new Abstract: Follow-the-Regularized-Leader (FTRL) policies have achieved Best-of-Both-Worlds (BOBW) results in various settings through hybrid regularizers, whereas analogous results for Follow-the-Perturbed-Leader (FTPL) remain limited due to inherent analytical challenges. To advance the analytical foundations of FTPL, we revisit classical FTRL-FTPL duality for unbounded perturbations…

  • Efficient Best-of-Both-Worlds Algorithms for Contextual Combinatorial Semi-Bandits

    Efficient Best-of-Both-Worlds Algorithms for Contextual Combinatorial Semi-Bandits arXiv:2508.18768v1 Announce Type: new Abstract: We introduce the first best-of-both-worlds algorithm for contextual combinatorial semi-bandits that simultaneously guarantees $\widetilde{\mathcal{O}}(\sqrt{T})$ regret in the adversarial regime and $\widetilde{\mathcal{O}}(\ln T)$ regret in the corrupted stochastic regime. Our approach builds on the Follow-the-Regularized-Leader (FTRL) framework equipped with a Shannon entropy regularizer, yielding…

  • Sparse minimum Redundancy Maximum Relevance for feature selection

    Sparse minimum Redundancy Maximum Relevance for feature selection arXiv:2508.18901v1 Announce Type: new Abstract: We propose a feature screening method that integrates both feature-feature and feature-target relationships. Inactive features are identified via a penalized minimum Redundancy Maximum Relevance (mRMR) procedure, which is the continuous version of the classic mRMR penalized by a non-convex regularizer, and where…

  • Echoes of the past: A unified perspective on fading memory and echo states

    Echoes of the past: A unified perspective on fading memory and echo states arXiv:2508.19145v1 Announce Type: new Abstract: Recurrent neural networks (RNNs) have become increasingly popular in information processing tasks involving time series and temporal data. A fundamental property of RNNs is their ability to create reliable input/output responses, often linked to how the network…

  • GraphPPD: Posterior Predictive Modelling for Graph-Level Inference

    GraphPPD: Posterior Predictive Modelling for Graph-Level Inference arXiv:2508.16995v1 Announce Type: new Abstract: Accurate modelling and quantification of predictive uncertainty is crucial in deep learning since it allows a model to make safer decisions when the data is ambiguous and facilitates the users’ understanding of the model’s confidence in its predictions. Along with the tremendously increasing…

  • Limitations of refinement methods for weak to strong generalization

    Limitations of refinement methods for weak to strong generalization arXiv:2508.17018v1 Announce Type: new Abstract: Standard techniques for aligning large language models (LLMs) utilize human-produced data, which could limit the capability of any aligned LLM to human level. Label refinement and weak training have emerged as promising strategies to address this superalignment problem. In this work,…

  • CP4SBI: Local Conformal Calibration of Credible Sets in Simulation-Based Inference

    CP4SBI: Local Conformal Calibration of Credible Sets in Simulation-Based Inference arXiv:2508.17077v1 Announce Type: new Abstract: Experimental scientists increasingly rely on simulation-based inference (SBI) to invert complex non-linear models with intractable likelihoods. However, posterior approximations obtained with SBI are often miscalibrated, causing credible regions to undercover true parameters. We develop $\texttt{CP4SBI}$, a model-agnostic…

  • Neural Stochastic Differential Equations on Compact State-Spaces

    Neural Stochastic Differential Equations on Compact State-Spaces arXiv:2508.17090v1 Announce Type: new Abstract: Many modern probabilistic models rely on SDEs, but their adoption is hampered by instability, poor inductive bias outside bounded domains, and reliance on restrictive dynamics or training tricks. While recent work constrains SDEs to compact spaces using reflected dynamics, these approaches lack continuous…

  • Rao Differential Privacy

    Rao Differential Privacy arXiv:2508.17135v1 Announce Type: new Abstract: Differential privacy (DP) has recently emerged as a definition of privacy to release private estimates. DP calibrates noise to be on the order of an individual's contribution. Due to this calibration, a private estimate obscures any individual while preserving the utility of the estimate. Since the…

  • Interpretable Kernels

    Interpretable Kernels arXiv:2508.15932v1 Announce Type: new Abstract: The use of kernels for nonlinear prediction is widespread in machine learning. They have been popularized in support vector machines and used in kernel ridge regression, amongst others. Kernel methods share three aspects. First, instead of the original matrix of predictor variables or features, each observation is mapped…

  • Optimal Dynamic Regret by Transformers for Non-Stationary Reinforcement Learning

    Optimal Dynamic Regret by Transformers for Non-Stationary Reinforcement Learning arXiv:2508.16027v1 Announce Type: new Abstract: Transformers have demonstrated exceptional performance across a wide range of domains. While their ability to perform reinforcement learning in-context has been established both theoretically and empirically, their behavior in non-stationary environments remains less understood. In this study, we address this gap…

  • A Sharp KL-Convergence Analysis for Diffusion Models under Minimal Assumptions

    A Sharp KL-Convergence Analysis for Diffusion Models under Minimal Assumptions arXiv:2508.16306v1 Announce Type: new Abstract: Diffusion-based generative models have emerged as highly effective methods for synthesizing high-quality samples. Recent works have focused on analyzing the convergence of their generation process with minimal assumptions, either through reverse SDEs or Probability Flow ODEs. The best known guarantees,…

  • Deep Intrinsic Coregionalization Multi-Output Gaussian Process Surrogate with Active Learning

    Deep Intrinsic Coregionalization Multi-Output Gaussian Process Surrogate with Active Learning arXiv:2508.16434v1 Announce Type: new Abstract: Deep Gaussian Processes (DGPs) are powerful surrogate models known for their flexibility and ability to capture complex functions. However, extending them to multi-output settings remains challenging due to the need for efficient dependency modeling. We propose the Deep Intrinsic Coregionalization…

  • Underdamped Langevin MCMC with third order convergence

    Underdamped Langevin MCMC with third order convergence arXiv:2508.16485v1 Announce Type: new Abstract: In this paper, we propose a new numerical method for the underdamped Langevin diffusion (ULD) and present a non-asymptotic analysis of its sampling error in the 2-Wasserstein distance when the $d$-dimensional target distribution $p(x)\propto e^{-f(x)}$ is strongly log-concave and has varying degrees of…

  • Kernel-based Equalized Odds: A Quantification of Accuracy-Fairness Trade-off in Fair Representation Learning

    Kernel-based Equalized Odds: A Quantification of Accuracy-Fairness Trade-off in Fair Representation Learning arXiv:2508.15084v1 Announce Type: new Abstract: This paper introduces a novel kernel-based formulation of the Equalized Odds (EO) criterion, denoted as $EO_k$, for fair representation learning (FRL) in supervised settings. The central goal of FRL is to mitigate discrimination regarding a sensitive attribute $S$…

  • Bayesian Inference and Learning in Nonlinear Dynamical Systems: A Framework for Incorporating Explicit and Implicit Prior Knowledge

    Bayesian Inference and Learning in Nonlinear Dynamical Systems: A Framework for Incorporating Explicit and Implicit Prior Knowledge arXiv:2508.15345v1 Announce Type: new Abstract: Accuracy and generalization capabilities are key objectives when learning dynamical system models. To obtain such models from limited data, current works exploit prior knowledge and assumptions about the system. However, the fusion of…

  • Bayesian Optimization with Expected Improvement: No Regret and the Choice of Incumbent

    Bayesian Optimization with Expected Improvement: No Regret and the Choice of Incumbent arXiv:2508.15674v1 Announce Type: new Abstract: Expected improvement (EI) is one of the most widely used acquisition functions in Bayesian optimization (BO). Despite its proven empirical success in applications, the cumulative regret upper bound of EI remains an open question. In this paper, we…

  • Tree-like Pairwise Interaction Networks

    Tree-like Pairwise Interaction Networks arXiv:2508.15678v1 Announce Type: new Abstract: Modeling feature interactions in tabular data remains a key challenge in predictive modeling, for example, as used for insurance pricing. This paper proposes the Tree-like Pairwise Interaction Network (PIN), a novel neural network architecture that explicitly captures pairwise feature interactions through a shared feed-forward neural network…

  • Can synthetic data reproduce real-world findings in epidemiology? A replication study using tree-based generative AI

    Can synthetic data reproduce real-world findings in epidemiology? A replication study using tree-based generative AI arXiv:2508.14936v1 Announce Type: cross Abstract: Generative artificial intelligence for synthetic data generation holds substantial potential to address practical challenges in epidemiology. However, many current methods suffer from limited quality, high computational demands, and complexity for non-experts. Furthermore, common evaluation strategies…

  • Comparing Model-agnostic Feature Selection Methods through Relative Efficiency

    Comparing Model-agnostic Feature Selection Methods through Relative Efficiency arXiv:2508.14268v1 Announce Type: new Abstract: Feature selection and importance estimation in a model-agnostic setting is an ongoing challenge of significant interest. Wrapper methods are commonly used because they are typically model-agnostic, even though they are computationally intensive. In this paper, we focus on feature selection methods related…

  • Evaluation and Optimization of Leave-one-out Cross-validation for the Lasso

    Evaluation and Optimization of Leave-one-out Cross-validation for the Lasso arXiv:2508.14368v1 Announce Type: new Abstract: I develop an algorithm to produce the piecewise quadratic that computes leave-one-out cross-validation for the lasso as a function of its hyperparameter. The algorithm can be used to find exact hyperparameters that optimize leave-one-out cross-validation either globally or locally, and its…

  • The C-index Multiverse

    The C-index Multiverse arXiv:2508.14821v1 Announce Type: new Abstract: Quantifying out-of-sample discrimination performance for time-to-event outcomes is a fundamental step for model evaluation and selection in the context of predictive modelling. The concordance index, or C-index, is a widely used metric for this purpose, particularly with the growing development of machine learning methods. Beyond differences between…

  • Noise Robust One-Class Intrusion Detection on Dynamic Graphs

    Noise Robust One-Class Intrusion Detection on Dynamic Graphs arXiv:2508.14192v1 Announce Type: cross Abstract: In the domain of network intrusion detection, robustness against contaminated and noisy data inputs remains a critical challenge. This study introduces a probabilistic version of the Temporal Graph Network Support Vector Data Description (TGN-SVDD) model, designed to enhance detection accuracy in the…

  • Optimal Subspace Embeddings: Resolving Nelson-Nguyen Conjecture Up to Sub-Polylogarithmic Factors

    Optimal Subspace Embeddings: Resolving Nelson-Nguyen Conjecture Up to Sub-Polylogarithmic Factors arXiv:2508.14234v1 Announce Type: cross Abstract: We give a proof of the conjecture of Nelson and Nguyen [FOCS 2013] on the optimal dimension and sparsity of oblivious subspace embeddings, up to sub-polylogarithmic factors: For any $n\geq d$ and $\epsilon\geq d^{-O(1)}$, there is a random $\tilde O(d/\epsilon^2)\times…

  • Preference Models assume Proportional Hazards of Utilities

    Preference Models assume Proportional Hazards of Utilities arXiv:2508.13189v1 Announce Type: new Abstract: Approaches for estimating preferences from human annotated data typically involve inducing a distribution over a ranked list of choices such as the Plackett-Luce model. Indeed, modern AI alignment tools such as Reward Modelling and Direct Preference Optimization are based on the statistical assumptions…

  • Flow Matching-Based Generative Modeling for Efficient and Scalable Data Assimilation

    Flow Matching-Based Generative Modeling for Efficient and Scalable Data Assimilation arXiv:2508.13313v1 Announce Type: new Abstract: Data assimilation (DA) is the problem of sequentially estimating the state of a dynamical system from noisy observations. Recent advances in generative modeling have inspired new approaches to DA in high-dimensional nonlinear settings, especially the ensemble score filter (EnSF). However,…

  • Structural Foundations for Leading Digit Laws: Beyond Probabilistic Mixtures

    Structural Foundations for Leading Digit Laws: Beyond Probabilistic Mixtures arXiv:2508.13237v1 Announce Type: new Abstract: This article presents a modern deterministic framework for the study of leading significant digit distributions in numerical data. Rather than relying on traditional probabilistic or mixture-based explanations, we demonstrate that the observed frequencies of leading digits are determined by the underlying…

  • Smooth Flow Matching

    Smooth Flow Matching arXiv:2508.13831v1 Announce Type: new Abstract: Functional data, i.e., smooth random functions observed over a continuous domain, are increasingly available in areas such as biomedical research, health informatics, and epidemiology. However, effective statistical analysis for functional data is often hindered by challenges such as privacy constraints, sparse and irregular sampling, infinite dimensionality, and…

  • Online Conformal Selection with Accept-to-Reject Changes

    Online Conformal Selection with Accept-to-Reject Changes arXiv:2508.13838v1 Announce Type: new Abstract: Selecting a subset of promising candidates from a large pool is crucial across various scientific and real-world applications. Conformal selection offers a distribution-free and model-agnostic framework for candidate selection with uncertainty quantification. While effective in offline settings, its application to online scenarios, where data…

  • BaMANI: Bayesian Multi-Algorithm causal Network Inference

    BaMANI: Bayesian Multi-Algorithm causal Network Inference arXiv:2508.11741v1 Announce Type: new Abstract: Improved computational power has enabled different disciplines to predict causal relationships among modeled variables using Bayesian network inference. While many alternative algorithms have been proposed to improve the efficiency and reliability of network prediction, the predicted causal networks reflect the generative process but also…

  • Dropping Just a Handful of Preferences Can Change Top Large Language Model Rankings

    Dropping Just a Handful of Preferences Can Change Top Large Language Model Rankings arXiv:2508.11847v1 Announce Type: new Abstract: We propose a method for evaluating the robustness of a widely used LLM ranking system — the Bradley–Terry ranking system — to dropping a worst-case very small fraction of evaluation data. Our approach is computationally fast and…

  • Robust Data Fusion via Subsampling

    Robust Data Fusion via Subsampling arXiv:2508.12048v1 Announce Type: new Abstract: Data fusion and transfer learning are rapidly growing fields that enhance model performance for a target population by leveraging other related data sources or tasks. The challenges lie in the various potential heterogeneities between the target and external data, as well as various practical concerns…

  • An Introduction to Sliced Optimal Transport

    An Introduction to Sliced Optimal Transport arXiv:2508.12519v1 Announce Type: new Abstract: Sliced Optimal Transport (SOT) is a rapidly developing branch of optimal transport (OT) that exploits the tractability of one-dimensional OT problems. By combining tools from OT, integral geometry, and computational statistics, SOT enables fast and scalable computation of distances, barycenters, and kernels for probability…

  • On computing and the complexity of computing higher-order $U$-statistics, exactly

    On computing and the complexity of computing higher-order $U$-statistics, exactly arXiv:2508.12627v1 Announce Type: new Abstract: Higher-order $U$-statistics abound in fields such as statistics, machine learning, and computer science, but are known to be highly time-consuming to compute in practice. Despite their widespread appearance, a comprehensive study of their computational complexity is surprisingly lacking. This paper…

  • Non-asymptotic convergence bound of conditional diffusion models

    Non-asymptotic convergence bound of conditional diffusion models arXiv:2508.10944v1 Announce Type: new Abstract: Learning and generating various types of data based on conditional diffusion models has been a research hotspot in recent years. Although conditional diffusion models have made considerable progress in improving acceleration algorithms and enhancing generation quality, the lack of non-asymptotic properties has hindered…

  • Counterfactual Survival Q Learning for Longitudinal Randomized Trials via Buckley James Boosting

    Counterfactual Survival Q Learning for Longitudinal Randomized Trials via Buckley James Boosting arXiv:2508.11060v1 Announce Type: new Abstract: We propose a Buckley James (BJ) Boost Q learning framework for estimating optimal dynamic treatment regimes under right censored survival data, tailored for longitudinal randomized clinical trial settings. The method integrates accelerated failure time models with iterative boosting…

  • Uniform convergence for Gaussian kernel ridge regression

    Uniform convergence for Gaussian kernel ridge regression arXiv:2508.11274v1 Announce Type: new Abstract: This paper establishes the first polynomial convergence rates for Gaussian kernel ridge regression (KRR) with a fixed hyperparameter in both the uniform and the $L^{2}$-norm. The uniform convergence result closes a gap in the theoretical understanding of KRR with the Gaussian kernel, where…

  • ADMIRE-BayesOpt: Accelerated Data MIxture RE-weighting for Language Models with Bayesian Optimization

    ADMIRE-BayesOpt: Accelerated Data MIxture RE-weighting for Language Models with Bayesian Optimization arXiv:2508.11551v1 Announce Type: new Abstract: Determining the optimal data mixture for large language model training remains a challenging problem with an outsized impact on performance. In practice, language model developers continue to rely on heuristic exploration since no learning-based approach has emerged as a…

  • Nonparametric learning of stochastic differential equations from sparse and noisy data

    Nonparametric learning of stochastic differential equations from sparse and noisy data arXiv:2508.11597v1 Announce Type: new Abstract: The paper proposes a systematic framework for building data-driven stochastic differential equation (SDE) models from sparse, noisy observations. Unlike traditional parametric approaches, which assume a known functional form for the drift, our goal here is to learn the entire…

  • Prediction-Powered Inference with Inverse Probability Weighting

    Prediction-Powered Inference with Inverse Probability Weighting arXiv:2508.10149v1 Announce Type: new Abstract: Prediction-powered inference (PPI) is a recent framework for valid statistical inference with partially labeled data, combining model-based predictions on a large unlabeled set with bias correction from a smaller labeled subset. We show that PPI can be extended to handle informative labeling by replacing…

  • Mo’ Memory, Mo’ Problems: Stream-Native Machine Unlearning

    Mo’ Memory, Mo’ Problems: Stream-Native Machine Unlearning arXiv:2508.10193v1 Announce Type: new Abstract: Machine unlearning work assumes a static, i.i.d. training environment that doesn’t truly exist. Modern ML pipelines need to learn, unlearn, and predict continuously on production streams of data. We translate the notion of the batch unlearning scenario to the online setting using notions…

  • Dimension-Free Bounds for Generalized First-Order Methods via Gaussian Coupling

    Dimension-Free Bounds for Generalized First-Order Methods via Gaussian Coupling arXiv:2508.10782v1 Announce Type: new Abstract: We establish non-asymptotic bounds on the finite-sample behavior of generalized first-order iterative algorithms — including gradient-based optimization methods and approximate message passing (AMP) — with Gaussian data matrices and full-memory, non-separable nonlinearities. The central result constructs an explicit coupling between the…

  • Conic Formulations of Transport Metrics for Unbalanced Measure Networks and Hypernetworks

    Conic Formulations of Transport Metrics for Unbalanced Measure Networks and Hypernetworks arXiv:2508.10888v1 Announce Type: new Abstract: The Gromov-Wasserstein (GW) variant of optimal transport, designed to compare probability densities defined over distinct metric spaces, has emerged as an important tool for the analysis of data with complex structure, such as ensembles of point clouds or networks.…

  • An Iterative Algorithm for Differentially Private $k$-PCA with Adaptive Noise

    An Iterative Algorithm for Differentially Private $k$-PCA with Adaptive Noise arXiv:2508.10879v1 Announce Type: new Abstract: Given $n$ i.i.d. random matrices $A_i \in \mathbb{R}^{d \times d}$ that share a common expectation $\Sigma$, the objective of Differentially Private Stochastic PCA is to identify a subspace of dimension $k$ that captures the largest variance directions of $\Sigma$, while…

  • Distributional Sensitivity Analysis: Enabling Differentiability in Sample-Based Inference

    Distributional Sensitivity Analysis: Enabling Differentiability in Sample-Based Inference arXiv:2508.09347v1 Announce Type: new Abstract: We present two analytical formulae for estimating the sensitivity — namely, the gradient or Jacobian — at given realizations of an arbitrary-dimensional random vector with respect to its distributional parameters. The first formula interprets this sensitivity as partial derivatives of the inverse…

  • A pseudo-inverse of a line graph

    A pseudo-inverse of a line graph arXiv:2508.09412v1 Announce Type: new Abstract: Line graphs are an alternative representation of graphs where each vertex of the original (root) graph becomes an edge. However, not all graphs have a corresponding root graph, hence the transformation from graphs to line graphs is not invertible. We investigate the case when…

  • Scalable h-adaptive probabilistic solver for time-independent and time-dependent systems

    Scalable h-adaptive probabilistic solver for time-independent and time-dependent systems arXiv:2508.09623v1 Announce Type: new Abstract: Solving partial differential equations (PDEs) within the framework of probabilistic numerics offers a principled approach to quantifying epistemic uncertainty arising from discretization. By leveraging Gaussian process regression and imposing the governing PDE as a constraint at a finite set of collocation…

  • Structured Kernel Regression VAE: A Computationally Efficient Surrogate for GP-VAEs in ICA

    Structured Kernel Regression VAE: A Computationally Efficient Surrogate for GP-VAEs in ICA arXiv:2508.09721v1 Announce Type: new Abstract: The interpretability of generative models is considered a key factor in demonstrating their effectiveness and controllability. The generated data are believed to be determined by latent variables that are not directly observable. Therefore, disentangling, decoupling, decomposing, causal inference,…

  • Objective Soups: Multilingual Multi-Task Modeling for Speech Processing

    Objective Soups: Multilingual Multi-Task Modeling for Speech Processing arXiv:2508.09228v1 Announce Type: cross Abstract: Training a single model for multilingual, multi-task speech processing (MSP) is severely hampered by conflicting objectives between tasks like speech recognition and translation. While multi-objective optimization (MOO) aims to align gradient updates, its effectiveness diminishes as the number of tasks grows, making…