Category: cs.LG

Hedging with memory: shallow and deep learning with signatures

Hedging with memory: shallow and deep learning with signatures arXiv:2508.02759v1 Announce Type: new Abstract: We investigate the use of path signatures in a machine learning context for hedging exotic derivatives under non-Markovian stochastic volatility models. In a deep learning setting, we use signatures as features in feedforward neural networks and show that they outperform LSTMs…

August 6, 2025
Supervised Dynamic Dimension Reduction with Deep Neural Network

Supervised Dynamic Dimension Reduction with Deep Neural Network arXiv:2508.03546v1 Announce Type: new Abstract: This paper studies the problem of dimension reduction, tailored to improving time series forecasting with high-dimensional predictors. We propose a novel Supervised Deep Dynamic Principal component analysis (SDDP) framework that incorporates the target variable and lagged observations into the factor extraction process.…

August 6, 2025
Likelihood Matching for Diffusion Models

Likelihood Matching for Diffusion Models arXiv:2508.03636v1 Announce Type: new Abstract: We propose a Likelihood Matching approach for training diffusion models by first establishing an equivalence between the likelihood of the target data distribution and a likelihood along the sample path of the reverse diffusion. To efficiently compute the reverse sample likelihood, a quasi-likelihood is considered…

August 6, 2025
Learning quadratic neural networks in high dimensions: SGD dynamics and scaling laws

Learning quadratic neural networks in high dimensions: SGD dynamics and scaling laws arXiv:2508.03688v1 Announce Type: new Abstract: We study the optimization and sample complexity of gradient-based training of a two-layer neural network with quadratic activation function in the high-dimensional regime, where the data is generated as $y \propto \sum_{j=1}^{r} \lambda_j \sigma\left(\langle \boldsymbol{\theta_j}, \boldsymbol{x} \rangle\right), \boldsymbol{x} \sim N(0, \boldsymbol{I}_d)$,…

August 6, 2025
Uncertainty Quantification for Large-Scale Deep Networks via Post-StoNet Modeling

Uncertainty Quantification for Large-Scale Deep Networks via Post-StoNet Modeling arXiv:2508.01217v1 Announce Type: new Abstract: Deep learning has revolutionized modern data science. However, how to accurately quantify the uncertainty of predictions from large-scale deep neural networks (DNNs) remains an unresolved issue. To address this issue, we introduce a novel post-processing approach. This approach feeds the output…

August 5, 2025
Inequalities for Optimization of Classification Algorithms: A Perspective Motivated by Diagnostic Testing

Inequalities for Optimization of Classification Algorithms: A Perspective Motivated by Diagnostic Testing arXiv:2508.01065v1 Announce Type: new Abstract: Motivated by canonical problems in medical diagnostics, we propose and study properties of an objective function that uniformly bounds uncertainties in quantities of interest extracted from classifiers and related data analysis tools. We begin by adopting a set-theoretic…

August 5, 2025
Flow IV: Counterfactual Inference In Nonseparable Outcome Models Using Instrumental Variables

Flow IV: Counterfactual Inference In Nonseparable Outcome Models Using Instrumental Variables arXiv:2508.01321v1 Announce Type: new Abstract: To reach human level intelligence, learning algorithms need to incorporate causal reasoning. But identifying causality, and particularly counterfactual reasoning, remains an elusive task. In this paper, we make progress on this task by utilizing instrumental variables (IVs). IVs are…

August 5, 2025
Debiasing Machine Learning Predictions for Causal Inference Without Additional Ground Truth Data: “One Map, Many Trials” in Satellite-Driven Poverty Analysis

Debiasing Machine Learning Predictions for Causal Inference Without Additional Ground Truth Data: “One Map, Many Trials” in Satellite-Driven Poverty Analysis arXiv:2508.01341v1 Announce Type: new Abstract: Machine learning models trained on Earth observation data, such as satellite imagery, have demonstrated significant promise in predicting household-level wealth indices, enabling the creation of high-resolution wealth maps that can…

August 5, 2025
Efficient optimization of expensive black-box simulators via marginal means, with application to neutrino detector design

Efficient optimization of expensive black-box simulators via marginal means, with application to neutrino detector design arXiv:2508.01834v1 Announce Type: new Abstract: With advances in scientific computing, computer experiments are increasingly used for optimizing complex systems. However, for modern applications, e.g., the optimization of nuclear physics detectors, each experiment run can require hundreds of CPU hours, making…

August 5, 2025
funOCLUST: Clustering Functional Data with Outliers

funOCLUST: Clustering Functional Data with Outliers arXiv:2508.00110v1 Announce Type: new Abstract: Functional data present unique challenges for clustering due to their infinite-dimensional nature and potential sensitivity to outliers. An extension of the OCLUST algorithm to the functional setting is proposed to address these issues. The approach leverages the OCLUST framework, creating a robust method to…

August 4, 2025
Sinusoidal Approximation Theorem for Kolmogorov-Arnold Networks

Sinusoidal Approximation Theorem for Kolmogorov-Arnold Networks arXiv:2508.00247v1 Announce Type: new Abstract: The Kolmogorov-Arnold representation theorem states that any continuous multivariable function can be exactly represented as a finite superposition of continuous single variable functions. Subsequent simplifications of this representation involve expressing these functions as parameterized sums of a smaller number of unique monotonic functions. These…

August 4, 2025
DO-EM: Density Operator Expectation Maximization

DO-EM: Density Operator Expectation Maximization arXiv:2507.22786v1 Announce Type: cross Abstract: Density operators, quantum generalizations of probability distributions, are gaining prominence in machine learning due to their foundational role in quantum computing. Generative modeling based on density operator models (DOMs) is an emerging field, but existing training algorithms, such as those for the Quantum Boltzmann…

August 4, 2025
Regime-Aware Conditional Neural Processes with Multi-Criteria Decision Support for Operational Electricity Price Forecasting

Regime-Aware Conditional Neural Processes with Multi-Criteria Decision Support for Operational Electricity Price Forecasting arXiv:2508.00040v1 Announce Type: cross Abstract: This work integrates Bayesian regime detection with conditional neural processes for 24-hour electricity price prediction in the German market. Our methodology integrates regime detection using a disentangled sticky hierarchical Dirichlet process hidden Markov model (DS-HDP-HMM) applied to…

August 4, 2025
A Smoothing Newton Method for Rank-one Matrix Recovery

A Smoothing Newton Method for Rank-one Matrix Recovery arXiv:2507.23017v1 Announce Type: new Abstract: We consider the phase retrieval problem, which involves recovering a rank-one positive semidefinite matrix from rank-one measurements. A recently proposed algorithm based on Bures-Wasserstein gradient descent (BWGD) exhibits superlinear convergence, but it is unstable, and existing theory can only prove local linear…

August 1, 2025
Optimal Transport Learning: Balancing Value Optimization and Fairness in Individualized Treatment Rules

Optimal Transport Learning: Balancing Value Optimization and Fairness in Individualized Treatment Rules arXiv:2507.23349v1 Announce Type: new Abstract: Individualized treatment rules (ITRs) have gained significant attention due to their wide-ranging applications in fields such as precision medicine, ridesharing, and advertising recommendations. However, when ITRs are influenced by sensitive attributes such as race, gender, or age, they…

August 1, 2025
DICOM De-Identification via Hybrid AI and Rule-Based Framework for Scalable, Uncertainty-Aware Redaction

DICOM De-Identification via Hybrid AI and Rule-Based Framework for Scalable, Uncertainty-Aware Redaction arXiv:2507.23736v1 Announce Type: new Abstract: Access to medical imaging and associated text data has the potential to drive major advances in healthcare research and patient outcomes. However, the presence of Protected Health Information (PHI) and Personally Identifiable Information (PII) in Digital Imaging and…

August 1, 2025
Scaled Beta Models and Feature Dilution for Dynamic Ticket Pricing

Scaled Beta Models and Feature Dilution for Dynamic Ticket Pricing arXiv:2507.23767v1 Announce Type: new Abstract: A novel approach is presented for identifying distinct signatures of performing acts in the secondary ticket resale market by analyzing dynamic pricing distributions. Using a newly curated, time series dataset from the SeatGeek API, we model ticket pricing distributions as…

August 1, 2025
Formal Bayesian Transfer Learning via the Total Risk Prior

Formal Bayesian Transfer Learning via the Total Risk Prior arXiv:2507.23768v1 Announce Type: new Abstract: In analyses with severe data-limitations, augmenting the target dataset with information from ancillary datasets in the application domain, called source datasets, can lead to significantly improved statistical procedures. However, existing methods for this transfer learning struggle to deal with situations where…

August 1, 2025
Simulating Posterior Bayesian Neural Networks with Dependent Weights

Simulating Posterior Bayesian Neural Networks with Dependent Weights arXiv:2507.22095v1 Announce Type: new Abstract: In this paper we consider posterior Bayesian fully connected and feedforward deep neural networks with dependent weights. Particularly, if the likelihood is Gaussian, we identify the distribution of the wide width limit and provide an algorithm to sample from the network. In…

July 31, 2025
Stacked SVD or SVD stacked? A Random Matrix Theory perspective on data integration

Stacked SVD or SVD stacked? A Random Matrix Theory perspective on data integration arXiv:2507.22170v1 Announce Type: new Abstract: Modern data analysis increasingly requires identifying shared latent structure across multiple high-dimensional datasets. A commonly used model assumes that the data matrices are noisy observations of low-rank matrices with a shared singular subspace. In this case, two…

July 31, 2025
LVM-GP: Uncertainty-Aware PDE Solver via coupling latent variable model and Gaussian process

LVM-GP: Uncertainty-Aware PDE Solver via coupling latent variable model and Gaussian process arXiv:2507.22493v1 Announce Type: new Abstract: We propose a novel probabilistic framework, termed LVM-GP, for uncertainty quantification in solving forward and inverse partial differential equations (PDEs) with noisy data. The core idea is to construct a stochastic mapping from the input to a high-dimensional…

July 31, 2025
Subgrid BoostCNN: Efficient Boosting of Convolutional Networks via Gradient-Guided Feature Selection

Subgrid BoostCNN: Efficient Boosting of Convolutional Networks via Gradient-Guided Feature Selection arXiv:2507.22842v1 Announce Type: new Abstract: Convolutional Neural Networks (CNNs) have achieved remarkable success across a wide range of machine learning tasks by leveraging hierarchical feature learning through deep architectures. However, the large number of layers and millions of parameters often make CNNs computationally expensive…

July 31, 2025
A Unified Analysis of Generalization and Sample Complexity for Semi-Supervised Domain Adaptation

A Unified Analysis of Generalization and Sample Complexity for Semi-Supervised Domain Adaptation arXiv:2507.22632v1 Announce Type: new Abstract: Domain adaptation seeks to leverage the abundant label information in a source domain to improve classification performance in a target domain with limited labels. While the field has seen extensive methodological development, its theoretical foundations remain relatively underexplored.…

July 31, 2025
Graph neural networks for residential location choice: connection to classical logit models

Graph neural networks for residential location choice: connection to classical logit models arXiv:2507.21334v1 Announce Type: new Abstract: Researchers have adopted deep learning for classical discrete choice analysis as it can capture complex feature relationships and achieve higher predictive performance. However, the existing deep learning approaches cannot explicitly capture the relationship among choice alternatives, which has…

July 30, 2025
From Sublinear to Linear: Fast Convergence in Deep Networks via Locally Polyak-Lojasiewicz Regions

From Sublinear to Linear: Fast Convergence in Deep Networks via Locally Polyak-Lojasiewicz Regions arXiv:2507.21429v1 Announce Type: new Abstract: The convergence of gradient descent (GD) on the non-convex loss landscapes of deep neural networks (DNNs) presents a fundamental theoretical challenge. While recent work has established that GD converges to a stationary point at a sublinear rate…

July 30, 2025
From Global to Local: A Scalable Benchmark for Local Posterior Sampling

From Global to Local: A Scalable Benchmark for Local Posterior Sampling arXiv:2507.21449v1 Announce Type: new Abstract: Degeneracy is an inherent feature of the loss landscape of neural networks, but it is not well understood how stochastic gradient MCMC (SGMCMC) algorithms interact with this degeneracy. In particular, current global convergence guarantees for common SGMCMC algorithms rely…

July 30, 2025
Measuring Sample Quality with Copula Discrepancies

Measuring Sample Quality with Copula Discrepancies arXiv:2507.21434v1 Announce Type: new Abstract: The scalable Markov chain Monte Carlo (MCMC) algorithms that underpin modern Bayesian machine learning, such as Stochastic Gradient Langevin Dynamics (SGLD), sacrifice asymptotic exactness for computational speed, creating a critical diagnostic gap: traditional sample quality measures fail catastrophically when applied to biased samplers. While…

July 30, 2025
Stochastic forest transition model dynamics and parameter estimation via deep learning

Stochastic forest transition model dynamics and parameter estimation via deep learning arXiv:2507.21486v1 Announce Type: new Abstract: Forest transitions, characterized by dynamic shifts between forest, agricultural, and abandoned lands, are complex phenomena. This study developed a stochastic differential equation model to capture the intricate dynamics of these transitions. We established the existence of global positive solutions…

July 30, 2025
Bayesian symbolic regression: Automated equation discovery from a physicists’ perspective

Bayesian symbolic regression: Automated equation discovery from a physicists’ perspective arXiv:2507.19540v1 Announce Type: new Abstract: Symbolic regression automates the process of learning closed-form mathematical models from data. Standard approaches to symbolic regression, as well as newer deep learning approaches, rely on heuristic model selection criteria, heuristic regularization, and heuristic exploration of model space. Here, we…

July 29, 2025
Adaptive Bayesian Data-Driven Design of Reliable Solder Joints for Micro-electronic Devices

Adaptive Bayesian Data-Driven Design of Reliable Solder Joints for Micro-electronic Devices arXiv:2507.19663v1 Announce Type: new Abstract: Solder joint reliability related to failures due to thermomechanical loading is a critically important yet physically complex engineering problem. As a result, simulated behavior is oftentimes computationally expensive. In an increasingly data-driven world, the usage of efficient data-driven design…

July 29, 2025
Sparse-mode Dynamic Mode Decomposition for Disambiguating Local and Global Structures

Sparse-mode Dynamic Mode Decomposition for Disambiguating Local and Global Structures arXiv:2507.19787v1 Announce Type: new Abstract: The dynamic mode decomposition (DMD) is a data-driven approach that extracts the dominant features from spatiotemporal data. In this work, we introduce sparse-mode DMD, a new variant of the optimized DMD framework that specifically leverages sparsity-promoting regularization in order to…

July 29, 2025
Bag of Coins: A Statistical Probe into Neural Confidence Structures

Bag of Coins: A Statistical Probe into Neural Confidence Structures arXiv:2507.19774v1 Announce Type: new Abstract: Modern neural networks, despite their high accuracy, often produce poorly calibrated confidence scores, limiting their reliability in high-stakes applications. Existing calibration methods typically post-process model outputs without interrogating the internal consistency of the predictions themselves. In this work, we introduce…

July 29, 2025
Predicting Parkinson’s Disease Progression Using Statistical and Neural Mixed Effects Models: A Comparative Study on Longitudinal Biomarkers

Predicting Parkinson’s Disease Progression Using Statistical and Neural Mixed Effects Models: A Comparative Study on Longitudinal Biomarkers arXiv:2507.20058v1 Announce Type: new Abstract: Predicting Parkinson’s Disease (PD) progression is crucial, and voice biomarkers offer a non-invasive method for tracking symptom severity (UPDRS scores) through telemonitoring. Analyzing this longitudinal data is challenging due to within-subject correlations and…

July 29, 2025
Central limit theorems for the eigenvalues of graph Laplacians on data clouds

Central limit theorems for the eigenvalues of graph Laplacians on data clouds arXiv:2507.18803v1 Announce Type: new Abstract: Given i.i.d. samples $X_n = \{ x_1, \dots, x_n \}$ from a distribution supported on a low dimensional manifold ${M}$ embedded in Euclidean space, we consider the graph Laplacian operator $\Delta_n$ associated to an $\varepsilon$-proximity graph over $X_n$ and…

July 28, 2025
Perfect Clustering in Very Sparse Diverse Multiplex Networks

Perfect Clustering in Very Sparse Diverse Multiplex Networks arXiv:2507.19423v1 Announce Type: new Abstract: The paper studies the DIverse MultiPLEx Signed Generalized Random Dot Product Graph (DIMPLE-SGRDPG) network model (Pensky (2024)), where all layers of the network have the same collection of nodes. In addition, all layers can be partitioned into groups such that the layers…

July 28, 2025
Probably Approximately Correct Causal Discovery

Probably Approximately Correct Causal Discovery arXiv:2507.18903v1 Announce Type: new Abstract: The discovery of causal relationships is a foundational problem in artificial intelligence, statistics, epidemiology, economics, and beyond. While elegant theories exist for accurate causal discovery given infinite data, real-world applications are inherently resource-constrained. Effective methods for inferring causal relationships from observational data must perform well…

July 28, 2025
Sliding Window Informative Canonical Correlation Analysis

Sliding Window Informative Canonical Correlation Analysis arXiv:2507.17921v1 Announce Type: new Abstract: Canonical correlation analysis (CCA) is a technique for finding correlated sets of features between two datasets. In this paper, we propose a novel extension of CCA to the online, streaming data setting: Sliding Window Informative Canonical Correlation Analysis (SWICCA). Our method uses a streaming…

July 25, 2025
A Two-armed Bandit Framework for A/B Testing

A Two-armed Bandit Framework for A/B Testing arXiv:2507.18118v1 Announce Type: new Abstract: A/B testing is widely used in modern technology companies for policy evaluation and product deployment, with the goal of comparing the outcomes under a newly-developed policy against a standard control. Various causal inference and reinforcement learning methods developed in the literature are applicable…

July 25, 2025
On Reconstructing Training Data From Bayesian Posteriors and Trained Models

On Reconstructing Training Data From Bayesian Posteriors and Trained Models arXiv:2507.18372v1 Announce Type: new Abstract: Publicly releasing the specification of a model with its trained parameters means an adversary can attempt to reconstruct information about the training data via training data reconstruction attacks, a major vulnerability of modern machine learning methods. This paper makes three…

July 25, 2025
DriftMoE: A Mixture of Experts Approach to Handle Concept Drifts

DriftMoE: A Mixture of Experts Approach to Handle Concept Drifts arXiv:2507.18464v1 Announce Type: new Abstract: Learning from non-stationary data streams subject to concept drift requires models that can adapt on-the-fly while remaining resource-efficient. Existing adaptive ensemble methods often rely on coarse-grained adaptation mechanisms or simple voting schemes that fail to optimally leverage specialized knowledge. This…

July 25, 2025
Fundamental limits of distributed covariance matrix estimation via a conditional strong data processing inequality

Fundamental limits of distributed covariance matrix estimation via a conditional strong data processing inequality arXiv:2507.16953v1 Announce Type: new Abstract: Estimating high-dimensional covariance matrices is a key task across many fields. This paper explores the theoretical limits of distributed covariance estimation in a feature-split setting, where communication between agents is constrained. Specifically, we study a scenario…

July 24, 2025
Bayesian preference elicitation for decision support in multiobjective optimization

Bayesian preference elicitation for decision support in multiobjective optimization arXiv:2507.16999v1 Announce Type: new Abstract: We present a novel approach to help decision-makers efficiently identify preferred solutions from the Pareto set of a multi-objective optimization problem. Our method uses a Bayesian model to estimate the decision-maker’s utility function based on pairwise comparisons. Aided by this model,…

July 24, 2025
The surprising strength of weak classifiers for validating neural posterior estimates

The surprising strength of weak classifiers for validating neural posterior estimates arXiv:2507.17026v1 Announce Type: new Abstract: Neural Posterior Estimation (NPE) has emerged as a powerful approach for amortized Bayesian inference when the true posterior $p(\theta \mid y)$ is intractable or difficult to sample. But evaluating the accuracy of neural posterior estimates remains challenging, with existing…

July 24, 2025
CoLT: The conditional localization test for assessing the accuracy of neural posterior estimates

CoLT: The conditional localization test for assessing the accuracy of neural posterior estimates arXiv:2507.17030v1 Announce Type: new Abstract: We consider the problem of validating whether a neural posterior estimate $q(\theta \mid x)$ is an accurate approximation to the true, unknown posterior $p(\theta \mid x)$. Existing methods for evaluating the quality…

July 24, 2025
Nearly Minimax Discrete Distribution Estimation in Kullback-Leibler Divergence with High Probability

Nearly Minimax Discrete Distribution Estimation in Kullback-Leibler Divergence with High Probability arXiv:2507.17316v1 Announce Type: new Abstract: We consider the problem of estimating a discrete distribution $p$ with support of size $K$ and provide both upper and lower bounds with high probability in KL divergence. We prove that in the worst case, for any estimator $\widehat{p}$,…

July 24, 2025
Structural DID with ML: Theory, Simulation, and a Roadmap for Applied Research

Structural DID with ML: Theory, Simulation, and a Roadmap for Applied Research arXiv:2507.15899v1 Announce Type: new Abstract: Causal inference in observational panel data has become a central concern in economics, policy analysis, and the broader social sciences. To address the core contradiction where traditional difference-in-differences (DID) struggles with high-dimensional confounding variables in observational panel data, while machine learning (ML)…

July 23, 2025
Generative AI Models for Learning Flow Maps of Stochastic Dynamical Systems in Bounded Domains

Generative AI Models for Learning Flow Maps of Stochastic Dynamical Systems in Bounded Domains arXiv:2507.15990v1 Announce Type: new Abstract: Simulating stochastic differential equations (SDEs) in bounded domains presents significant computational challenges due to particle exit phenomena, which require accurate modeling of interior stochastic dynamics and boundary interactions. Despite the success of machine learning-based methods in…

July 23, 2025
Estimating Treatment Effects with Independent Component Analysis

Estimating Treatment Effects with Independent Component Analysis arXiv:2507.16467v1 Announce Type: new Abstract: The field of causal inference has developed a variety of methods to accurately estimate treatment effects in the presence of nuisance. Meanwhile, the field of identifiability theory has developed methods like Independent Component Analysis (ICA) to identify latent sources and mixing weights from…

July 23, 2025
PAC Off-Policy Prediction of Contextual Bandits

PAC Off-Policy Prediction of Contextual Bandits arXiv:2507.16236v1 Announce Type: new Abstract: This paper investigates off-policy evaluation in contextual bandits, aiming to quantify the performance of a target policy using data collected under a different and potentially unknown behavior policy. Recently, methods based on conformal prediction have been developed to construct reliable prediction intervals that guarantee…

July 23, 2025
Structural Effect and Spectral Enhancement of High-Dimensional Regularized Linear Discriminant Analysis

Structural Effect and Spectral Enhancement of High-Dimensional Regularized Linear Discriminant Analysis arXiv:2507.16682v1 Announce Type: new Abstract: Regularized linear discriminant analysis (RLDA) is a widely used tool for classification and dimensionality reduction, but its performance in high-dimensional scenarios is inconsistent. Existing theoretical analyses of RLDA often lack clear insight into how data structure affects classification performance.…

July 23, 2025
Statistical and Algorithmic Foundations of Reinforcement Learning

Statistical and Algorithmic Foundations of Reinforcement Learning arXiv:2507.14444v1 Announce Type: new Abstract: As a paradigm for sequential decision making in unknown environments, reinforcement learning (RL) has received a flurry of attention in recent years. However, the explosion of model complexity in emerging applications and the presence of nonconvexity exacerbate the challenge of achieving efficient RL…

July 22, 2025
Diffusion Models for Time Series Forecasting: A Survey

Diffusion Models for Time Series Forecasting: A Survey arXiv:2507.14507v1 Announce Type: new Abstract: Diffusion models, initially developed for image synthesis, demonstrate remarkable generative capabilities. Recently, their application has expanded to time series forecasting (TSF), yielding promising results. In this survey, we first introduce the standard diffusion models and their prevalent variants, explaining their adaptation to…

July 22, 2025
Deep Learning-Based Survival Analysis with Copula-Based Activation Functions for Multivariate Response Prediction

Deep Learning-Based Survival Analysis with Copula-Based Activation Functions for Multivariate Response Prediction arXiv:2507.14641v1 Announce Type: new Abstract: This research integrates deep learning, copula functions, and survival analysis to effectively handle highly correlated and right-censored multivariate survival data. It introduces copula-based activation functions (Clayton, Gumbel, and their combinations) to model the nonlinear dependencies inherent in such…

July 22, 2025
When few labeled target data suffice: a theory of semi-supervised domain adaptation via fine-tuning from multiple adaptive starts

When few labeled target data suffice: a theory of semi-supervised domain adaptation via fine-tuning from multiple adaptive starts arXiv:2507.14661v1 Announce Type: new Abstract: Semi-supervised domain adaptation (SSDA) aims to achieve high predictive performance in the target domain with limited labeled target data by exploiting abundant source and unlabeled target data. Despite its significance in numerous…

July 22, 2025
Accelerating Hamiltonian Monte Carlo for Bayesian Inference in Neural Networks and Neural Operators

Accelerating Hamiltonian Monte Carlo for Bayesian Inference in Neural Networks and Neural Operators arXiv:2507.14652v1 Announce Type: new Abstract: Hamiltonian Monte Carlo (HMC) is a powerful and accurate method to sample from the posterior distribution in Bayesian inference. However, HMC techniques are computationally demanding for Bayesian neural networks due to the high dimensionality of the network’s…

July 22, 2025
Differential Privacy in Kernelized Contextual Bandits via Random Projections

Differential Privacy in Kernelized Contextual Bandits via Random Projections arXiv:2507.13639v1 Announce Type: new Abstract: We consider the problem of contextual kernel bandits with stochastic contexts, where the underlying reward function belongs to a known Reproducing Kernel Hilbert Space. We study this problem under an additional constraint of Differential Privacy, where the agent needs to ensure…

July 21, 2025
Conformal Data Contamination Tests for Trading or Sharing of Data

Conformal Data Contamination Tests for Trading or Sharing of Data arXiv:2507.13835v1 Announce Type: new Abstract: The amount of quality data in many machine learning tasks is limited to what is available locally to data owners. The set of quality data can be expanded through trading or sharing with external data agents. However, data buyers need…

July 21, 2025
A Survey of Dimension Estimation Methods

A Survey of Dimension Estimation Methods arXiv:2507.13887v1 Announce Type: new Abstract: It is a standard assumption that datasets in high dimension have an internal structure which means that they in fact lie on, or near, subsets of a lower dimension. In many instances it is important to understand the real dimension of the data, hence…

July 21, 2025
Step-DAD: Semi-Amortized Policy-Based Bayesian Experimental Design

Step-DAD: Semi-Amortized Policy-Based Bayesian Experimental Design arXiv:2507.14057v1 Announce Type: new Abstract: We develop a semi-amortized, policy-based approach to Bayesian experimental design (BED) called Stepwise Deep Adaptive Design (Step-DAD). Like existing fully amortized, policy-based BED approaches, Step-DAD trains a design policy upfront before the experiment. However, rather than keeping this policy fixed, Step-DAD periodically updates it…

July 21, 2025
Conformalized Regression for Continuous Bounded Outcomes

Conformalized Regression for Continuous Bounded Outcomes arXiv:2507.14023v1 Announce Type: new Abstract: Regression problems with bounded continuous outcomes frequently arise in real-world statistical and machine learning applications, such as the analysis of rates and proportions. A central challenge in this setting is predicting a response associated with a new covariate value. Most of the existing statistical…

July 21, 2025
Physics constrained learning of stochastic characteristics

Physics constrained learning of stochastic characteristics arXiv:2507.12661v1 Announce Type: new Abstract: Accurate state estimation requires careful consideration of uncertainty surrounding the process and measurement models; these characteristics are usually not well-known and need an experienced designer to select the covariance matrices. An error in the selection of covariance matrices could impact the accuracy of the…

July 18, 2025
Self Balancing Neural Network: A Novel Method to Estimate Average Treatment Effect

Self Balancing Neural Network: A Novel Method to Estimate Average Treatment Effect arXiv:2507.12818v1 Announce Type: new Abstract: In observational studies, confounding variables affect both treatment and outcome. Moreover, instrumental variables also influence the treatment assignment mechanism. This situation sets the study apart from a standard randomized controlled trial, where the treatment assignment is random. Due…

July 18, 2025
Finite-Dimensional Gaussian Approximation for Deep Neural Networks: Universality in Random Weights

Finite-Dimensional Gaussian Approximation for Deep Neural Networks: Universality in Random Weights arXiv:2507.12686v1 Announce Type: new Abstract: We study the Finite-Dimensional Distributions (FDDs) of deep neural networks with randomly initialized weights that have finite-order moments. Specifically, we establish Gaussian approximation bounds in the Wasserstein-$1$ norm between the FDDs and their Gaussian limit assuming a Lipschitz activation…

July 18, 2025
Bayesian Modeling and Estimation of Linear Time-Variant Systems using Neural Networks and Gaussian Processes

Bayesian Modeling and Estimation of Linear Time-Variant Systems using Neural Networks and Gaussian Processes arXiv:2507.12878v1 Announce Type: new Abstract: The identification of Linear Time-Variant (LTV) systems from input-output data is a fundamental yet challenging ill-posed inverse problem. This work introduces a unified Bayesian framework that models the system’s impulse response, $h(t, \tau)$, as a stochastic…

July 18, 2025
When Pattern-by-Pattern Works: Theoretical and Empirical Insights for Logistic Models with Missing Values

When Pattern-by-Pattern Works: Theoretical and Empirical Insights for Logistic Models with Missing Values arXiv:2507.13024v1 Announce Type: new Abstract: Predicting a response with partially missing inputs remains a challenging task even in parametric models, since parameter estimation in itself is not sufficient to predict on partially observed inputs. Several works study prediction in linear models. In…

July 18, 2025
LLMs are Bayesian, in Expectation, not in Realization

LLMs are Bayesian, in Expectation, not in Realization arXiv:2507.11768v1 Announce Type: new Abstract: Large language models demonstrate remarkable in-context learning capabilities, adapting to new tasks without parameter updates. While this phenomenon has been successfully modeled as implicit Bayesian inference, recent empirical findings reveal a fundamental contradiction: transformers systematically violate the martingale property, a cornerstone requirement…

July 17, 2025
Choosing the Better Bandit Algorithm under Data Sharing: When Do A/B Experiments Work?

Choosing the Better Bandit Algorithm under Data Sharing: When Do A/B Experiments Work? arXiv:2507.11891v1 Announce Type: new Abstract: We study A/B experiments that are designed to compare the performance of two recommendation algorithms. Prior work has shown that the standard difference-in-means estimator is biased in estimating the global treatment effect (GTE) due to a particular…

July 17, 2025
Newfluence: Boosting Model interpretability and Understanding in High Dimensions

Newfluence: Boosting Model interpretability and Understanding in High Dimensions arXiv:2507.11895v1 Announce Type: new Abstract: The increasing complexity of machine learning (ML) and artificial intelligence (AI) models has created a pressing need for tools that help scientists, engineers, and policymakers interpret and refine model decisions and predictions. Influence functions, originating from robust statistics, have emerged as…

July 17, 2025
Incorporating Fairness Constraints into Archetypal Analysis

Incorporating Fairness Constraints into Archetypal Analysis arXiv:2507.12021v1 Announce Type: new Abstract: Archetypal Analysis (AA) is an unsupervised learning method that represents data as convex combinations of extreme patterns called archetypes. While AA provides interpretable and low-dimensional representations, it can inadvertently encode sensitive attributes, leading to fairness concerns. In this work, we propose Fair Archetypal Analysis…

July 17, 2025
Distribution-Free Uncertainty-Aware Virtual Sensing via Conformalized Neural Operators

Distribution-Free Uncertainty-Aware Virtual Sensing via Conformalized Neural Operators arXiv:2507.11574v1 Announce Type: cross Abstract: Robust uncertainty quantification (UQ) remains a critical barrier to the safe deployment of deep learning in real-time virtual sensing, particularly in high-stakes domains where sparse, noisy, or non-collocated sensor data are the norm. We introduce the Conformalized Monte Carlo Operator (CMCO), a…

July 17, 2025
TaylorPODA: A Taylor Expansion-Based Method to Improve Post-Hoc Attributions for Opaque Models

TaylorPODA: A Taylor Expansion-Based Method to Improve Post-Hoc Attributions for Opaque Models arXiv:2507.10643v1 Announce Type: new Abstract: Existing post-hoc model-agnostic methods generate external explanations for opaque models, primarily by locally attributing the model output to its input features. However, they often lack an explicit and systematic framework for quantifying the contribution of individual features. Building…

July 16, 2025
Robust Multi-Manifold Clustering via Simplex Paths

Robust Multi-Manifold Clustering via Simplex Paths arXiv:2507.10710v1 Announce Type: new Abstract: This article introduces a novel, geometric approach for multi-manifold clustering (MMC), i.e. for clustering a collection of potentially intersecting, d-dimensional manifolds into the individual manifold components. We first compute a locality graph on d-simplices, using the dihedral angle between adjacent simplices as the…

July 16, 2025
GOLFS: Feature Selection via Combining Both Global and Local Information for High Dimensional Clustering

GOLFS: Feature Selection via Combining Both Global and Local Information for High Dimensional Clustering arXiv:2507.10956v1 Announce Type: new Abstract: It is important to identify the discriminative features for high dimensional clustering. However, due to the lack of cluster labels, the regularization methods developed for supervised feature selection cannot be directly applied. To learn the…

July 16, 2025
How does Labeling Error Impact Contrastive Learning? A Perspective from Data Dimensionality Reduction

How does Labeling Error Impact Contrastive Learning? A Perspective from Data Dimensionality Reduction arXiv:2507.11161v1 Announce Type: new Abstract: In recent years, contrastive learning has achieved state-of-the-art performance in the territory of self-supervised representation learning. Many previous works have attempted to provide the theoretical understanding underlying the success of contrastive learning. Almost all of them rely…

July 16, 2025
Interpretable Bayesian Tensor Network Kernel Machines with Automatic Rank and Feature Selection

Interpretable Bayesian Tensor Network Kernel Machines with Automatic Rank and Feature Selection arXiv:2507.11136v1 Announce Type: new Abstract: Tensor Network (TN) Kernel Machines speed up model learning by representing parameters as low-rank TNs, reducing computation and memory use. However, most TN-based Kernel methods are deterministic and ignore parameter uncertainty. Further, they require manual tuning of model…

July 16, 2025
The Bayesian Approach to Continual Learning: An Overview

The Bayesian Approach to Continual Learning: An Overview arXiv:2507.08922v1 Announce Type: new Abstract: Continual learning is an online paradigm where a learner continually accumulates knowledge from different tasks encountered over sequential time steps. Importantly, the learner is required to extend and update its knowledge without forgetting about the learning experience acquired from the past, and…

July 15, 2025
Physics-informed machine learning: A mathematical framework with applications to time series forecasting

Physics-informed machine learning: A mathematical framework with applications to time series forecasting arXiv:2507.08906v1 Announce Type: new Abstract: Physics-informed machine learning (PIML) is an emerging framework that integrates physical knowledge into machine learning models. This physical prior often takes the form of a partial differential equation (PDE) system that the regression function must satisfy. In the…

July 15, 2025
Optimal High-probability Convergence of Nonlinear SGD under Heavy-tailed Noise via Symmetrization

Optimal High-probability Convergence of Nonlinear SGD under Heavy-tailed Noise via Symmetrization arXiv:2507.09093v1 Announce Type: new Abstract: We study convergence in high-probability of SGD-type methods in non-convex optimization and the presence of heavy-tailed noise. To combat the heavy-tailed noise, a general black-box nonlinear framework is considered, subsuming nonlinearities like sign, clipping, normalization and their smooth counterparts.…

July 15, 2025
Fixed-Confidence Multiple Change Point Identification under Bandit Feedback

Fixed-Confidence Multiple Change Point Identification under Bandit Feedback arXiv:2507.08994v1 Announce Type: new Abstract: Piecewise constant functions describe a variety of real-world phenomena in domains ranging from chemistry to manufacturing. In practice, it is often required to confidently identify the locations of the abrupt changes in these functions as quickly as possible. For this, we introduce…

July 15, 2025
CoVAE: Consistency Training of Variational Autoencoders

CoVAE: Consistency Training of Variational Autoencoders arXiv:2507.09103v1 Announce Type: new Abstract: Current state-of-the-art generative approaches frequently rely on a two-stage training procedure, where an autoencoder (often a VAE) first performs dimensionality reduction, followed by training a generative model on the learned latent space. While effective, this introduces computational overhead and increased sampling times. We challenge…

July 15, 2025
Mallows Model with Learned Distance Metrics: Sampling and Maximum Likelihood Estimation

Mallows Model with Learned Distance Metrics: Sampling and Maximum Likelihood Estimation arXiv:2507.08108v1 Announce Type: new Abstract: \textit{Mallows model} is a widely-used probabilistic framework for learning from ranking data, with applications ranging from recommendation systems and voting to aligning language models with human preferences~\cite{chen2024mallows, kleinberg2021algorithmic, rafailov2024direct}. Under this model, observed rankings are noisy perturbations of a…

July 14, 2025
CLEAR: Calibrated Learning for Epistemic and Aleatoric Risk

CLEAR: Calibrated Learning for Epistemic and Aleatoric Risk arXiv:2507.08150v1 Announce Type: new Abstract: Accurate uncertainty quantification is critical for reliable predictive modeling, especially in regression tasks. Existing methods typically address either aleatoric uncertainty from measurement noise or epistemic uncertainty from limited data, but not necessarily both in a balanced way. We propose CLEAR, a calibration…

July 14, 2025
MIRRAMS: Towards Training Models Robust to Missingness Distribution Shifts

MIRRAMS: Towards Training Models Robust to Missingness Distribution Shifts arXiv:2507.08280v1 Announce Type: new Abstract: In real-world data analysis, missingness distributional shifts between training and test input datasets frequently occur, posing a significant challenge to achieving robust prediction performance. In this study, we propose a novel deep learning framework designed to address such shifts in missingness…

July 14, 2025
Admissibility of Stein Shrinkage for Batch Normalization in the Presence of Adversarial Attacks

Admissibility of Stein Shrinkage for Batch Normalization in the Presence of Adversarial Attacks arXiv:2507.08261v1 Announce Type: new Abstract: Batch normalization (BN) is a ubiquitous operation in deep neural networks used primarily to achieve stability and regularization during network training. BN involves feature map centering and scaling using sample means and variances, respectively. Since these statistics…

July 14, 2025
Optimal and Practical Batched Linear Bandit Algorithm

Optimal and Practical Batched Linear Bandit Algorithm arXiv:2507.08438v1 Announce Type: new Abstract: We study the linear bandit problem under limited adaptivity, known as the batched linear bandit. While existing approaches can achieve near-optimal regret in theory, they are often computationally prohibitive or underperform in practice. We propose \texttt{BLAE}, a novel batched algorithm that integrates arm…

July 14, 2025
Topological Machine Learning with Unreduced Persistence Diagrams

Topological Machine Learning with Unreduced Persistence Diagrams arXiv:2507.07156v1 Announce Type: new Abstract: Supervised machine learning pipelines trained on features derived from persistent homology have been experimentally observed to ignore much of the information contained in a persistence diagram. Computing persistence diagrams is often the most computationally demanding step in such a pipeline, however. To explore…

July 11, 2025
Class conditional conformal prediction for multiple inputs by p-value aggregation

Class conditional conformal prediction for multiple inputs by p-value aggregation arXiv:2507.07150v1 Announce Type: new Abstract: Conformal prediction methods are statistical tools designed to quantify uncertainty and generate predictive sets with guaranteed coverage probabilities. This work introduces an innovative refinement to these methods for classification tasks, specifically tailored for scenarios where multiple observations (multi-inputs) of a…

July 11, 2025
Bayesian Double Descent

Bayesian Double Descent arXiv:2507.07338v1 Announce Type: new Abstract: Double descent is a phenomenon of over-parameterized statistical models. Our goal is to view double descent from a Bayesian perspective. Over-parameterized models such as deep neural networks have an interesting re-descending property in their risk characteristics. This is a recent phenomenon in machine learning and has been…

July 11, 2025
Hess-MC2: Sequential Monte Carlo Squared using Hessian Information and Second Order Proposals

Hess-MC2: Sequential Monte Carlo Squared using Hessian Information and Second Order Proposals arXiv:2507.07461v1 Announce Type: new Abstract: When performing Bayesian inference using Sequential Monte Carlo (SMC) methods, two considerations arise: the accuracy of the posterior approximation and computational efficiency. To address computational demands, Sequential Monte Carlo Squared (SMC$^2$) is well-suited for high-performance computing (HPC) environments.…

July 11, 2025
Galerkin-ARIMA: A Two-Stage Polynomial Regression Framework for Fast Rolling One-Step-Ahead Forecasting

Galerkin-ARIMA: A Two-Stage Polynomial Regression Framework for Fast Rolling One-Step-Ahead Forecasting arXiv:2507.07469v1 Announce Type: new Abstract: Time-series models like ARIMA remain widely used for forecasting but are limited by linear assumptions and high computational cost on large and complex datasets. We propose Galerkin-ARIMA, which generalizes the AR component of ARIMA and replaces it with a flexible…

July 11, 2025
On the Hardness of Unsupervised Domain Adaptation: Optimal Learners and Information-Theoretic Perspective

On the Hardness of Unsupervised Domain Adaptation: Optimal Learners and Information-Theoretic Perspective arXiv:2507.06552v1 Announce Type: new Abstract: This paper studies the hardness of unsupervised domain adaptation (UDA) under covariate shift. We model the uncertainty that the learner faces by a distribution $\pi$ in the ground-truth triples $(p, q, f)$ — which we call a UDA…

July 10, 2025
Semi-parametric Functional Classification via Path Signatures Logistic Regression

Semi-parametric Functional Classification via Path Signatures Logistic Regression arXiv:2507.06637v1 Announce Type: new Abstract: We propose Path Signatures Logistic Regression (PSLR), a semi-parametric framework for classifying vector-valued functional data with scalar covariates. Classical functional logistic regression models rely on linear assumptions and fixed basis expansions, which limit flexibility and degrade performance under irregular sampling. PSLR overcomes…

July 10, 2025
Fast Gaussian Processes under Monotonicity Constraints

Fast Gaussian Processes under Monotonicity Constraints arXiv:2507.06677v1 Announce Type: new Abstract: Gaussian processes (GPs) are widely used as surrogate models for complicated functions in scientific and engineering applications. In many cases, prior knowledge about the function to be approximated, such as monotonicity, is available and can be leveraged to improve model fidelity. Incorporating such constraints…

July 10, 2025
Conformal Prediction for Long-Tailed Classification

Conformal Prediction for Long-Tailed Classification arXiv:2507.06867v1 Announce Type: new Abstract: Many real-world classification problems, such as plant identification, have extremely long-tailed class distributions. In order for prediction sets to be useful in such settings, they should (i) provide good class-conditional coverage, ensuring that rare classes are not systematically omitted from the prediction sets, and (ii)…

July 10, 2025
Adaptive collaboration for online personalized distributed learning with heterogeneous clients

Adaptive collaboration for online personalized distributed learning with heterogeneous clients arXiv:2507.06844v1 Announce Type: new Abstract: We study the problem of online personalized decentralized learning with $N$ statistically heterogeneous clients collaborating to accelerate local training. An important challenge in this setting is to select relevant collaborators to reduce gradient variance while mitigating the introduced bias. To…

July 10, 2025
Temporal Conformal Prediction (TCP): A Distribution-Free Statistical and Machine Learning Framework for Adaptive Risk Forecasting

Temporal Conformal Prediction (TCP): A Distribution-Free Statistical and Machine Learning Framework for Adaptive Risk Forecasting arXiv:2507.05470v1 Announce Type: new Abstract: We propose Temporal Conformal Prediction (TCP), a novel framework for constructing prediction intervals in financial time-series with guaranteed finite-sample validity. TCP integrates quantile regression with a conformal calibration layer that adapts online via a decaying…

July 9, 2025
Enjoying Non-linearity in Multinomial Logistic Bandits

Enjoying Non-linearity in Multinomial Logistic Bandits arXiv:2507.05306v1 Announce Type: new Abstract: We consider the multinomial logistic bandit problem, a variant of generalized linear bandits where a learner interacts with an environment by selecting actions to maximize expected rewards based on probabilistic feedback from multiple possible outcomes. In the binary setting, recent work has focused on…

July 9, 2025
A Malliavin calculus approach to score functions in diffusion generative models

A Malliavin calculus approach to score functions in diffusion generative models arXiv:2507.05550v1 Announce Type: new Abstract: Score-based diffusion generative models have recently emerged as a powerful tool for modelling complex data distributions. These models aim at learning the score function, which defines a map from a known probability distribution to the target data distribution via…

July 9, 2025
Property Elicitation on Imprecise Probabilities

Property Elicitation on Imprecise Probabilities arXiv:2507.05857v1 Announce Type: new Abstract: Property elicitation studies which attributes of a probability distribution can be determined by minimising a risk. We investigate a generalisation of property elicitation to imprecise probabilities (IP). This investigation is motivated by multi-distribution learning, which takes the classical machine learning paradigm of minimising a single…

July 9, 2025
Best-of-N through the Smoothing Lens: KL Divergence and Regret Analysis

Best-of-N through the Smoothing Lens: KL Divergence and Regret Analysis arXiv:2507.05913v1 Announce Type: new Abstract: A simple yet effective method for inference-time alignment of generative models is Best-of-$N$ (BoN), where $N$ outcomes are sampled from a reference policy, evaluated using a proxy reward model, and the highest-scoring one is selected. While prior work argues that…

July 9, 2025