Category: cs.LG
-
Hedging with memory: shallow and deep learning with signatures
Hedging with memory: shallow and deep learning with signatures arXiv:2508.02759v1 Announce Type: new Abstract: We investigate the use of path signatures in a machine learning context for hedging exotic derivatives under non-Markovian stochastic volatility models. In a deep learning setting, we use signatures as features in feedforward neural networks and show that they outperform LSTMs…
-
Supervised Dynamic Dimension Reduction with Deep Neural Network
Supervised Dynamic Dimension Reduction with Deep Neural Network arXiv:2508.03546v1 Announce Type: new Abstract: This paper studies the problem of dimension reduction, tailored to improving time series forecasting with high-dimensional predictors. We propose a novel Supervised Deep Dynamic Principal component analysis (SDDP) framework that incorporates the target variable and lagged observations into the factor extraction process.…
-
Likelihood Matching for Diffusion Models
Likelihood Matching for Diffusion Models arXiv:2508.03636v1 Announce Type: new Abstract: We propose a Likelihood Matching approach for training diffusion models by first establishing an equivalence between the likelihood of the target data distribution and a likelihood along the sample path of the reverse diffusion. To efficiently compute the reverse sample likelihood, a quasi-likelihood is considered…
-
Learning quadratic neural networks in high dimensions: SGD dynamics and scaling laws
Learning quadratic neural networks in high dimensions: SGD dynamics and scaling laws arXiv:2508.03688v1 Announce Type: new Abstract: We study the optimization and sample complexity of gradient-based training of a two-layer neural network with quadratic activation function in the high-dimensional regime, where the data is generated as $y \propto \sum_{j=1}^{r}\lambda_j \sigma\left(\langle \boldsymbol{\theta_j}, \boldsymbol{x}\rangle\right), \boldsymbol{x} \sim N(0,\boldsymbol{I}_d)$,…
-
Uncertainty Quantification for Large-Scale Deep Networks via Post-StoNet Modeling
Uncertainty Quantification for Large-Scale Deep Networks via Post-StoNet Modeling arXiv:2508.01217v1 Announce Type: new Abstract: Deep learning has revolutionized modern data science. However, how to accurately quantify the uncertainty of predictions from large-scale deep neural networks (DNNs) remains an unresolved issue. To address this issue, we introduce a novel post-processing approach. This approach feeds the output…
-
Inequalities for Optimization of Classification Algorithms: A Perspective Motivated by Diagnostic Testing
Inequalities for Optimization of Classification Algorithms: A Perspective Motivated by Diagnostic Testing arXiv:2508.01065v1 Announce Type: new Abstract: Motivated by canonical problems in medical diagnostics, we propose and study properties of an objective function that uniformly bounds uncertainties in quantities of interest extracted from classifiers and related data analysis tools. We begin by adopting a set-theoretic…
-
Flow IV: Counterfactual Inference In Nonseparable Outcome Models Using Instrumental Variables
Flow IV: Counterfactual Inference In Nonseparable Outcome Models Using Instrumental Variables arXiv:2508.01321v1 Announce Type: new Abstract: To reach human level intelligence, learning algorithms need to incorporate causal reasoning. But identifying causality, and particularly counterfactual reasoning, remains an elusive task. In this paper, we make progress on this task by utilizing instrumental variables (IVs). IVs are…
-
Debiasing Machine Learning Predictions for Causal Inference Without Additional Ground Truth Data: “One Map, Many Trials” in Satellite-Driven Poverty Analysis
Debiasing Machine Learning Predictions for Causal Inference Without Additional Ground Truth Data: “One Map, Many Trials” in Satellite-Driven Poverty Analysis arXiv:2508.01341v1 Announce Type: new Abstract: Machine learning models trained on Earth observation data, such as satellite imagery, have demonstrated significant promise in predicting household-level wealth indices, enabling the creation of high-resolution wealth maps that can…
-
Efficient optimization of expensive black-box simulators via marginal means, with application to neutrino detector design
Efficient optimization of expensive black-box simulators via marginal means, with application to neutrino detector design arXiv:2508.01834v1 Announce Type: new Abstract: With advances in scientific computing, computer experiments are increasingly used for optimizing complex systems. However, for modern applications, e.g., the optimization of nuclear physics detectors, each experiment run can require hundreds of CPU hours, making…
-
funOCLUST: Clustering Functional Data with Outliers
funOCLUST: Clustering Functional Data with Outliers arXiv:2508.00110v1 Announce Type: new Abstract: Functional data present unique challenges for clustering due to their infinite-dimensional nature and potential sensitivity to outliers. An extension of the OCLUST algorithm to the functional setting is proposed to address these issues. The approach leverages the OCLUST framework, creating a robust method to…
-
Sinusoidal Approximation Theorem for Kolmogorov-Arnold Networks
Sinusoidal Approximation Theorem for Kolmogorov-Arnold Networks arXiv:2508.00247v1 Announce Type: new Abstract: The Kolmogorov-Arnold representation theorem states that any continuous multivariable function can be exactly represented as a finite superposition of continuous single variable functions. Subsequent simplifications of this representation involve expressing these functions as parameterized sums of a smaller number of unique monotonic functions. These…
-
DO-EM: Density Operator Expectation Maximization
DO-EM: Density Operator Expectation Maximization arXiv:2507.22786v1 Announce Type: cross Abstract: Density operators, quantum generalizations of probability distributions, are gaining prominence in machine learning due to their foundational role in quantum computing. Generative modeling based on density operator models (DOMs) is an emerging field, but existing training algorithms — such as those for the Quantum Boltzmann…
-
Regime-Aware Conditional Neural Processes with Multi-Criteria Decision Support for Operational Electricity Price Forecasting
Regime-Aware Conditional Neural Processes with Multi-Criteria Decision Support for Operational Electricity Price Forecasting arXiv:2508.00040v1 Announce Type: cross Abstract: This work integrates Bayesian regime detection with conditional neural processes for 24-hour electricity price prediction in the German market. Our methodology integrates regime detection using a disentangled sticky hierarchical Dirichlet process hidden Markov model (DS-HDP-HMM) applied to…
-
A Smoothing Newton Method for Rank-one Matrix Recovery
A Smoothing Newton Method for Rank-one Matrix Recovery arXiv:2507.23017v1 Announce Type: new Abstract: We consider the phase retrieval problem, which involves recovering a rank-one positive semidefinite matrix from rank-one measurements. A recently proposed algorithm based on Bures-Wasserstein gradient descent (BWGD) exhibits superlinear convergence, but it is unstable, and existing theory can only prove local linear…
-
Optimal Transport Learning: Balancing Value Optimization and Fairness in Individualized Treatment Rules
Optimal Transport Learning: Balancing Value Optimization and Fairness in Individualized Treatment Rules arXiv:2507.23349v1 Announce Type: new Abstract: Individualized treatment rules (ITRs) have gained significant attention due to their wide-ranging applications in fields such as precision medicine, ridesharing, and advertising recommendations. However, when ITRs are influenced by sensitive attributes such as race, gender, or age, they…
-
DICOM De-Identification via Hybrid AI and Rule-Based Framework for Scalable, Uncertainty-Aware Redaction
DICOM De-Identification via Hybrid AI and Rule-Based Framework for Scalable, Uncertainty-Aware Redaction arXiv:2507.23736v1 Announce Type: new Abstract: Access to medical imaging and associated text data has the potential to drive major advances in healthcare research and patient outcomes. However, the presence of Protected Health Information (PHI) and Personally Identifiable Information (PII) in Digital Imaging and…
-
Scaled Beta Models and Feature Dilution for Dynamic Ticket Pricing
Scaled Beta Models and Feature Dilution for Dynamic Ticket Pricing arXiv:2507.23767v1 Announce Type: new Abstract: A novel approach is presented for identifying distinct signatures of performing acts in the secondary ticket resale market by analyzing dynamic pricing distributions. Using a newly curated, time series dataset from the SeatGeek API, we model ticket pricing distributions as…
-
Formal Bayesian Transfer Learning via the Total Risk Prior
Formal Bayesian Transfer Learning via the Total Risk Prior arXiv:2507.23768v1 Announce Type: new Abstract: In analyses with severe data-limitations, augmenting the target dataset with information from ancillary datasets in the application domain, called source datasets, can lead to significantly improved statistical procedures. However, existing methods for this transfer learning struggle to deal with situations where…
-
Simulating Posterior Bayesian Neural Networks with Dependent Weights
Simulating Posterior Bayesian Neural Networks with Dependent Weights arXiv:2507.22095v1 Announce Type: new Abstract: In this paper we consider posterior Bayesian fully connected and feedforward deep neural networks with dependent weights. Particularly, if the likelihood is Gaussian, we identify the distribution of the wide width limit and provide an algorithm to sample from the network. In…
-
Stacked SVD or SVD stacked? A Random Matrix Theory perspective on data integration
Stacked SVD or SVD stacked? A Random Matrix Theory perspective on data integration arXiv:2507.22170v1 Announce Type: new Abstract: Modern data analysis increasingly requires identifying shared latent structure across multiple high-dimensional datasets. A commonly used model assumes that the data matrices are noisy observations of low-rank matrices with a shared singular subspace. In this case, two…
-
LVM-GP: Uncertainty-Aware PDE Solver via coupling latent variable model and Gaussian process
LVM-GP: Uncertainty-Aware PDE Solver via coupling latent variable model and Gaussian process arXiv:2507.22493v1 Announce Type: new Abstract: We propose a novel probabilistic framework, termed LVM-GP, for uncertainty quantification in solving forward and inverse partial differential equations (PDEs) with noisy data. The core idea is to construct a stochastic mapping from the input to a high-dimensional…
-
Subgrid BoostCNN: Efficient Boosting of Convolutional Networks via Gradient-Guided Feature Selection
Subgrid BoostCNN: Efficient Boosting of Convolutional Networks via Gradient-Guided Feature Selection arXiv:2507.22842v1 Announce Type: new Abstract: Convolutional Neural Networks (CNNs) have achieved remarkable success across a wide range of machine learning tasks by leveraging hierarchical feature learning through deep architectures. However, the large number of layers and millions of parameters often make CNNs computationally expensive…
-
A Unified Analysis of Generalization and Sample Complexity for Semi-Supervised Domain Adaptation
A Unified Analysis of Generalization and Sample Complexity for Semi-Supervised Domain Adaptation arXiv:2507.22632v1 Announce Type: new Abstract: Domain adaptation seeks to leverage the abundant label information in a source domain to improve classification performance in a target domain with limited labels. While the field has seen extensive methodological development, its theoretical foundations remain relatively underexplored.…
-
Graph neural networks for residential location choice: connection to classical logit models
Graph neural networks for residential location choice: connection to classical logit models arXiv:2507.21334v1 Announce Type: new Abstract: Researchers have adopted deep learning for classical discrete choice analysis as it can capture complex feature relationships and achieve higher predictive performance. However, the existing deep learning approaches cannot explicitly capture the relationship among choice alternatives, which has…
-
From Sublinear to Linear: Fast Convergence in Deep Networks via Locally Polyak-Lojasiewicz Regions
From Sublinear to Linear: Fast Convergence in Deep Networks via Locally Polyak-Lojasiewicz Regions arXiv:2507.21429v1 Announce Type: new Abstract: The convergence of gradient descent (GD) on the non-convex loss landscapes of deep neural networks (DNNs) presents a fundamental theoretical challenge. While recent work has established that GD converges to a stationary point at a sublinear rate…
-
From Global to Local: A Scalable Benchmark for Local Posterior Sampling
From Global to Local: A Scalable Benchmark for Local Posterior Sampling arXiv:2507.21449v1 Announce Type: new Abstract: Degeneracy is an inherent feature of the loss landscape of neural networks, but it is not well understood how stochastic gradient MCMC (SGMCMC) algorithms interact with this degeneracy. In particular, current global convergence guarantees for common SGMCMC algorithms rely…
-
Measuring Sample Quality with Copula Discrepancies
Measuring Sample Quality with Copula Discrepancies arXiv:2507.21434v1 Announce Type: new Abstract: The scalable Markov chain Monte Carlo (MCMC) algorithms that underpin modern Bayesian machine learning, such as Stochastic Gradient Langevin Dynamics (SGLD), sacrifice asymptotic exactness for computational speed, creating a critical diagnostic gap: traditional sample quality measures fail catastrophically when applied to biased samplers. While…
-
Stochastic forest transition model dynamics and parameter estimation via deep learning
Stochastic forest transition model dynamics and parameter estimation via deep learning arXiv:2507.21486v1 Announce Type: new Abstract: Forest transitions, characterized by dynamic shifts between forest, agricultural, and abandoned lands, are complex phenomena. This study developed a stochastic differential equation model to capture the intricate dynamics of these transitions. We established the existence of global positive solutions…
-
Bayesian symbolic regression: Automated equation discovery from a physicists’ perspective
Bayesian symbolic regression: Automated equation discovery from a physicists’ perspective arXiv:2507.19540v1 Announce Type: new Abstract: Symbolic regression automates the process of learning closed-form mathematical models from data. Standard approaches to symbolic regression, as well as newer deep learning approaches, rely on heuristic model selection criteria, heuristic regularization, and heuristic exploration of model space. Here, we…
-
Adaptive Bayesian Data-Driven Design of Reliable Solder Joints for Micro-electronic Devices
Adaptive Bayesian Data-Driven Design of Reliable Solder Joints for Micro-electronic Devices arXiv:2507.19663v1 Announce Type: new Abstract: Solder joint reliability related to failures due to thermomechanical loading is a critically important yet physically complex engineering problem. As a result, simulated behavior is oftentimes computationally expensive. In an increasingly data-driven world, the usage of efficient data-driven design…
-
Sparse-mode Dynamic Mode Decomposition for Disambiguating Local and Global Structures
Sparse-mode Dynamic Mode Decomposition for Disambiguating Local and Global Structures arXiv:2507.19787v1 Announce Type: new Abstract: The dynamic mode decomposition (DMD) is a data-driven approach that extracts the dominant features from spatiotemporal data. In this work, we introduce sparse-mode DMD, a new variant of the optimized DMD framework that specifically leverages sparsity-promoting regularization in order to…
-
Bag of Coins: A Statistical Probe into Neural Confidence Structures
Bag of Coins: A Statistical Probe into Neural Confidence Structures arXiv:2507.19774v1 Announce Type: new Abstract: Modern neural networks, despite their high accuracy, often produce poorly calibrated confidence scores, limiting their reliability in high-stakes applications. Existing calibration methods typically post-process model outputs without interrogating the internal consistency of the predictions themselves. In this work, we introduce…
-
Predicting Parkinson’s Disease Progression Using Statistical and Neural Mixed Effects Models: A Comparative Study on Longitudinal Biomarkers
Predicting Parkinson’s Disease Progression Using Statistical and Neural Mixed Effects Models: A Comparative Study on Longitudinal Biomarkers arXiv:2507.20058v1 Announce Type: new Abstract: Predicting Parkinson’s Disease (PD) progression is crucial, and voice biomarkers offer a non-invasive method for tracking symptom severity (UPDRS scores) through telemonitoring. Analyzing this longitudinal data is challenging due to within-subject correlations and…
-
Central limit theorems for the eigenvalues of graph Laplacians on data clouds
Central limit theorems for the eigenvalues of graph Laplacians on data clouds arXiv:2507.18803v1 Announce Type: new Abstract: Given i.i.d. samples $X_n = \{ x_1, \dots, x_n \}$ from a distribution supported on a low dimensional manifold ${M}$ embedded in Euclidean space, we consider the graph Laplacian operator $\Delta_n$ associated to an $\varepsilon$-proximity graph over $X_n$ and…
-
Perfect Clustering in Very Sparse Diverse Multiplex Networks
Perfect Clustering in Very Sparse Diverse Multiplex Networks arXiv:2507.19423v1 Announce Type: new Abstract: The paper studies the DIverse MultiPLEx Signed Generalized Random Dot Product Graph (DIMPLE-SGRDPG) network model (Pensky (2024)), where all layers of the network have the same collection of nodes. In addition, all layers can be partitioned into groups such that the layers…
-
Probably Approximately Correct Causal Discovery
Probably Approximately Correct Causal Discovery arXiv:2507.18903v1 Announce Type: new Abstract: The discovery of causal relationships is a foundational problem in artificial intelligence, statistics, epidemiology, economics, and beyond. While elegant theories exist for accurate causal discovery given infinite data, real-world applications are inherently resource-constrained. Effective methods for inferring causal relationships from observational data must perform well…
-
Sliding Window Informative Canonical Correlation Analysis
Sliding Window Informative Canonical Correlation Analysis arXiv:2507.17921v1 Announce Type: new Abstract: Canonical correlation analysis (CCA) is a technique for finding correlated sets of features between two datasets. In this paper, we propose a novel extension of CCA to the online, streaming data setting: Sliding Window Informative Canonical Correlation Analysis (SWICCA). Our method uses a streaming…
-
A Two-armed Bandit Framework for A/B Testing
A Two-armed Bandit Framework for A/B Testing arXiv:2507.18118v1 Announce Type: new Abstract: A/B testing is widely used in modern technology companies for policy evaluation and product deployment, with the goal of comparing the outcomes under a newly-developed policy against a standard control. Various causal inference and reinforcement learning methods developed in the literature are applicable…
-
On Reconstructing Training Data From Bayesian Posteriors and Trained Models
On Reconstructing Training Data From Bayesian Posteriors and Trained Models arXiv:2507.18372v1 Announce Type: new Abstract: Publicly releasing the specification of a model with its trained parameters means an adversary can attempt to reconstruct information about the training data via training data reconstruction attacks, a major vulnerability of modern machine learning methods. This paper makes three…
-
DriftMoE: A Mixture of Experts Approach to Handle Concept Drifts
DriftMoE: A Mixture of Experts Approach to Handle Concept Drifts arXiv:2507.18464v1 Announce Type: new Abstract: Learning from non-stationary data streams subject to concept drift requires models that can adapt on-the-fly while remaining resource-efficient. Existing adaptive ensemble methods often rely on coarse-grained adaptation mechanisms or simple voting schemes that fail to optimally leverage specialized knowledge. This…
-
Fundamental limits of distributed covariance matrix estimation via a conditional strong data processing inequality
Fundamental limits of distributed covariance matrix estimation via a conditional strong data processing inequality arXiv:2507.16953v1 Announce Type: new Abstract: Estimating high-dimensional covariance matrices is a key task across many fields. This paper explores the theoretical limits of distributed covariance estimation in a feature-split setting, where communication between agents is constrained. Specifically, we study a scenario…
-
Bayesian preference elicitation for decision support in multiobjective optimization
Bayesian preference elicitation for decision support in multiobjective optimization arXiv:2507.16999v1 Announce Type: new Abstract: We present a novel approach to help decision-makers efficiently identify preferred solutions from the Pareto set of a multi-objective optimization problem. Our method uses a Bayesian model to estimate the decision-maker’s utility function based on pairwise comparisons. Aided by this model,…
-
The surprising strength of weak classifiers for validating neural posterior estimates
The surprising strength of weak classifiers for validating neural posterior estimates arXiv:2507.17026v1 Announce Type: new Abstract: Neural Posterior Estimation (NPE) has emerged as a powerful approach for amortized Bayesian inference when the true posterior $p(\theta \mid y)$ is intractable or difficult to sample. But evaluating the accuracy of neural posterior estimates remains challenging, with existing…
-
CoLT: The conditional localization test for assessing the accuracy of neural posterior estimates
CoLT: The conditional localization test for assessing the accuracy of neural posterior estimates arXiv:2507.17030v1 Announce Type: new Abstract: We consider the problem of validating whether a neural posterior estimate $q(\theta \mid x)$ is an accurate approximation to the true, unknown posterior $p(\theta \mid x)$. Existing methods for evaluating the quality…
-
Nearly Minimax Discrete Distribution Estimation in Kullback-Leibler Divergence with High Probability
Nearly Minimax Discrete Distribution Estimation in Kullback-Leibler Divergence with High Probability arXiv:2507.17316v1 Announce Type: new Abstract: We consider the problem of estimating a discrete distribution $p$ with support of size $K$ and provide both upper and lower bounds with high probability in KL divergence. We prove that in the worst case, for any estimator $\widehat{p}$,…
-
Structural DID with ML: Theory, Simulation, and a Roadmap for Applied Research
Structural DID with ML: Theory, Simulation, and a Roadmap for Applied Research arXiv:2507.15899v1 Announce Type: new Abstract: Causal inference in observational panel data has become a central concern in economics, policy analysis, and the broader social sciences. To address the core contradiction where traditional difference-in-differences (DID) struggles with high-dimensional confounding variables in observational panel data, while machine learning (ML)…
-
Generative AI Models for Learning Flow Maps of Stochastic Dynamical Systems in Bounded Domains
Generative AI Models for Learning Flow Maps of Stochastic Dynamical Systems in Bounded Domains arXiv:2507.15990v1 Announce Type: new Abstract: Simulating stochastic differential equations (SDEs) in bounded domains presents significant computational challenges due to particle exit phenomena, which require accurate modeling of interior stochastic dynamics and boundary interactions. Despite the success of machine learning-based methods in…
-
Estimating Treatment Effects with Independent Component Analysis
Estimating Treatment Effects with Independent Component Analysis arXiv:2507.16467v1 Announce Type: new Abstract: The field of causal inference has developed a variety of methods to accurately estimate treatment effects in the presence of nuisance. Meanwhile, the field of identifiability theory has developed methods like Independent Component Analysis (ICA) to identify latent sources and mixing weights from…
-
PAC Off-Policy Prediction of Contextual Bandits
PAC Off-Policy Prediction of Contextual Bandits arXiv:2507.16236v1 Announce Type: new Abstract: This paper investigates off-policy evaluation in contextual bandits, aiming to quantify the performance of a target policy using data collected under a different and potentially unknown behavior policy. Recently, methods based on conformal prediction have been developed to construct reliable prediction intervals that guarantee…
-
Structural Effect and Spectral Enhancement of High-Dimensional Regularized Linear Discriminant Analysis
Structural Effect and Spectral Enhancement of High-Dimensional Regularized Linear Discriminant Analysis arXiv:2507.16682v1 Announce Type: new Abstract: Regularized linear discriminant analysis (RLDA) is a widely used tool for classification and dimensionality reduction, but its performance in high-dimensional scenarios is inconsistent. Existing theoretical analyses of RLDA often lack clear insight into how data structure affects classification performance.…
-
Statistical and Algorithmic Foundations of Reinforcement Learning
Statistical and Algorithmic Foundations of Reinforcement Learning arXiv:2507.14444v1 Announce Type: new Abstract: As a paradigm for sequential decision making in unknown environments, reinforcement learning (RL) has received a flurry of attention in recent years. However, the explosion of model complexity in emerging applications and the presence of nonconvexity exacerbate the challenge of achieving efficient RL…
-
Diffusion Models for Time Series Forecasting: A Survey
Diffusion Models for Time Series Forecasting: A Survey arXiv:2507.14507v1 Announce Type: new Abstract: Diffusion models, initially developed for image synthesis, demonstrate remarkable generative capabilities. Recently, their application has expanded to time series forecasting (TSF), yielding promising results. In this survey, we firstly introduce the standard diffusion models and their prevalent variants, explaining their adaptation to…
-
Deep Learning-Based Survival Analysis with Copula-Based Activation Functions for Multivariate Response Prediction
Deep Learning-Based Survival Analysis with Copula-Based Activation Functions for Multivariate Response Prediction arXiv:2507.14641v1 Announce Type: new Abstract: This research integrates deep learning, copula functions, and survival analysis to effectively handle highly correlated and right-censored multivariate survival data. It introduces copula-based activation functions (Clayton, Gumbel, and their combinations) to model the nonlinear dependencies inherent in such…
-
When few labeled target data suffice: a theory of semi-supervised domain adaptation via fine-tuning from multiple adaptive starts
When few labeled target data suffice: a theory of semi-supervised domain adaptation via fine-tuning from multiple adaptive starts arXiv:2507.14661v1 Announce Type: new Abstract: Semi-supervised domain adaptation (SSDA) aims to achieve high predictive performance in the target domain with limited labeled target data by exploiting abundant source and unlabeled target data. Despite its significance in numerous…
-
Accelerating Hamiltonian Monte Carlo for Bayesian Inference in Neural Networks and Neural Operators
Accelerating Hamiltonian Monte Carlo for Bayesian Inference in Neural Networks and Neural Operators arXiv:2507.14652v1 Announce Type: new Abstract: Hamiltonian Monte Carlo (HMC) is a powerful and accurate method to sample from the posterior distribution in Bayesian inference. However, HMC techniques are computationally demanding for Bayesian neural networks due to the high dimensionality of the network’s…
-
Differential Privacy in Kernelized Contextual Bandits via Random Projections
Differential Privacy in Kernelized Contextual Bandits via Random Projections arXiv:2507.13639v1 Announce Type: new Abstract: We consider the problem of contextual kernel bandits with stochastic contexts, where the underlying reward function belongs to a known Reproducing Kernel Hilbert Space. We study this problem under an additional constraint of Differential Privacy, where the agent needs to ensure…
-
Conformal Data Contamination Tests for Trading or Sharing of Data
Conformal Data Contamination Tests for Trading or Sharing of Data arXiv:2507.13835v1 Announce Type: new Abstract: The amount of quality data in many machine learning tasks is limited to what is available locally to data owners. The set of quality data can be expanded through trading or sharing with external data agents. However, data buyers need…
-
A Survey of Dimension Estimation Methods
A Survey of Dimension Estimation Methods arXiv:2507.13887v1 Announce Type: new Abstract: It is a standard assumption that datasets in high dimension have an internal structure which means that they in fact lie on, or near, subsets of a lower dimension. In many instances it is important to understand the real dimension of the data, hence…
-
Step-DAD: Semi-Amortized Policy-Based Bayesian Experimental Design
Step-DAD: Semi-Amortized Policy-Based Bayesian Experimental Design arXiv:2507.14057v1 Announce Type: new Abstract: We develop a semi-amortized, policy-based, approach to Bayesian experimental design (BED) called Stepwise Deep Adaptive Design (Step-DAD). Like existing, fully amortized, policy-based BED approaches, Step-DAD trains a design policy upfront before the experiment. However, rather than keeping this policy fixed, Step-DAD periodically updates it…
-
Conformalized Regression for Continuous Bounded Outcomes
Conformalized Regression for Continuous Bounded Outcomes arXiv:2507.14023v1 Announce Type: new Abstract: Regression problems with bounded continuous outcomes frequently arise in real-world statistical and machine learning applications, such as the analysis of rates and proportions. A central challenge in this setting is predicting a response associated with a new covariate value. Most of the existing statistical…
-
Physics constrained learning of stochastic characteristics
Physics constrained learning of stochastic characteristics arXiv:2507.12661v1 Announce Type: new Abstract: Accurate state estimation requires careful consideration of uncertainty surrounding the process and measurement models; these characteristics are usually not well-known and need an experienced designer to select the covariance matrices. An error in the selection of covariance matrices could impact the accuracy of the…
-
Self Balancing Neural Network: A Novel Method to Estimate Average Treatment Effect
Self Balancing Neural Network: A Novel Method to Estimate Average Treatment Effect arXiv:2507.12818v1 Announce Type: new Abstract: In observational studies, confounding variables affect both treatment and outcome. Moreover, instrumental variables also influence the treatment assignment mechanism. This situation sets the study apart from a standard randomized controlled trial, where the treatment assignment is random. Due…
-
Finite-Dimensional Gaussian Approximation for Deep Neural Networks: Universality in Random Weights
Finite-Dimensional Gaussian Approximation for Deep Neural Networks: Universality in Random Weights arXiv:2507.12686v1 Announce Type: new Abstract: We study the Finite-Dimensional Distributions (FDDs) of deep neural networks with randomly initialized weights that have finite-order moments. Specifically, we establish Gaussian approximation bounds in the Wasserstein-$1$ norm between the FDDs and their Gaussian limit assuming a Lipschitz activation…
-
Bayesian Modeling and Estimation of Linear Time-Variant Systems using Neural Networks and Gaussian Processes
Bayesian Modeling and Estimation of Linear Time-Variant Systems using Neural Networks and Gaussian Processes arXiv:2507.12878v1 Announce Type: new Abstract: The identification of Linear Time-Variant (LTV) systems from input-output data is a fundamental yet challenging ill-posed inverse problem. This work introduces a unified Bayesian framework that models the system’s impulse response, $h(t, \tau)$, as a stochastic…
-
When Pattern-by-Pattern Works: Theoretical and Empirical Insights for Logistic Models with Missing Values
When Pattern-by-Pattern Works: Theoretical and Empirical Insights for Logistic Models with Missing Values arXiv:2507.13024v1 Announce Type: new Abstract: Predicting a response with partially missing inputs remains a challenging task even in parametric models, since parameter estimation in itself is not sufficient to predict on partially observed inputs. Several works study prediction in linear models. In…
-
LLMs are Bayesian, in Expectation, not in Realization
LLMs are Bayesian, in Expectation, not in Realization arXiv:2507.11768v1 Announce Type: new Abstract: Large language models demonstrate remarkable in-context learning capabilities, adapting to new tasks without parameter updates. While this phenomenon has been successfully modeled as implicit Bayesian inference, recent empirical findings reveal a fundamental contradiction: transformers systematically violate the martingale property, a cornerstone requirement…
-
Choosing the Better Bandit Algorithm under Data Sharing: When Do A/B Experiments Work?
Choosing the Better Bandit Algorithm under Data Sharing: When Do A/B Experiments Work? arXiv:2507.11891v1 Announce Type: new Abstract: We study A/B experiments that are designed to compare the performance of two recommendation algorithms. Prior work has shown that the standard difference-in-means estimator is biased in estimating the global treatment effect (GTE) due to a particular…
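For readers unfamiliar with the estimator named in this abstract, the following is a minimal sketch of the standard difference-in-means computation only. The function name and the toy reward lists are hypothetical, and the sketch does not model the data-sharing interference that, according to the abstract, biases this estimator for the global treatment effect.

```python
from statistics import fmean
from typing import Sequence


def difference_in_means(treatment: Sequence[float], control: Sequence[float]) -> float:
    """Average outcome in the treatment arm minus average outcome in the control arm."""
    if not treatment or not control:
        raise ValueError("both arms need at least one observation")
    return fmean(treatment) - fmean(control)


# Hypothetical per-user rewards observed under two recommendation algorithms.
rewards_a = [1.0, 0.0, 1.0, 1.0]
rewards_b = [0.0, 0.0, 1.0, 0.0]
estimate = difference_in_means(rewards_a, rewards_b)
```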
-
Newfluence: Boosting Model interpretability and Understanding in High Dimensions
Newfluence: Boosting Model interpretability and Understanding in High Dimensions arXiv:2507.11895v1 Announce Type: new Abstract: The increasing complexity of machine learning (ML) and artificial intelligence (AI) models has created a pressing need for tools that help scientists, engineers, and policymakers interpret and refine model decisions and predictions. Influence functions, originating from robust statistics, have emerged as…
-
Incorporating Fairness Constraints into Archetypal Analysis
Incorporating Fairness Constraints into Archetypal Analysis arXiv:2507.12021v1 Announce Type: new Abstract: Archetypal Analysis (AA) is an unsupervised learning method that represents data as convex combinations of extreme patterns called archetypes. While AA provides interpretable and low-dimensional representations, it can inadvertently encode sensitive attributes, leading to fairness concerns. In this work, we propose Fair Archetypal Analysis…
-
Distribution-Free Uncertainty-Aware Virtual Sensing via Conformalized Neural Operators
Distribution-Free Uncertainty-Aware Virtual Sensing via Conformalized Neural Operators arXiv:2507.11574v1 Announce Type: cross Abstract: Robust uncertainty quantification (UQ) remains a critical barrier to the safe deployment of deep learning in real-time virtual sensing, particularly in high-stakes domains where sparse, noisy, or non-collocated sensor data are the norm. We introduce the Conformalized Monte Carlo Operator (CMCO), a…
-
TaylorPODA: A Taylor Expansion-Based Method to Improve Post-Hoc Attributions for Opaque Models
TaylorPODA: A Taylor Expansion-Based Method to Improve Post-Hoc Attributions for Opaque Models arXiv:2507.10643v1 Announce Type: new Abstract: Existing post-hoc model-agnostic methods generate external explanations for opaque models, primarily by locally attributing the model output to its input features. However, they often lack an explicit and systematic framework for quantifying the contribution of individual features. Building…
-
Robust Multi-Manifold Clustering via Simplex Paths
Robust Multi-Manifold Clustering via Simplex Paths arXiv:2507.10710v1 Announce Type: new Abstract: This article introduces a novel, geometric approach for multi-manifold clustering (MMC), i.e. for clustering a collection of potentially intersecting, d-dimensional manifolds into the individual manifold components. We first compute a locality graph on d-simplices, using the dihedral angle between adjacent simplices as the…
-
GOLFS: Feature Selection via Combining Both Global and Local Information for High Dimensional Clustering
GOLFS: Feature Selection via Combining Both Global and Local Information for High Dimensional Clustering arXiv:2507.10956v1 Announce Type: new Abstract: It is important to identify the discriminative features for high dimensional clustering. However, due to the lack of cluster labels, the regularization methods developed for supervised feature selection cannot be directly applied. To learn the…
-
How does Labeling Error Impact Contrastive Learning? A Perspective from Data Dimensionality Reduction
How does Labeling Error Impact Contrastive Learning? A Perspective from Data Dimensionality Reduction arXiv:2507.11161v1 Announce Type: new Abstract: In recent years, contrastive learning has achieved state-of-the-art performance in self-supervised representation learning. Many previous works have attempted to provide the theoretical understanding underlying the success of contrastive learning. Almost all of them rely…
-
Interpretable Bayesian Tensor Network Kernel Machines with Automatic Rank and Feature Selection
Interpretable Bayesian Tensor Network Kernel Machines with Automatic Rank and Feature Selection arXiv:2507.11136v1 Announce Type: new Abstract: Tensor Network (TN) Kernel Machines speed up model learning by representing parameters as low-rank TNs, reducing computation and memory use. However, most TN-based Kernel methods are deterministic and ignore parameter uncertainty. Further, they require manual tuning of model…
-
The Bayesian Approach to Continual Learning: An Overview
The Bayesian Approach to Continual Learning: An Overview arXiv:2507.08922v1 Announce Type: new Abstract: Continual learning is an online paradigm where a learner continually accumulates knowledge from different tasks encountered over sequential time steps. Importantly, the learner is required to extend and update its knowledge without forgetting about the learning experience acquired from the past, and…
-
Physics-informed machine learning: A mathematical framework with applications to time series forecasting
Physics-informed machine learning: A mathematical framework with applications to time series forecasting arXiv:2507.08906v1 Announce Type: new Abstract: Physics-informed machine learning (PIML) is an emerging framework that integrates physical knowledge into machine learning models. This physical prior often takes the form of a partial differential equation (PDE) system that the regression function must satisfy. In the…
-
Optimal High-probability Convergence of Nonlinear SGD under Heavy-tailed Noise via Symmetrization
Optimal High-probability Convergence of Nonlinear SGD under Heavy-tailed Noise via Symmetrization arXiv:2507.09093v1 Announce Type: new Abstract: We study convergence in high-probability of SGD-type methods in non-convex optimization and the presence of heavy-tailed noise. To combat the heavy-tailed noise, a general black-box nonlinear framework is considered, subsuming nonlinearities like sign, clipping, normalization and their smooth counterparts.…
-
Fixed-Confidence Multiple Change Point Identification under Bandit Feedback
Fixed-Confidence Multiple Change Point Identification under Bandit Feedback arXiv:2507.08994v1 Announce Type: new Abstract: Piecewise constant functions describe a variety of real-world phenomena in domains ranging from chemistry to manufacturing. In practice, it is often required to confidently identify the locations of the abrupt changes in these functions as quickly as possible. For this, we introduce…
-
CoVAE: Consistency Training of Variational Autoencoders
CoVAE: Consistency Training of Variational Autoencoders arXiv:2507.09103v1 Announce Type: new Abstract: Current state-of-the-art generative approaches frequently rely on a two-stage training procedure, where an autoencoder (often a VAE) first performs dimensionality reduction, followed by training a generative model on the learned latent space. While effective, this introduces computational overhead and increased sampling times. We challenge…
-
Mallows Model with Learned Distance Metrics: Sampling and Maximum Likelihood Estimation
Mallows Model with Learned Distance Metrics: Sampling and Maximum Likelihood Estimation arXiv:2507.08108v1 Announce Type: new Abstract: \textit{Mallows model} is a widely-used probabilistic framework for learning from ranking data, with applications ranging from recommendation systems and voting to aligning language models with human preferences~\cite{chen2024mallows, kleinberg2021algorithmic, rafailov2024direct}. Under this model, observed rankings are noisy perturbations of a…
-
CLEAR: Calibrated Learning for Epistemic and Aleatoric Risk
CLEAR: Calibrated Learning for Epistemic and Aleatoric Risk arXiv:2507.08150v1 Announce Type: new Abstract: Accurate uncertainty quantification is critical for reliable predictive modeling, especially in regression tasks. Existing methods typically address either aleatoric uncertainty from measurement noise or epistemic uncertainty from limited data, but not necessarily both in a balanced way. We propose CLEAR, a calibration…
-
MIRRAMS: Towards Training Models Robust to Missingness Distribution Shifts
MIRRAMS: Towards Training Models Robust to Missingness Distribution Shifts arXiv:2507.08280v1 Announce Type: new Abstract: In real-world data analysis, missingness distributional shifts between training and test input datasets frequently occur, posing a significant challenge to achieving robust prediction performance. In this study, we propose a novel deep learning framework designed to address such shifts in missingness…
-
Admissibility of Stein Shrinkage for Batch Normalization in the Presence of Adversarial Attacks
Admissibility of Stein Shrinkage for Batch Normalization in the Presence of Adversarial Attacks arXiv:2507.08261v1 Announce Type: new Abstract: Batch normalization (BN) is a ubiquitous operation in deep neural networks used primarily to achieve stability and regularization during network training. BN involves feature map centering and scaling using sample means and variances, respectively. Since these statistics…
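The centering and scaling step described in this abstract can be sketched as follows. This is a minimal illustration of plain batch normalization using sample means and variances, assuming a batch stored as a list of feature rows; the function name and the `eps` stabilizer are illustrative choices. It omits the learnable scale and shift used in practice and does not implement the Stein shrinkage estimator the paper studies.

```python
from statistics import fmean, pvariance
from typing import List, Sequence


def batch_norm(batch: Sequence[Sequence[float]], eps: float = 1e-5) -> List[List[float]]:
    """Center and scale each feature (column) using statistics of the batch."""
    columns = list(zip(*batch))            # one tuple per feature
    means = [fmean(col) for col in columns]
    # Population variance of each feature over the batch, reusing the mean.
    variances = [pvariance(col, mu) for col, mu in zip(columns, means)]
    return [
        [(x - mu) / (var + eps) ** 0.5 for x, mu, var in zip(row, means, variances)]
        for row in batch
    ]


# Two samples with two features each; every output column has mean close to zero.
normalized = batch_norm([[1.0, 10.0], [3.0, 30.0]])
```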
-
Optimal and Practical Batched Linear Bandit Algorithm
Optimal and Practical Batched Linear Bandit Algorithm arXiv:2507.08438v1 Announce Type: new Abstract: We study the linear bandit problem under limited adaptivity, known as the batched linear bandit. While existing approaches can achieve near-optimal regret in theory, they are often computationally prohibitive or underperform in practice. We propose \texttt{BLAE}, a novel batched algorithm that integrates arm…
-
Topological Machine Learning with Unreduced Persistence Diagrams
Topological Machine Learning with Unreduced Persistence Diagrams arXiv:2507.07156v1 Announce Type: new Abstract: Supervised machine learning pipelines trained on features derived from persistent homology have been experimentally observed to ignore much of the information contained in a persistence diagram. Computing persistence diagrams is often the most computationally demanding step in such a pipeline, however. To explore…
-
Class conditional conformal prediction for multiple inputs by p-value aggregation
Class conditional conformal prediction for multiple inputs by p-value aggregation arXiv:2507.07150v1 Announce Type: new Abstract: Conformal prediction methods are statistical tools designed to quantify uncertainty and generate predictive sets with guaranteed coverage probabilities. This work introduces an innovative refinement to these methods for classification tasks, specifically tailored for scenarios where multiple observations (multi-inputs) of a…
-
Bayesian Double Descent
Bayesian Double Descent arXiv:2507.07338v1 Announce Type: new Abstract: Double descent is a phenomenon of over-parameterized statistical models. Our goal is to view double descent from a Bayesian perspective. Over-parameterized models such as deep neural networks have an interesting re-descending property in their risk characteristics. This is a recent phenomenon in machine learning and has been…
-
Hess-MC2: Sequential Monte Carlo Squared using Hessian Information and Second Order Proposals
Hess-MC2: Sequential Monte Carlo Squared using Hessian Information and Second Order Proposals arXiv:2507.07461v1 Announce Type: new Abstract: When performing Bayesian inference using Sequential Monte Carlo (SMC) methods, two considerations arise: the accuracy of the posterior approximation and computational efficiency. To address computational demands, Sequential Monte Carlo Squared (SMC$^2$) is well-suited for high-performance computing (HPC) environments.…
-
Galerkin-ARIMA: A Two-Stage Polynomial Regression Framework for Fast Rolling One-Step-Ahead Forecasting
Galerkin-ARIMA: A Two-Stage Polynomial Regression Framework for Fast Rolling One-Step-Ahead Forecasting arXiv:2507.07469v1 Announce Type: new Abstract: Time-series models like ARIMA remain widely used for forecasting but are limited by linear assumptions and high computational cost on large and complex datasets. We propose Galerkin-ARIMA, which generalizes the AR component of ARIMA and replaces it with a flexible…
-
On the Hardness of Unsupervised Domain Adaptation: Optimal Learners and Information-Theoretic Perspective
On the Hardness of Unsupervised Domain Adaptation: Optimal Learners and Information-Theoretic Perspective arXiv:2507.06552v1 Announce Type: new Abstract: This paper studies the hardness of unsupervised domain adaptation (UDA) under covariate shift. We model the uncertainty that the learner faces by a distribution $\pi$ in the ground-truth triples $(p, q, f)$, which we call a UDA…
-
Semi-parametric Functional Classification via Path Signatures Logistic Regression
Semi-parametric Functional Classification via Path Signatures Logistic Regression arXiv:2507.06637v1 Announce Type: new Abstract: We propose Path Signatures Logistic Regression (PSLR), a semi-parametric framework for classifying vector-valued functional data with scalar covariates. Classical functional logistic regression models rely on linear assumptions and fixed basis expansions, which limit flexibility and degrade performance under irregular sampling. PSLR overcomes…
-
Fast Gaussian Processes under Monotonicity Constraints
Fast Gaussian Processes under Monotonicity Constraints arXiv:2507.06677v1 Announce Type: new Abstract: Gaussian processes (GPs) are widely used as surrogate models for complicated functions in scientific and engineering applications. In many cases, prior knowledge about the function to be approximated, such as monotonicity, is available and can be leveraged to improve model fidelity. Incorporating such constraints…
-
Conformal Prediction for Long-Tailed Classification
Conformal Prediction for Long-Tailed Classification arXiv:2507.06867v1 Announce Type: new Abstract: Many real-world classification problems, such as plant identification, have extremely long-tailed class distributions. In order for prediction sets to be useful in such settings, they should (i) provide good class-conditional coverage, ensuring that rare classes are not systematically omitted from the prediction sets, and (ii)…
-
Adaptive collaboration for online personalized distributed learning with heterogeneous clients
Adaptive collaboration for online personalized distributed learning with heterogeneous clients arXiv:2507.06844v1 Announce Type: new Abstract: We study the problem of online personalized decentralized learning with $N$ statistically heterogeneous clients collaborating to accelerate local training. An important challenge in this setting is to select relevant collaborators to reduce gradient variance while mitigating the introduced bias. To…
-
Temporal Conformal Prediction (TCP): A Distribution-Free Statistical and Machine Learning Framework for Adaptive Risk Forecasting
Temporal Conformal Prediction (TCP): A Distribution-Free Statistical and Machine Learning Framework for Adaptive Risk Forecasting arXiv:2507.05470v1 Announce Type: new Abstract: We propose Temporal Conformal Prediction (TCP), a novel framework for constructing prediction intervals in financial time-series with guaranteed finite-sample validity. TCP integrates quantile regression with a conformal calibration layer that adapts online via a decaying…
-
Enjoying Non-linearity in Multinomial Logistic Bandits
Enjoying Non-linearity in Multinomial Logistic Bandits arXiv:2507.05306v1 Announce Type: new Abstract: We consider the multinomial logistic bandit problem, a variant of generalized linear bandits where a learner interacts with an environment by selecting actions to maximize expected rewards based on probabilistic feedback from multiple possible outcomes. In the binary setting, recent work has focused on…
-
A Malliavin calculus approach to score functions in diffusion generative models
A Malliavin calculus approach to score functions in diffusion generative models arXiv:2507.05550v1 Announce Type: new Abstract: Score-based diffusion generative models have recently emerged as a powerful tool for modelling complex data distributions. These models aim at learning the score function, which defines a map from a known probability distribution to the target data distribution via…
-
Property Elicitation on Imprecise Probabilities
Property Elicitation on Imprecise Probabilities arXiv:2507.05857v1 Announce Type: new Abstract: Property elicitation studies which attributes of a probability distribution can be determined by minimising a risk. We investigate a generalisation of property elicitation to imprecise probabilities (IP). This investigation is motivated by multi-distribution learning, which takes the classical machine learning paradigm of minimising a single…
-
Best-of-N through the Smoothing Lens: KL Divergence and Regret Analysis
Best-of-N through the Smoothing Lens: KL Divergence and Regret Analysis arXiv:2507.05913v1 Announce Type: new Abstract: A simple yet effective method for inference-time alignment of generative models is Best-of-$N$ (BoN), where $N$ outcomes are sampled from a reference policy, evaluated using a proxy reward model, and the highest-scoring one is selected. While prior work argues that…
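The Best-of-$N$ procedure described in this abstract can be sketched in a few lines. This is a minimal illustration under stated assumptions: `sample` stands in for drawing one outcome from the reference policy and `proxy_reward` for the proxy reward model. Both are hypothetical callables, not an interface from the paper.

```python
from typing import Callable, TypeVar

T = TypeVar("T")


def best_of_n(sample: Callable[[], T], proxy_reward: Callable[[T], float], n: int) -> T:
    """Draw n outcomes from the reference policy and return the highest-scoring one."""
    if n < 1:
        raise ValueError("n must be at least 1")
    candidates = [sample() for _ in range(n)]
    # max() returns the first maximal candidate when several share the top score.
    return max(candidates, key=proxy_reward)


# Toy usage: a fixed stream of draws stands in for the reference policy,
# and the proxy reward prefers outcomes close to 7.
draws = iter([3, 9, 7, 1])
choice = best_of_n(lambda: next(draws), lambda x: -abs(x - 7), n=4)
```

With $N = 1$ the procedure reduces to sampling once from the reference policy, so larger $N$ trades extra sampling cost for a higher proxy score.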