Category: stat.CO
-
Counterdiabatic Hamiltonian Monte Carlo
Counterdiabatic Hamiltonian Monte Carlo arXiv:2602.21272v1 Announce Type: new Abstract: Hamiltonian Monte Carlo (HMC) is a state of the art method for sampling from distributions with differentiable densities, but can converge slowly when applied to challenging multimodal problems. Running HMC with a time varying Hamiltonian, in order to interpolate from an initial tractable distribution to the…
-
Stochastic Gradient Variational Inference with Price’s Gradient Estimator from Bures-Wasserstein to Parameter Space
Stochastic Gradient Variational Inference with Price’s Gradient Estimator from Bures-Wasserstein to Parameter Space arXiv:2602.18718v1 Announce Type: new Abstract: For approximating a target distribution given only its unnormalized log-density, stochastic gradient-based variational inference (VI) algorithms are a popular approach. For example, Wasserstein VI (WVI) and black-box VI (BBVI) perform gradient descent in measure space (Bures-Wasserstein space)…
-
Amortised and provably-robust simulation-based inference
Amortised and provably-robust simulation-based inference arXiv:2602.11325v1 Announce Type: new Abstract: Complex simulator-based models are now routinely used to perform inference across the sciences and engineering, but existing inference methods are often unable to account for outliers and other extreme values in data which occur due to faulty measurement instruments or human error. In this paper,…
-
Learning Multi-type heterogeneous interacting particle systems
Learning Multi-type heterogeneous interacting particle systems arXiv:2602.03954v1 Announce Type: new Abstract: We propose a framework for the joint inference of network topology, multi-type interaction kernels, and latent type assignments in heterogeneous interacting particle systems from multi-trajectory data. This learning task is a challenging non-convex mixed-integer optimization problem, which we address through a novel three-stage approach.…
-
Latent-IMH: Efficient Bayesian Inference for Inverse Problems with Approximate Operators
Latent-IMH: Efficient Bayesian Inference for Inverse Problems with Approximate Operators arXiv:2601.20888v1 Announce Type: new Abstract: We study sampling from posterior distributions in Bayesian linear inverse problems where $A$, the parameters to observables operator, is computationally expensive. In many applications, $A$ can be factored in a manner that facilitates the construction of a cost-effective approximation $\tilde{A}$.…
-
Semi-Supervised Mixture Models under the Concept of Missing at Random with Margin Confidence and Aranda Ordaz Function
Semi-Supervised Mixture Models under the Concept of Missing at Random with Margin Confidence and Aranda Ordaz Function arXiv:2601.14631v1 Announce Type: new Abstract: This paper presents a semi-supervised learning framework for Gaussian mixture modelling under a Missing at Random (MAR) mechanism. The method explicitly parameterizes the missingness mechanism by modelling the probability of missingness as a…
-
Accelerated Regularized Wasserstein Proximal Sampling Algorithms
Accelerated Regularized Wasserstein Proximal Sampling Algorithms arXiv:2601.09848v1 Announce Type: new Abstract: We consider sampling from a Gibbs distribution by evolving a finite number of particles using a particular score estimator rather than Brownian motion. To accelerate the particles, we consider a second-order score-based ODE, similar to Nesterov acceleration. In contrast to traditional kernel density score…
-
Tail-Sensitive KL and Rényi Convergence of Unadjusted Hamiltonian Monte Carlo via One-Shot Couplings
Tail-Sensitive KL and Rényi Convergence of Unadjusted Hamiltonian Monte Carlo via One-Shot Couplings arXiv:2601.09019v1 Announce Type: new Abstract: Hamiltonian Monte Carlo (HMC) algorithms are among the most widely used sampling methods in high dimensional settings, yet their convergence properties are poorly understood in divergences that quantify relative density mismatch, such as Kullback-Leibler (KL) and Rényi…
-
A Statistical Assessment of Amortized Inference Under Signal-to-Noise Variation and Distribution Shift
A Statistical Assessment of Amortized Inference Under Signal-to-Noise Variation and Distribution Shift arXiv:2601.07944v1 Announce Type: new Abstract: Since the turn of the century, approximate Bayesian inference has steadily evolved as new computational techniques have been incorporated to handle increasingly complex and large-scale predictive problems. The recent success of deep neural networks and foundation models has…
-
A Bayesian Generative Modeling Approach for Arbitrary Conditional Inference
A Bayesian Generative Modeling Approach for Arbitrary Conditional Inference arXiv:2601.05355v1 Announce Type: new Abstract: Modern data analysis increasingly requires flexible conditional inference P(X_B | X_A) where (X_A, X_B) is an arbitrary partition of observed variable X. Existing conditional inference methods lack this flexibility as they are tied to a fixed conditioning structure and cannot perform…
-
Generative Bayesian Hyperparameter Tuning
Generative Bayesian Hyperparameter Tuning arXiv:2512.20051v1 Announce Type: new Abstract: Hyper-parameter selection is a central practical problem in modern machine learning, governing regularization strength, model capacity, and robustness choices. Cross-validation is often computationally prohibitive at scale, while fully Bayesian hyper-parameter learning can be difficult due to the cost of posterior sampling. We develop a generative…
-
Sampling from multimodal distributions with warm starts: Non-asymptotic bounds for the Reweighted Annealed Leap-Point Sampler
Sampling from multimodal distributions with warm starts: Non-asymptotic bounds for the Reweighted Annealed Leap-Point Sampler arXiv:2512.17977v1 Announce Type: new Abstract: Sampling from multimodal distributions is a central challenge in Bayesian inference and machine learning. In light of hardness results for sampling — classical MCMC methods, even with tempering, can suffer from exponential mixing times —…
-
Fast and Robust: Computationally Efficient Covariance Estimation for Sub-Weibull Vectors
Fast and Robust: Computationally Efficient Covariance Estimation for Sub-Weibull Vectors arXiv:2512.17632v1 Announce Type: new Abstract: High-dimensional covariance estimation is notoriously sensitive to outliers. While statistically optimal estimators exist for general heavy-tailed distributions, they often rely on computationally expensive techniques like semidefinite programming or iterative M-estimation ($O(d^3)$). In this work, we target the specific regime of…
-
Improving the Accuracy of Amortized Model Comparison with Self-Consistency
Improving the Accuracy of Amortized Model Comparison with Self-Consistency arXiv:2512.14308v1 Announce Type: new Abstract: Amortized Bayesian inference (ABI) offers fast, scalable approximations to posterior densities by training neural surrogates on data simulated from the statistical model. However, ABI methods are highly sensitive to model misspecification: when observed data fall outside the training distribution (generative scope…
-
The Interplay of Statistics and Noisy Optimization: Learning Linear Predictors with Random Data Weights
The Interplay of Statistics and Noisy Optimization: Learning Linear Predictors with Random Data Weights arXiv:2512.10188v1 Announce Type: new Abstract: We analyze gradient descent with randomly weighted data points in a linear regression model, under a generic weighting distribution. This includes various forms of stochastic gradient descent, importance sampling, but also extends to weighting distributions with…
-
Functional Random Forest with Adaptive Cost-Sensitive Splitting for Imbalanced Functional Data Classification
Functional Random Forest with Adaptive Cost-Sensitive Splitting for Imbalanced Functional Data Classification arXiv:2512.07888v1 Announce Type: new Abstract: Classification of functional data where observations are curves or trajectories poses unique challenges, particularly under severe class imbalance. Traditional Random Forest algorithms, while robust for tabular data, often fail to capture the intrinsic structure of functional observations and…
-
Optimization and Regularization Under Arbitrary Objectives
Optimization and Regularization Under Arbitrary Objectives arXiv:2511.19628v1 Announce Type: new Abstract: This study investigates the limitations of applying Markov Chain Monte Carlo (MCMC) methods to arbitrary objective functions, focusing on a two-block MCMC framework which alternates between Metropolis-Hastings and Gibbs sampling. While such approaches are often considered advantageous for enabling data-driven regularization, we show that…
-
Convex Clustering Redefined: Robust Learning with the Median of Means Estimator
Convex Clustering Redefined: Robust Learning with the Median of Means Estimator arXiv:2511.14784v1 Announce Type: new Abstract: Clustering approaches that utilize convex loss functions have recently attracted growing interest in the formation of compact data clusters. Although classical methods like k-means and its wide family of variants are still widely used, all of them require the…
-
Fast Riemannian-manifold Hamiltonian Monte Carlo for hierarchical Gaussian-process models
Fast Riemannian-manifold Hamiltonian Monte Carlo for hierarchical Gaussian-process models arXiv:2511.06407v1 Announce Type: new Abstract: Hierarchical Bayesian models based on Gaussian processes are considered useful for describing complex nonlinear statistical dependencies among variables in real-world data. However, effective Monte Carlo algorithms for inference with these models have not yet been established, except for several simple cases.…
-
Learning Paths for Dynamic Measure Transport: A Control Perspective
Learning Paths for Dynamic Measure Transport: A Control Perspective arXiv:2511.03797v1 Announce Type: new Abstract: We bring a control perspective to the problem of identifying paths of measures for sampling via dynamic measure transport (DMT). We highlight the fact that commonly used paths may be poor choices for DMT and connect existing methods for learning alternate…
-
A new class of Markov random fields enabling lightweight sampling
A new class of Markov random fields enabling lightweight sampling arXiv:2511.02373v1 Announce Type: new Abstract: This work addresses the problem of efficient sampling of Markov random fields (MRF). The sampling of Potts or Ising MRF is most often based on Gibbs sampling, and is thus computationally expensive. We consider in this work how to circumvent…
-
Gradient Boosted Mixed Models: Flexible Joint Estimation of Mean and Variance Components for Clustered Data
Gradient Boosted Mixed Models: Flexible Joint Estimation of Mean and Variance Components for Clustered Data arXiv:2511.00217v1 Announce Type: new Abstract: Linear mixed models are widely used for clustered data, but their reliance on parametric forms limits flexibility in complex and high-dimensional settings. In contrast, gradient boosting methods achieve high predictive accuracy through nonparametric estimation, but…
-
The analogy theorem in Hoare logic
The analogy theorem in Hoare logic arXiv:2510.03685v1 Announce Type: new Abstract: The introduction of machine learning methods has led to significant advances in automation, optimization, and discoveries in various fields of science and technology. However, their widespread application faces a fundamental limitation: the transfer of models between data domains generally lacks a rigorous mathematical justification.…
-
Adaptive randomized pivoting and volume sampling
Adaptive randomized pivoting and volume sampling arXiv:2510.02513v1 Announce Type: new Abstract: Adaptive randomized pivoting (ARP) is a recently proposed and highly effective algorithm for column subset selection. This paper reinterprets the ARP algorithm by drawing connections to the volume sampling distribution and active learning algorithms for linear regression. As consequences, this paper presents new analysis…
-
Diffusion and Flow-based Copulas: Forgetting and Remembering Dependencies
Diffusion and Flow-based Copulas: Forgetting and Remembering Dependencies arXiv:2509.19707v1 Announce Type: new Abstract: Copulas are a fundamental tool for modelling multivariate dependencies in data, forming the method of choice in diverse fields and applications. However, the adoption of existing models for multimodal and high-dimensional dependencies is hindered by restrictive assumptions and poor scaling. In this…
-
Scalable extensions to given-data Sobol’ index estimators
Scalable extensions to given-data Sobol’ index estimators arXiv:2509.09078v1 Announce Type: new Abstract: Given-data methods for variance-based sensitivity analysis have significantly advanced the feasibility of Sobol’ index computation for computationally expensive models and models with many inputs. However, the limitations of existing methods still preclude their application to models with an extremely large number of inputs.…
-
Gaussian process surrogate with physical law-corrected prior for multi-coupled PDEs defined on irregular geometry
Gaussian process surrogate with physical law-corrected prior for multi-coupled PDEs defined on irregular geometry arXiv:2509.02617v1 Announce Type: new Abstract: Parametric partial differential equations (PDEs) are fundamental mathematical tools for modeling complex physical systems, yet their numerical evaluation across parameter spaces remains computationally intensive when using conventional high-fidelity solvers. To address this challenge, we propose a…
-
Adaptive generative moment matching networks for improved learning of dependence structures
Adaptive generative moment matching networks for improved learning of dependence structures arXiv:2508.21531v1 Announce Type: new Abstract: An adaptive bandwidth selection procedure for the mixture kernel in the maximum mean discrepancy (MMD) for fitting generative moment matching networks (GMMNs) is introduced, and its ability to improve the learning of copula random number generators is demonstrated. Based…
-
Towards Trustworthy Amortized Bayesian Model Comparison
Towards Trustworthy Amortized Bayesian Model Comparison arXiv:2508.20614v1 Announce Type: new Abstract: Amortized Bayesian model comparison (BMC) enables fast probabilistic ranking of models via simulation-based training of neural surrogates. However, the reliability of neural surrogates deteriorates when simulation models are misspecified – the very case where model comparison is most needed. Thus, we supplement simulation-based training…
-
Evaluation and Optimization of Leave-one-out Cross-validation for the Lasso
Evaluation and Optimization of Leave-one-out Cross-validation for the Lasso arXiv:2508.14368v1 Announce Type: new Abstract: I develop an algorithm to produce the piecewise quadratic that computes leave-one-out cross-validation for the lasso as a function of its hyperparameter. The algorithm can be used to find exact hyperparameters that optimize leave-one-out cross-validation either globally or locally, and its…
-
An Introduction to Sliced Optimal Transport
An Introduction to Sliced Optimal Transport arXiv:2508.12519v1 Announce Type: new Abstract: Sliced Optimal Transport (SOT) is a rapidly developing branch of optimal transport (OT) that exploits the tractability of one-dimensional OT problems. By combining tools from OT, integral geometry, and computational statistics, SOT enables fast and scalable computation of distances, barycenters, and kernels for probability…
-
On computing and the complexity of computing higher-order $U$-statistics, exactly
On computing and the complexity of computing higher-order $U$-statistics, exactly arXiv:2508.12627v1 Announce Type: new Abstract: Higher-order $U$-statistics abound in fields such as statistics, machine learning, and computer science, but are known to be highly time-consuming to compute in practice. Despite their widespread appearance, a comprehensive study of their computational complexity is surprisingly lacking. This paper…
-
L1-Regularized Functional Support Vector Machine
L1-Regularized Functional Support Vector Machine arXiv:2508.05567v1 Announce Type: new Abstract: In functional data analysis, binary classification with one functional covariate has been extensively studied. We aim to fill in the gap of considering multivariate functional covariates in classification. In particular, we propose an $L_1$-regularized functional support vector machine for binary classification. An accompanying algorithm is…
-
Efficient optimization of expensive black-box simulators via marginal means, with application to neutrino detector design
Efficient optimization of expensive black-box simulators via marginal means, with application to neutrino detector design arXiv:2508.01834v1 Announce Type: new Abstract: With advances in scientific computing, computer experiments are increasingly used for optimizing complex systems. However, for modern applications, e.g., the optimization of nuclear physics detectors, each experiment run can require hundreds of CPU hours, making…
-
Sliding Window Informative Canonical Correlation Analysis
Sliding Window Informative Canonical Correlation Analysis arXiv:2507.17921v1 Announce Type: new Abstract: Canonical correlation analysis (CCA) is a technique for finding correlated sets of features between two datasets. In this paper, we propose a novel extension of CCA to the online, streaming data setting: Sliding Window Informative Canonical Correlation Analysis (SWICCA). Our method uses a streaming…
-
Bayesian Double Descent
Bayesian Double Descent arXiv:2507.07338v1 Announce Type: new Abstract: Double descent is a phenomenon of over-parameterized statistical models. Our goal is to view double descent from a Bayesian perspective. Over-parameterized models such as deep neural networks have an interesting re-descending property in their risk characteristics. This is a recent phenomenon in machine learning and has been…
-
Parsimonious Gaussian mixture models with piecewise-constant eigenvalue profiles
Parsimonious Gaussian mixture models with piecewise-constant eigenvalue profiles arXiv:2507.01542v1 Announce Type: new Abstract: Gaussian mixture models (GMMs) are ubiquitous in statistical learning, particularly for unsupervised problems. While full GMMs suffer from the overparameterization of their covariance matrices in high-dimensional spaces, spherical GMMs (with isotropic covariance matrices) certainly lack flexibility to fit certain anisotropic distributions. Connecting…
-
Forecasting Geopolitical Events with a Sparse Temporal Fusion Transformer and Gaussian Process Hybrid: A Case Study in Middle Eastern and U.S. Conflict Dynamics
Forecasting Geopolitical Events with a Sparse Temporal Fusion Transformer and Gaussian Process Hybrid: A Case Study in Middle Eastern and U.S. Conflict Dynamics arXiv:2506.20935v1 Announce Type: new Abstract: Forecasting geopolitical conflict from data sources like the Global Database of Events, Language, and Tone (GDELT) is a critical challenge for national security. The inherent sparsity, burstiness,…
-
Multilevel neural simulation-based inference
Multilevel neural simulation-based inference arXiv:2506.06087v1 Announce Type: new Abstract: Neural simulation-based inference (SBI) is a popular set of methods for Bayesian inference when models are only available in the form of a simulator. These methods are widely used in the sciences and engineering, where writing down a likelihood can be significantly more challenging than constructing…
-
Riemannian Principal Component Analysis
Riemannian Principal Component Analysis arXiv:2506.00226v1 Announce Type: new Abstract: This paper proposes an innovative extension of Principal Component Analysis (PCA) that transcends the traditional assumption of data lying in Euclidean space, enabling its application to data on Riemannian manifolds. The primary challenge addressed is the lack of vector space operations on such manifolds. Fletcher et…
-
Nearly Dimension-Independent Convergence of Mean-Field Black-Box Variational Inference
Nearly Dimension-Independent Convergence of Mean-Field Black-Box Variational Inference arXiv:2505.21721v1 Announce Type: new Abstract: We prove that, given a mean-field location-scale variational family, black-box variational inference (BBVI) with the reparametrization gradient converges at an almost dimension-independent rate. Specifically, for strongly log-concave and log-smooth targets, the number of iterations for BBVI with a sub-Gaussian family to achieve…
-
Online Statistical Inference of Constrained Stochastic Optimization via Random Scaling
Online Statistical Inference of Constrained Stochastic Optimization via Random Scaling arXiv:2505.18327v1 Announce Type: new Abstract: Constrained stochastic nonlinear optimization problems have attracted significant attention for their ability to model complex real-world scenarios in physics, economics, and biology. As datasets continue to grow, online inference methods have become crucial for enabling real-time decision-making without the need…
-
Liouville PDE-based sliced-Wasserstein flow for fair regression
Liouville PDE-based sliced-Wasserstein flow for fair regression arXiv:2505.17204v1 Announce Type: new Abstract: The sliced Wasserstein flow (SWF), a nonparametric and implicit generative gradient flow, is applied to fair regression. We have improved the SWF in a few aspects. First, the stochastic diffusive term from the Fokker-Planck equation-based Monte Carlo is transformed to Liouville partial differential…
-
Scalable Bayesian Monte Carlo: fast uncertainty estimation beyond deep ensembles
Scalable Bayesian Monte Carlo: fast uncertainty estimation beyond deep ensembles arXiv:2505.13585v1 Announce Type: new Abstract: This work introduces a new method called scalable Bayesian Monte Carlo (SBMC). The model interpolates between a point estimator and the posterior, and the algorithm is a parallel implementation of a consistent (asymptotically unbiased) Bayesian deep learning algorithm: sequential Monte…
-
Humble your Overconfident Networks: Unlearning Overfitting via Sequential Monte Carlo Tempered Deep Ensembles
Humble your Overconfident Networks: Unlearning Overfitting via Sequential Monte Carlo Tempered Deep Ensembles arXiv:2505.11671v1 Announce Type: new Abstract: Sequential Monte Carlo (SMC) methods offer a principled approach to Bayesian uncertainty quantification but are traditionally limited by the need for full-batch gradient evaluations. We introduce a scalable variant by incorporating Stochastic Gradient Hamiltonian Monte Carlo (SGHMC)…
-
Fast Likelihood-Free Parameter Estimation for Lévy Processes
Fast Likelihood-Free Parameter Estimation for Lévy Processes arXiv:2505.01639v1 Announce Type: new Abstract: Lévy processes are widely used in financial modeling due to their ability to capture discontinuities and heavy tails, which are common in high-frequency asset return data. However, parameter estimation remains a challenge when associated likelihoods are unavailable or costly to compute. We propose…
-
Bayesian learning of the optimal action-value function in a Markov decision process
Bayesian learning of the optimal action-value function in a Markov decision process arXiv:2505.01859v1 Announce Type: new Abstract: The Markov Decision Process (MDP) is a popular framework for sequential decision-making problems, and uncertainty quantification is an essential component of it to learn optimal decision-making strategies. In particular, a Bayesian framework is used to maintain beliefs about…
-
Extended Fiducial Inference for Individual Treatment Effects via Deep Neural Networks
Extended Fiducial Inference for Individual Treatment Effects via Deep Neural Networks arXiv:2505.01995v1 Announce Type: new Abstract: Individual treatment effect estimation has gained significant attention in recent data science literature. This work introduces the Double Neural Network (Double-NN) method to address this problem within the framework of extended fiducial inference (EFI). In the proposed method, deep…
-
Gaussian Differential Private Bootstrap by Subsampling
Gaussian Differential Private Bootstrap by Subsampling arXiv:2505.01197v1 Announce Type: new Abstract: Bootstrap is a common tool for quantifying uncertainty in data analysis. However, besides additional computational costs in the application of the bootstrap on massive data, a challenging problem in bootstrap based inference under Differential Privacy consists in the fact that it requires repeated access…
-
Gradient-Free Sequential Bayesian Experimental Design via Interacting Particle Systems
Gradient-Free Sequential Bayesian Experimental Design via Interacting Particle Systems arXiv:2504.13320v1 Announce Type: new Abstract: We introduce a gradient-free framework for Bayesian Optimal Experimental Design (BOED) in sequential settings, aimed at complex systems where gradient information is unavailable. Our method combines Ensemble Kalman Inversion (EKI) for design optimization with the Affine-Invariant Langevin Dynamics (ALDI) sampler for…
-
Online Multivariate Regularized Distributional Regression for High-dimensional Probabilistic Electricity Price Forecasting
Online Multivariate Regularized Distributional Regression for High-dimensional Probabilistic Electricity Price Forecasting arXiv:2504.02518v1 Announce Type: new Abstract: Probabilistic electricity price forecasting (PEPF) is a key task for market participants in short-term electricity markets. The increasing availability of high-frequency data and the need for real-time decision-making in energy markets require online estimation methods for efficient model updating.…
-
Tuning Sequential Monte Carlo Samplers via Greedy Incremental Divergence Minimization
Tuning Sequential Monte Carlo Samplers via Greedy Incremental Divergence Minimization arXiv:2503.15704v1 Announce Type: new Abstract: The performance of sequential Monte Carlo (SMC) samplers heavily depends on the tuning of the Markov kernels used in the path proposal. For SMC samplers with unadjusted Markov kernels, standard tuning objectives, such as the Metropolis-Hastings acceptance rate or the…
-
Explainable Bayesian deep learning through input-skip Latent Binary Bayesian Neural Networks
Explainable Bayesian deep learning through input-skip Latent Binary Bayesian Neural Networks arXiv:2503.10496v1 Announce Type: new Abstract: Modeling natural phenomena with artificial neural networks (ANNs) often provides highly accurate predictions. However, ANNs often suffer from over-parameterization, complicating interpretation and raising uncertainty issues. Bayesian neural networks (BNNs) address the latter by representing weights as probability distributions, allowing…
-
A Deep Bayesian Nonparametric Framework for Robust Mutual Information Estimation
A Deep Bayesian Nonparametric Framework for Robust Mutual Information Estimation arXiv:2503.08902v1 Announce Type: new Abstract: Mutual Information (MI) is a crucial measure for capturing dependencies between variables, but exact computation is challenging in high dimensions with intractable likelihoods, impacting accuracy and robustness. One idea is to use an auxiliary neural network to train an MI…
-
Multiple Linked Tensor Factorization
Multiple Linked Tensor Factorization arXiv:2502.20286v1 Announce Type: new Abstract: In biomedical research and other fields, it is now common to generate high content data that are both multi-source and multi-way. Multi-source data are collected from different high-throughput technologies while multi-way data are collected over multiple dimensions, yielding multiple tensor arrays. Integrative analysis of these data…
-
Generative Adversarial Networks for High-Dimensional Item Factor Analysis: A Deep Adversarial Learning Algorithm
Generative Adversarial Networks for High-Dimensional Item Factor Analysis: A Deep Adversarial Learning Algorithm arXiv:2502.10650v1 Announce Type: new Abstract: Advances in deep learning and representation learning have transformed item factor analysis (IFA) in the item response theory (IRT) literature by enabling more efficient and accurate parameter estimation. Variational Autoencoders (VAEs) have been one of the most…
-
Non-asymptotic Analysis of Diffusion Annealed Langevin Monte Carlo for Generative Modelling
Non-asymptotic Analysis of Diffusion Annealed Langevin Monte Carlo for Generative Modelling arXiv:2502.09306v1 Announce Type: new Abstract: We investigate the theoretical properties of general diffusion (interpolation) paths and their Langevin Monte Carlo implementation, referred to as diffusion annealed Langevin Monte Carlo (DALMC), under weak conditions on the data distribution. Specifically, we analyse and provide non-asymptotic error…
-
Online Covariance Matrix Estimation in Sketched Newton Methods
Online Covariance Matrix Estimation in Sketched Newton Methods arXiv:2502.07114v1 Announce Type: new Abstract: Given the ubiquity of streaming data, online algorithms have been widely used for parameter estimation, with second-order methods particularly standing out for their efficiency and robustness. In this paper, we study an online sketched Newton method that leverages a randomized sketching technique…
-
Complexity Analysis of Normalizing Constant Estimation: from Jarzynski Equality to Annealed Importance Sampling and beyond
Complexity Analysis of Normalizing Constant Estimation: from Jarzynski Equality to Annealed Importance Sampling and beyond arXiv:2502.04575v1 Announce Type: new Abstract: Given an unnormalized probability density $\pi\propto\mathrm{e}^{-V}$, estimating its normalizing constant $Z=\int_{\mathbb{R}^d}\mathrm{e}^{-V(x)}\mathrm{d}x$ or free energy $F=-\log Z$ is a crucial problem in Bayesian statistics, statistical mechanics, and machine learning. It is challenging especially in high dimensions…
-
Decentralized Inference for Distributed Geospatial Data Using Low-Rank Models
Decentralized Inference for Distributed Geospatial Data Using Low-Rank Models arXiv:2502.00309v1 Announce Type: new Abstract: Advancements in information technology have enabled the creation of massive spatial datasets, driving the need for scalable and efficient computational methodologies. While offering viable solutions, centralized frameworks are limited by vulnerabilities such as single-point failures and communication bottlenecks. This paper presents…
-
coverforest: Conformal Predictions with Random Forest in Python
coverforest: Conformal Predictions with Random Forest in Python arXiv:2501.14570v1 Announce Type: new Abstract: Conformal prediction provides a framework for uncertainty quantification, specifically in the forms of prediction intervals and sets with distribution-free guaranteed coverage. While recent cross-conformal techniques such as CV+ and Jackknife+-after-bootstrap achieve better data efficiency than traditional split conformal methods, they incur substantial…
-
LITE: Efficiently Estimating Gaussian Probability of Maximality
LITE: Efficiently Estimating Gaussian Probability of Maximality arXiv:2501.13535v1 Announce Type: new Abstract: We consider the problem of computing the probability of maximality (PoM) of a Gaussian random vector, i.e., the probability for each dimension to be maximal. This is a key challenge in applications ranging from Bayesian optimization to reinforcement learning, where the PoM not…
-
Simulation of Random LR Fuzzy Intervals
Simulation of Random LR Fuzzy Intervals arXiv:2501.10482v1 Announce Type: new Abstract: Random fuzzy variables join the modeling of the impreciseness (due to their “fuzzy part”) and randomness. Statistical samples of such objects are widely used, and their direct, numerically effective generation is therefore necessary. Usually, these samples consist of triangular or trapezoidal fuzzy numbers. In…
-
Majorization-Minimization Dual Stagewise Algorithm for Generalized Lasso
Majorization-Minimization Dual Stagewise Algorithm for Generalized Lasso arXiv:2501.02197v1 Announce Type: new Abstract: The generalized lasso is a natural generalization of the celebrated lasso approach to handle structural regularization problems. Many important methods and applications fall into this framework, including fused lasso, clustered lasso, and constrained lasso. To elevate its effectiveness in large-scale problems, extensive research…
-
Leveraging Black-box Models to Assess Feature Importance in Unconditional Distribution
Leveraging Black-box Models to Assess Feature Importance in Unconditional Distribution arXiv:2412.05759v1 Announce Type: new Abstract: Understanding how changes in explanatory features affect the unconditional distribution of the outcome is important in many applications. However, existing black-box predictive models are not readily suited for analyzing such questions. In this work, we develop an approximation method to…
-
The Polynomial Stein Discrepancy for Assessing Moment Convergence
The Polynomial Stein Discrepancy for Assessing Moment Convergence arXiv:2412.05135v1 Announce Type: new Abstract: We propose a novel method for measuring the discrepancy between a set of samples and a desired posterior distribution for Bayesian inference. Classical methods for assessing sample quality like the effective sample size are not appropriate for scalable Bayesian sampling algorithms, such…
-
Community Detection with Heterogeneous Block Covariance Model
Community Detection with Heterogeneous Block Covariance Model arXiv:2412.03780v1 Announce Type: new Abstract: Community detection is the task of clustering objects based on their pairwise relationships. Most of the model-based community detection methods, such as the stochastic block model and its variants, are designed for networks with binary (yes/no) edges. In many practical scenarios, edges often…
-
Pathwise optimization for bridge-type estimators and its applications
Pathwise optimization for bridge-type estimators and its applications arXiv:2412.04047v1 Announce Type: new Abstract: Sparse parametric models are of great interest in statistical learning and are often analyzed by means of regularized estimators. Pathwise methods allow one to efficiently compute the full solution path for penalized estimators, for any possible value of the penalization parameter $\lambda$. In…