Category: cs.LG

dynoGP: Deep Gaussian Processes for dynamic system identification

dynoGP: Deep Gaussian Processes for dynamic system identification arXiv:2502.05620v1 Announce Type: new Abstract: In this work, we present a novel approach to system identification for dynamical systems, based on a specific class of Deep Gaussian Processes (Deep GPs). These models are constructed by interconnecting linear dynamic GPs (equivalent to stochastic linear time-invariant dynamical systems) and…

February 11, 2025
Generalized Venn and Venn-Abers Calibration with Applications in Conformal Prediction

Generalized Venn and Venn-Abers Calibration with Applications in Conformal Prediction arXiv:2502.05676v1 Announce Type: new Abstract: Ensuring model calibration is critical for reliable predictions, yet popular distribution-free methods, such as histogram binning and isotonic regression, provide only asymptotic guarantees. We introduce a unified framework for Venn and Venn-Abers calibration, generalizing Vovk’s binary classification approach to arbitrary…

February 11, 2025
TD(0) Learning converges for Polynomial mixing and non-linear functions

TD(0) Learning converges for Polynomial mixing and non-linear functions arXiv:2502.05706v1 Announce Type: new Abstract: Theoretical work on Temporal Difference (TD) learning has provided finite-sample and high-probability guarantees for data generated from Markov chains. However, these bounds typically require linear function approximation, instance-dependent step sizes, algorithmic modifications, and restrictive mixing rates. We present theoretical findings for…

February 11, 2025
Sparsity-Based Interpolation of External, Internal and Swap Regret

Sparsity-Based Interpolation of External, Internal and Swap Regret arXiv:2502.04543v1 Announce Type: new Abstract: Focusing on the expert problem in online learning, this paper studies the interpolation of several performance metrics via $\phi$-regret minimization, which measures the performance of an algorithm by its regret with respect to an arbitrary action modification rule $\phi$. With $d$ experts…

February 10, 2025
Optimistic Algorithms for Adaptive Estimation of the Average Treatment Effect

Optimistic Algorithms for Adaptive Estimation of the Average Treatment Effect arXiv:2502.04673v1 Announce Type: new Abstract: Estimation and inference for the Average Treatment Effect (ATE) is a cornerstone of causal inference and often serves as the foundation for developing procedures for more complicated settings. Although traditionally analyzed in a batch setting, recent advances in martingale theory…

February 10, 2025
Complexity Analysis of Normalizing Constant Estimation: from Jarzynski Equality to Annealed Importance Sampling and beyond

Complexity Analysis of Normalizing Constant Estimation: from Jarzynski Equality to Annealed Importance Sampling and beyond arXiv:2502.04575v1 Announce Type: new Abstract: Given an unnormalized probability density $\pi\propto\mathrm{e}^{-V}$, estimating its normalizing constant $Z=\int_{\mathbb{R}^d}\mathrm{e}^{-V(x)}\mathrm{d}x$ or free energy $F=-\log Z$ is a crucial problem in Bayesian statistics, statistical mechanics, and machine learning. It is challenging especially in high dimensions…

February 10, 2025
A Meta-learner for Heterogeneous Effects in Difference-in-Differences

A Meta-learner for Heterogeneous Effects in Difference-in-Differences arXiv:2502.04699v1 Announce Type: new Abstract: We address the problem of estimating heterogeneous treatment effects in panel data, adopting the popular Difference-in-Differences (DiD) framework under the conditional parallel trends assumption. We propose a novel doubly robust meta-learner for the Conditional Average Treatment Effect on the Treated (CATT), reducing the…

February 10, 2025
PhyloVAE: Unsupervised Learning of Phylogenetic Trees via Variational Autoencoders

PhyloVAE: Unsupervised Learning of Phylogenetic Trees via Variational Autoencoders arXiv:2502.04730v1 Announce Type: new Abstract: Learning informative representations of phylogenetic tree structures is essential for analyzing evolutionary relationships. Classical distance-based methods have been widely used to project phylogenetic trees into Euclidean space, but they are often sensitive to the choice of distance metric and may lack…

February 10, 2025
Two in context learning tasks with complex functions

Two in context learning tasks with complex functions arXiv:2502.03503v1 Announce Type: new Abstract: We examine two in context learning (ICL) tasks with mathematical functions in several train and test settings for transformer models. Our study generalizes work on linear functions by showing that small transformers, even models with attention layers only, can approximate arbitrary polynomial…

February 7, 2025
Multivariate Conformal Prediction using Optimal Transport

Multivariate Conformal Prediction using Optimal Transport arXiv:2502.03609v1 Announce Type: new Abstract: Conformal prediction (CP) quantifies the uncertainty of machine learning models by constructing sets of plausible outputs. These sets are constructed by leveraging a so-called conformity score, a quantity computed using the input point of interest, a prediction model, and past observations. CP sets are…

February 7, 2025
Online Learning Algorithms in Hilbert Spaces with $\beta-$ and $\phi-$Mixing Sequences

Online Learning Algorithms in Hilbert Spaces with $\beta-$ and $\phi-$Mixing Sequences arXiv:2502.03551v1 Announce Type: new Abstract: In this paper, we study an online algorithm in a reproducing kernel Hilbert space (RKHS) based on a class of dependent processes, called the mixing process. For such a process, the degree of dependence is measured by various mixing…

February 7, 2025
Rule-based Evolving Fuzzy System for Time Series Forecasting: New Perspectives Based on Type-2 Fuzzy Sets Measures Approach

Rule-based Evolving Fuzzy System for Time Series Forecasting: New Perspectives Based on Type-2 Fuzzy Sets Measures Approach arXiv:2502.03650v1 Announce Type: new Abstract: Real-world data contain uncertainty and variations that can be correlated to external variables, known as randomness. An alternative cause of randomness is chaos, which can be an important component of chaotic time series.…

February 7, 2025
Guiding Two-Layer Neural Network Lipschitzness via Gradient Descent Learning Rate Constraints

Guiding Two-Layer Neural Network Lipschitzness via Gradient Descent Learning Rate Constraints arXiv:2502.03792v1 Announce Type: new Abstract: We demonstrate that applying an eventual decay to the learning rate (LR) in empirical risk minimization (ERM), where the mean-squared-error loss is minimized using standard gradient descent (GD) for training a two-layer neural network with Lipschitz activation functions, ensures…

February 7, 2025
Networks with Finite VC Dimension: Pro and Contra

Networks with Finite VC Dimension: Pro and Contra arXiv:2502.02679v1 Announce Type: new Abstract: Approximation and learning of classifiers of large data sets by neural networks in terms of high-dimensional geometry and statistical learning theory are investigated. The influence of the VC dimension of sets of input-output functions of networks on approximation capabilities is compared with…

February 6, 2025
Achievable distributional robustness when the robust risk is only partially identified

Achievable distributional robustness when the robust risk is only partially identified arXiv:2502.02710v1 Announce Type: new Abstract: In safety-critical applications, machine learning models should generalize well under worst-case distribution shifts, that is, have a small robust risk. Invariance-based algorithms can provably take advantage of structural assumptions on the shifts when the training distributions are heterogeneous enough…

February 6, 2025
Algorithms with Calibrated Machine Learning Predictions

Algorithms with Calibrated Machine Learning Predictions arXiv:2502.02861v1 Announce Type: new Abstract: The field of algorithms with predictions incorporates machine learning advice in the design of online algorithms to improve real-world performance. While this theoretical framework often assumes uniform reliability across all predictions, modern machine learning models can now provide instance-level uncertainty estimates. In this paper,…

February 6, 2025
Gap-Dependent Bounds for Federated $Q$-learning

Gap-Dependent Bounds for Federated $Q$-learning arXiv:2502.02859v1 Announce Type: new Abstract: We present the first gap-dependent analysis of regret and communication cost for on-policy federated $Q$-Learning in tabular episodic finite-horizon Markov decision processes (MDPs). Existing FRL methods focus on worst-case scenarios, leading to $\sqrt{T}$-type regret bounds and communication cost bounds with a $\log T$ term scaling…

February 6, 2025
Uncertainty Quantification with the Empirical Neural Tangent Kernel

Uncertainty Quantification with the Empirical Neural Tangent Kernel arXiv:2502.02870v1 Announce Type: new Abstract: While neural networks have demonstrated impressive performance across various tasks, accurately quantifying uncertainty in their predictions is essential to ensure their trustworthiness and enable widespread adoption in critical systems. Several Bayesian uncertainty quantification (UQ) methods exist that are either cheap or reliable,…

February 6, 2025
Doubly Robust Monte Carlo Tree Search

Doubly Robust Monte Carlo Tree Search arXiv:2502.01672v1 Announce Type: new Abstract: We present Doubly Robust Monte Carlo Tree Search (DR-MCTS), a novel algorithm that integrates Doubly Robust (DR) off-policy estimation into Monte Carlo Tree Search (MCTS) to enhance sample efficiency and decision quality in complex environments. Our approach introduces a hybrid estimator that combines MCTS…

February 5, 2025
Graph Canonical Correlation Analysis

Graph Canonical Correlation Analysis arXiv:2502.01780v1 Announce Type: new Abstract: Canonical correlation analysis (CCA) is a widely used technique for estimating associations between two sets of multi-dimensional variables. Recent advancements in CCA methods have expanded their application to decipher the interactions of multiomics datasets, imaging-omics datasets, and more. However, conventional CCA methods are limited in their…

February 5, 2025
Poisson Hierarchical Indian Buffet Processes for Within and Across Group Sharing of Latent Features-With Indications for Microbiome Species Sampling Models

Poisson Hierarchical Indian Buffet Processes for Within and Across Group Sharing of Latent Features-With Indications for Microbiome Species Sampling Models arXiv:2502.01919v1 Announce Type: new Abstract: In this work, we present a comprehensive Bayesian posterior analysis of what we term Poisson Hierarchical Indian Buffet Processes, designed for complex random sparse count species sampling models that allow…

February 5, 2025
Local minima of the empirical risk in high dimension: General theorems and convex examples

Local minima of the empirical risk in high dimension: General theorems and convex examples arXiv:2502.01953v1 Announce Type: new Abstract: We consider a general model for high-dimensional empirical risk minimization whereby the data $\mathbf{x}_i$ are $d$-dimensional isotropic Gaussian vectors, the model is parametrized by $\mathbf{\Theta}\in\mathbb{R}^{d\times k}$, and the loss depends on the data via the projection…

February 5, 2025
Theoretical and Practical Analysis of Fréchet Regression via Comparison Geometry

Theoretical and Practical Analysis of Fréchet Regression via Comparison Geometry arXiv:2502.01995v1 Announce Type: new Abstract: Fréchet regression extends classical regression methods to non-Euclidean metric spaces, enabling the analysis of data relationships on complex structures such as manifolds and graphs. This work establishes a rigorous theoretical analysis for Fréchet regression through the lens of comparison geometry…

February 5, 2025
Learning Difference-of-Convex Regularizers for Inverse Problems: A Flexible Framework with Theoretical Guarantees

Learning Difference-of-Convex Regularizers for Inverse Problems: A Flexible Framework with Theoretical Guarantees arXiv:2502.00240v1 Announce Type: new Abstract: Learning effective regularization is crucial for solving ill-posed inverse problems, which arise in a wide range of scientific and engineering applications. While data-driven methods that parameterize regularizers using deep neural networks have demonstrated strong empirical performance, they often…

February 4, 2025
Supervised Quadratic Feature Analysis: An Information Geometry Approach to Dimensionality Reduction

Supervised Quadratic Feature Analysis: An Information Geometry Approach to Dimensionality Reduction arXiv:2502.00168v1 Announce Type: new Abstract: Supervised dimensionality reduction aims to map labeled data to a low-dimensional feature space while maximizing class discriminability. Despite the availability of methods for learning complex non-linear features (e.g. Deep Learning), there is an enduring demand for dimensionality reduction methods…

February 4, 2025
Learning to Fuse Temporal Proximity Networks: A Case Study in Chimpanzee Social Interactions

Learning to Fuse Temporal Proximity Networks: A Case Study in Chimpanzee Social Interactions arXiv:2502.00302v1 Announce Type: new Abstract: How can we identify groups of primate individuals which could be conjectured to drive social structure? To address this question, one of us has collected a time series of data for social interactions between chimpanzees. Here we…

February 4, 2025
Decentralized Inference for Distributed Geospatial Data Using Low-Rank Models

Decentralized Inference for Distributed Geospatial Data Using Low-Rank Models arXiv:2502.00309v1 Announce Type: new Abstract: Advancements in information technology have enabled the creation of massive spatial datasets, driving the need for scalable and efficient computational methodologies. While offering viable solutions, centralized frameworks are limited by vulnerabilities such as single-point failures and communication bottlenecks. This paper presents…

February 4, 2025
Variance Reduction via Resampling and Experience Replay

Variance Reduction via Resampling and Experience Replay arXiv:2502.00520v1 Announce Type: new Abstract: Experience replay is a foundational technique in reinforcement learning that enhances learning stability by storing past experiences in a replay buffer and reusing them during training. Despite its practical success, its theoretical properties remain underexplored. In this paper, we present a theoretical framework…

February 4, 2025
Adaptivity and Convergence of Probability Flow ODEs in Diffusion Generative Models

Adaptivity and Convergence of Probability Flow ODEs in Diffusion Generative Models arXiv:2501.18863v1 Announce Type: new Abstract: Score-based generative models, which transform noise into data by learning to reverse a diffusion process, have become a cornerstone of modern generative AI. This paper contributes to establishing theoretical guarantees for the probability flow ODE, a widely used diffusion-based…

February 3, 2025
A Unified Framework for Entropy Search and Expected Improvement in Bayesian Optimization

A Unified Framework for Entropy Search and Expected Improvement in Bayesian Optimization arXiv:2501.18756v1 Announce Type: new Abstract: Bayesian optimization is a widely used method for optimizing expensive black-box functions, with Expected Improvement being one of the most commonly used acquisition functions. In contrast, information-theoretic acquisition functions aim to reduce uncertainty about the function’s optimum and…

February 3, 2025
Trustworthy Evaluation of Generative AI Models

Trustworthy Evaluation of Generative AI Models arXiv:2501.18897v1 Announce Type: new Abstract: Generative AI (GenAI) models have recently achieved remarkable empirical performance in various applications; however, their evaluations still lack uncertainty quantification. In this paper, we propose a method to compare two generative models based on an unbiased estimator of their relative performance gap. Statistically, our…

February 3, 2025
Optimizing Through Change: Bounds and Recommendations for Time-Varying Bayesian Optimization Algorithms

Optimizing Through Change: Bounds and Recommendations for Time-Varying Bayesian Optimization Algorithms arXiv:2501.18963v1 Announce Type: new Abstract: Time-Varying Bayesian Optimization (TVBO) is the go-to framework for optimizing a time-varying, expensive, noisy black-box function. However, most of the solutions proposed so far either rely on unrealistic assumptions on the nature of the objective function or do not…

February 3, 2025
Optimal Transport-based Conformal Prediction

Optimal Transport-based Conformal Prediction arXiv:2501.18991v1 Announce Type: new Abstract: Conformal Prediction (CP) is a principled framework for quantifying uncertainty in blackbox learning models, by constructing prediction sets with finite-sample coverage guarantees. Traditional approaches rely on scalar nonconformity scores, which fail to fully exploit the geometric structure of multivariate outputs, such as in multi-output regression or…

February 3, 2025
Knoop: Practical Enhancement of Knockoff with Over-Parameterization for Variable Selection

Knoop: Practical Enhancement of Knockoff with Over-Parameterization for Variable Selection arXiv:2501.17889v1 Announce Type: new Abstract: Variable selection plays a crucial role in enhancing modeling effectiveness across diverse fields, addressing the challenges posed by high-dimensional datasets of correlated variables. This work introduces a novel approach namely Knockoff with over-parameterization (Knoop) to enhance Knockoff filters for variable…

January 31, 2025
Heterogeneous Multi-Player Multi-Armed Bandits Robust To Adversarial Attacks

Heterogeneous Multi-Player Multi-Armed Bandits Robust To Adversarial Attacks arXiv:2501.17882v1 Announce Type: new Abstract: We consider a multi-player multi-armed bandit setting in the presence of adversaries that attempt to negatively affect the rewards received by the players in the system. The reward distributions for any given arm are heterogeneous across the players. In the event of…

January 31, 2025
U-aggregation: Unsupervised Aggregation of Multiple Learning Algorithms

U-aggregation: Unsupervised Aggregation of Multiple Learning Algorithms arXiv:2501.18084v1 Announce Type: new Abstract: Across various domains, the growing advocacy for open science and open-source machine learning has made an increasing number of models publicly available. These models allow practitioners to integrate them into their own contexts, reducing the need for extensive data labeling, training, and calibration.…

January 31, 2025
Optimal Survey Design for Private Mean Estimation

Optimal Survey Design for Private Mean Estimation arXiv:2501.18121v1 Announce Type: new Abstract: This work identifies the first privacy-aware stratified sampling scheme that minimizes the variance for general private mean estimation under the Laplace, Discrete Laplace (DLap) and Truncated-Uniform-Laplace (TuLap) mechanisms within the framework of differential privacy (DP). We view stratified sampling as a subsampling operation,…

January 31, 2025
Random Feature Representation Boosting

Random Feature Representation Boosting arXiv:2501.18283v1 Announce Type: new Abstract: We introduce Random Feature Representation Boosting (RFRBoost), a novel method for constructing deep residual random feature neural networks (RFNNs) using boosting theory. RFRBoost uses random features at each layer to learn the functional gradient of the network representation, enhancing performance while preserving the convex optimization benefits…

January 31, 2025
Near-Optimal Algorithms for Omniprediction

Near-Optimal Algorithms for Omniprediction arXiv:2501.17205v1 Announce Type: new Abstract: Omnipredictors are simple prediction functions that encode loss-minimizing predictions with respect to a hypothesis class $H$, simultaneously for every loss function within a class of losses $L$. In this work, we give near-optimal learning algorithms for omniprediction, in both the online and offline settings. To begin,…

January 30, 2025
Testing Conditional Mean Independence Using Generative Neural Networks

Testing Conditional Mean Independence Using Generative Neural Networks arXiv:2501.17345v1 Announce Type: new Abstract: Conditional mean independence (CMI) testing is crucial for statistical tasks including model determination and variable importance evaluation. In this work, we introduce a novel population CMI measure and a bootstrap-based testing procedure that utilizes deep generative neural networks to estimate the conditional…

January 30, 2025
A Survey on Cluster-based Federated Learning

A Survey on Cluster-based Federated Learning arXiv:2501.17512v1 Announce Type: new Abstract: As the industrial and commercial use of Federated Learning (FL) has expanded, so has the need for optimized algorithms. In settings where FL clients’ data is non-independently and identically distributed (non-IID) and with highly heterogeneous distributions, the baseline FL approach seems to fall short.…

January 30, 2025
Exact characterization of $\epsilon$-Safe Decision Regions for exponential family distributions and Multi Cost SVM approximation

Exact characterization of $\epsilon$-Safe Decision Regions for exponential family distributions and Multi Cost SVM approximation arXiv:2501.17731v1 Announce Type: new Abstract: Probabilistic guarantees on the prediction of data-driven classifiers are necessary to define models that can be considered reliable. This is a key requirement for modern machine learning in which the goodness of a system is…

January 30, 2025
Sequential Learning of the Pareto Front for Multi-objective Bandits

Sequential Learning of the Pareto Front for Multi-objective Bandits arXiv:2501.17513v1 Announce Type: new Abstract: We study the problem of sequential learning of the Pareto front in multi-objective multi-armed bandits. An agent is faced with K possible arms to pull. At each turn she picks one, and receives a vector-valued reward. When she thinks she has…

January 30, 2025
Nonparametric Sparse Online Learning of the Koopman Operator

Nonparametric Sparse Online Learning of the Koopman Operator arXiv:2501.16489v1 Announce Type: new Abstract: The Koopman operator provides a powerful framework for representing the dynamics of general nonlinear dynamical systems. Data-driven techniques to learn the Koopman operator typically assume that the chosen function space is closed under system dynamics. In this paper, we study the Koopman…

January 29, 2025
Variational Schrödinger Momentum Diffusion

Variational Schrödinger Momentum Diffusion arXiv:2501.16675v1 Announce Type: new Abstract: The momentum Schrödinger Bridge (mSB) has emerged as a leading method for accelerating generative diffusion processes and reducing transport costs. However, the lack of simulation-free properties inevitably results in high training costs and affects scalability. To obtain a trade-off between transport properties and scalability, we introduce…

January 29, 2025
Exponential Family Attention

Exponential Family Attention arXiv:2501.16790v1 Announce Type: new Abstract: The self-attention mechanism is the backbone of the transformer neural network underlying most large language models. It can capture complex word patterns and long-range dependencies in natural language. This paper introduces exponential family attention (EFA), a probabilistic generative model that extends self-attention to handle high-dimensional sequence, spatial,…

January 29, 2025
Towards the Generalization of Multi-view Learning: An Information-theoretical Analysis

Towards the Generalization of Multi-view Learning: An Information-theoretical Analysis arXiv:2501.16768v1 Announce Type: new Abstract: Multiview learning has drawn widespread attention for its efficacy in leveraging cross-view consensus and complementarity information to achieve a comprehensive representation of data. While multi-view learning has undergone vigorous development and achieved remarkable success, the theoretical understanding of its generalization behavior…

January 29, 2025
Marginal and Conditional Importance Measures from Machine Learning Models and Their Relationship with Conditional Average Treatment Effect

Marginal and Conditional Importance Measures from Machine Learning Models and Their Relationship with Conditional Average Treatment Effect arXiv:2501.16988v1 Announce Type: new Abstract: Interpreting black-box machine learning models is challenging due to their strong dependence on data and inherently non-parametric nature. This paper reintroduces the concept of importance through “Marginal Variable Importance Metric” (MVIM), a model-agnostic…

January 29, 2025
ED-Filter: Dynamic Feature Filtering for Eating Disorder Classification

ED-Filter: Dynamic Feature Filtering for Eating Disorder Classification arXiv:2501.14785v1 Announce Type: new Abstract: Eating disorders (ED) are critical psychiatric problems that have alarmed the mental health community. Mental health professionals are increasingly recognizing the utility of data derived from social media platforms such as Twitter. However, high dimensionality and extensive feature sets of Twitter data…

January 28, 2025
Explaining Categorical Feature Interactions Using Graph Covariance and LLMs

Explaining Categorical Feature Interactions Using Graph Covariance and LLMs arXiv:2501.14932v1 Announce Type: new Abstract: Modern datasets often consist of numerous samples with abundant features and associated timestamps. Analyzing such datasets to uncover underlying events typically requires complex statistical methods and substantial domain expertise. A notable example, and the primary data focus of this paper, is…

January 28, 2025
Median of Forests for Robust Density Estimation

Median of Forests for Robust Density Estimation arXiv:2501.15157v1 Announce Type: new Abstract: Robust density estimation refers to the consistent estimation of the density function even when the data is contaminated by outliers. We find that existing forest density estimation at a certain point is inherently resistant to the outliers outside the cells containing the point,…

January 28, 2025
Conformal Inference of Individual Treatment Effects Using Conditional Density Estimates

Conformal Inference of Individual Treatment Effects Using Conditional Density Estimates arXiv:2501.14933v1 Announce Type: new Abstract: In an era where diverse and complex data are increasingly accessible, the precise prediction of individual treatment effects (ITE) becomes crucial across fields such as healthcare, economics, and public policy. Current state-of-the-art approaches, while providing valid prediction intervals through Conformal…

January 28, 2025
A Review on Self-Supervised Learning for Time Series Anomaly Detection: Recent Advances and Open Challenges

A Review on Self-Supervised Learning for Time Series Anomaly Detection: Recent Advances and Open Challenges arXiv:2501.15196v1 Announce Type: new Abstract: Time series anomaly detection presents various challenges due to the sequential and dynamic nature of time-dependent data. Traditional unsupervised methods frequently encounter difficulties in generalization, often overfitting to known normal patterns observed during training and…

January 28, 2025
Distributionally Robust Coreset Selection under Covariate Shift

Distributionally Robust Coreset Selection under Covariate Shift arXiv:2501.14253v1 Announce Type: new Abstract: Coreset selection, which involves selecting a small subset from an existing training dataset, is an approach to reducing training data, and various approaches have been proposed for this method. In practical situations where these methods are employed, it is often the case that…

January 27, 2025
EFiGP: Eigen-Fourier Physics-Informed Gaussian Process for Inference of Dynamic Systems

EFiGP: Eigen-Fourier Physics-Informed Gaussian Process for Inference of Dynamic Systems arXiv:2501.14107v1 Announce Type: new Abstract: Parameter estimation and trajectory reconstruction for data-driven dynamical systems governed by ordinary differential equations (ODEs) are essential tasks in fields such as biology, engineering, and physics. These inverse problems — estimating ODE parameters from observational data — are particularly challenging…

January 27, 2025
Statistical Verification of Linear Classifiers

Statistical Verification of Linear Classifiers arXiv:2501.14430v1 Announce Type: new Abstract: We propose a homogeneity test closely related to the concept of linear separability between two samples. Using the test one can answer the question whether a linear classifier is merely “random” or effectively captures differences between two classes. We focus on establishing upper bounds for…

January 27, 2025
coverforest: Conformal Predictions with Random Forest in Python

coverforest: Conformal Predictions with Random Forest in Python arXiv:2501.14570v1 Announce Type: new Abstract: Conformal prediction provides a framework for uncertainty quantification, specifically in the forms of prediction intervals and sets with distribution-free guaranteed coverage. While recent cross-conformal techniques such as CV+ and Jackknife+-after-bootstrap achieve better data efficiency than traditional split conformal methods, they incur substantial…

January 27, 2025
Optimal Transport Barycenter via Nonconvex-Concave Minimax Optimization

Optimal Transport Barycenter via Nonconvex-Concave Minimax Optimization arXiv:2501.14635v1 Announce Type: new Abstract: The optimal transport barycenter (a.k.a. Wasserstein barycenter) is a fundamental notion of averaging that extends from the Euclidean space to the Wasserstein space of probability distributions. Computation of the unregularized barycenter for discretized probability distributions on point clouds is a challenging task when…

January 27, 2025
Robust Amortized Bayesian Inference with Self-Consistency Losses on Unlabeled Data

Robust Amortized Bayesian Inference with Self-Consistency Losses on Unlabeled Data arXiv:2501.13483v1 Announce Type: new Abstract: Neural amortized Bayesian inference (ABI) can solve probabilistic inverse problems orders of magnitude faster than classical methods. However, neural ABI is not yet sufficiently robust for widespread and safe applicability. In particular, when performing inference on observations outside of the…

January 24, 2025
LITE: Efficiently Estimating Gaussian Probability of Maximality

LITE: Efficiently Estimating Gaussian Probability of Maximality arXiv:2501.13535v1 Announce Type: new Abstract: We consider the problem of computing the probability of maximality (PoM) of a Gaussian random vector, i.e., the probability for each dimension to be maximal. This is a key challenge in applications ranging from Bayesian optimization to reinforcement learning, where the PoM not…

January 24, 2025
Learning under Commission and Omission Event Outliers

Learning under Commission and Omission Event Outliers arXiv:2501.13599v1 Announce Type: new Abstract: Event stream is an important data format in real life. The events are usually expected to follow some regular patterns over time. However, the patterns could be contaminated by unexpected absences or occurrences of events. In this paper, we adopt the temporal point…

January 24, 2025
A dimensionality reduction technique based on the Gromov-Wasserstein distance

A dimensionality reduction technique based on the Gromov-Wasserstein distance arXiv:2501.13732v1 Announce Type: new Abstract: Analyzing relationships between objects is a pivotal problem within data science. In this context, dimensionality reduction (DR) techniques are employed to generate smaller and more manageable data representations. This paper proposes a new method for dimensionality reduction, based on optimal transportation…

January 24, 2025
Ultralow-dimensionality reduction for identifying critical transitions by spatial-temporal PCA

Ultralow-dimensionality reduction for identifying critical transitions by spatial-temporal PCA arXiv:2501.12582v1 Announce Type: new Abstract: Discovering dominant patterns and exploring dynamic behaviors, especially critical state transitions and tipping points, in high-dimensional time-series data are challenging tasks in the study of real-world complex systems, which demand interpretable data representations to facilitate comprehension of both spatial and temporal information…

January 23, 2025
Sequential Change Point Detection via Denoising Score Matching

Sequential Change Point Detection via Denoising Score Matching arXiv:2501.12667v1 Announce Type: new Abstract: Sequential change-point detection plays a critical role in numerous real-world applications, where timely identification of distributional shifts can greatly mitigate adverse outcomes. Classical methods commonly rely on parametric density assumptions of pre- and post-change distributions, limiting their effectiveness for high-dimensional, complex data…

January 23, 2025
On Generalization and Distributional Update for Mimicking Observations with Adequate Exploration

On Generalization and Distributional Update for Mimicking Observations with Adequate Exploration arXiv:2501.12785v1 Announce Type: new Abstract: This paper tackles the efficiency and stability issues in learning from observations (LfO). We commence by investigating how reward functions and policies generalize in LfO. Subsequently, the built-in reinforcement learning (RL) approach in generative adversarial imitation from observation (GAIfO)…

January 23, 2025
Singular learning coefficients and efficiency in learning theory

Singular learning coefficients and efficiency in learning theory arXiv:2501.12747v1 Announce Type: new Abstract: Singular learning models with non-positive Fisher information matrices include neural networks, reduced-rank regression, Boltzmann machines, normal mixture models, and others. These models have been widely used in the development of learning machines. However, theoretical analysis is still in its early stages. In…

January 23, 2025
Fixed-Budget Change Point Identification in Piecewise Constant Bandits

Fixed-Budget Change Point Identification in Piecewise Constant Bandits arXiv:2501.12957v1 Announce Type: new Abstract: We study the piecewise constant bandit problem where the expected reward is a piecewise constant function with one change point (discontinuity) across the action space $[0,1]$ and the learner’s aim is to locate the change point. Under the assumption of a fixed…

January 23, 2025
Extension of Symmetrized Neural Network Operators with Fractional and Mixed Activation Functions

Extension of Symmetrized Neural Network Operators with Fractional and Mixed Activation Functions arXiv:2501.10496v1 Announce Type: new Abstract: We propose a novel extension to symmetrized neural network operators by incorporating fractional and mixed activation functions. This study addresses the limitations of existing models in approximating higher-order smooth functions, particularly in complex and high-dimensional spaces. Our framework…

January 22, 2025
Simulation of Random LR Fuzzy Intervals

Simulation of Random LR Fuzzy Intervals arXiv:2501.10482v1 Announce Type: new Abstract: Random fuzzy variables combine the modeling of imprecision (through their “fuzzy part”) with randomness. Statistical samples of such objects are widely used, and their direct, numerically effective generation is therefore necessary. Usually, these samples consist of triangular or trapezoidal fuzzy numbers. In…

January 22, 2025
Multi-Output Conformal Regression: A Unified Comparative Study with New Conformity Scores

Multi-Output Conformal Regression: A Unified Comparative Study with New Conformity Scores arXiv:2501.10533v1 Announce Type: new Abstract: Quantifying uncertainty in multivariate regression is essential in many real-world applications, yet existing methods for constructing prediction regions often face limitations such as the inability to capture complex dependencies, lack of coverage guarantees, or high computational cost. Conformal prediction…

January 22, 2025
DPERC: Direct Parameter Estimation for Mixed Data

DPERC: Direct Parameter Estimation for Mixed Data arXiv:2501.10540v1 Announce Type: new Abstract: The covariance matrix is a foundation in numerous statistical and machine-learning applications such as Principal Component Analysis, Correlation Heatmap, etc. However, missing values within datasets present a formidable obstacle to accurately estimating this matrix. While imputation methods offer one avenue for addressing this…

January 22, 2025
Model-Robust and Adaptive-Optimal Transfer Learning for Tackling Concept Shifts in Nonparametric Regression

Model-Robust and Adaptive-Optimal Transfer Learning for Tackling Concept Shifts in Nonparametric Regression arXiv:2501.10870v1 Announce Type: new Abstract: When concept shifts and sample scarcity are present in the target domain of interest, nonparametric regression learners often struggle to generalize effectively. The technique of transfer learning remedies these issues by leveraging data or pre-trained models from similar…

January 22, 2025
SBAMDT: Bayesian Additive Decision Trees with Adaptive Soft Semi-multivariate Split Rules

SBAMDT: Bayesian Additive Decision Trees with Adaptive Soft Semi-multivariate Split Rules arXiv:2501.09900v1 Announce Type: new Abstract: Bayesian Additive Regression Trees [BART, Chipman et al., 2010] have gained significant popularity due to their remarkable predictive performance and ability to quantify uncertainty. However, standard decision tree models rely on recursive data splits at each decision node, using…

January 20, 2025
Tracking student skills real-time through a continuous-variable dynamic Bayesian network

Tracking student skills real-time through a continuous-variable dynamic Bayesian network arXiv:2501.10050v1 Announce Type: new Abstract: The field of Knowledge Tracing is focused on predicting the success rate of a student for a given skill. Modern methods like Deep Knowledge Tracing provide accurate estimates given enough data, but being based on neural networks they struggle to…

January 20, 2025
Statistical Inference for Sequential Feature Selection after Domain Adaptation

Statistical Inference for Sequential Feature Selection after Domain Adaptation arXiv:2501.09933v1 Announce Type: new Abstract: In high-dimensional regression, feature selection methods, such as sequential feature selection (SeqFS), are commonly used to identify relevant features. When data is limited, domain adaptation (DA) becomes crucial for transferring knowledge from a related source domain to a target domain, improving…

January 20, 2025
Contributions to the Decision Theoretic Foundations of Machine Learning and Robust Statistics under Weakly Structured Information

Contributions to the Decision Theoretic Foundations of Machine Learning and Robust Statistics under Weakly Structured Information arXiv:2501.10195v1 Announce Type: new Abstract: This habilitation thesis is cumulative and, therefore, is collecting and connecting research that I (together with several co-authors) have conducted over the last few years. Thus, the absolute core of the work is formed…

January 20, 2025
Provably Safeguarding a Classifier from OOD and Adversarial Samples: an Extreme Value Theory Approach

Provably Safeguarding a Classifier from OOD and Adversarial Samples: an Extreme Value Theory Approach arXiv:2501.10202v1 Announce Type: new Abstract: This paper introduces a novel method, Sample-efficient Probabilistic Detection using Extreme Value Theory (SPADE), which transforms a classifier into an abstaining classifier, offering provable protection against out-of-distribution and adversarial samples. The approach is based on a…

January 20, 2025
Generative Models with ELBOs Converging to Entropy Sums

Generative Models with ELBOs Converging to Entropy Sums arXiv:2501.09022v1 Announce Type: new Abstract: The evidence lower bound (ELBO) is one of the most central objectives for probabilistic unsupervised learning. For the ELBOs of several generative models and model classes, we here prove convergence to entropy sums. As one result, we provide a list of generative…

January 17, 2025
Estimating shared subspace with AJIVE: the power and limitation of multiple data matrices

Estimating shared subspace with AJIVE: the power and limitation of multiple data matrices arXiv:2501.09336v1 Announce Type: new Abstract: Integrative data analysis often requires disentangling joint and individual variations across multiple datasets, a challenge commonly addressed by the Joint and Individual Variation Explained (JIVE) model. While numerous methods have been developed to estimate the shared subspace…

January 17, 2025
On the convergence of noisy Bayesian Optimization with Expected Improvement

On the convergence of noisy Bayesian Optimization with Expected Improvement arXiv:2501.09262v1 Announce Type: new Abstract: Expected improvement (EI) is one of the most widely-used acquisition functions in Bayesian optimization (BO). Despite its proven success in applications for decades, important open questions remain on the theoretical convergence behaviors and rates for EI. In this paper, we…

January 17, 2025
Predictions as Surrogates: Revisiting Surrogate Outcomes in the Age of AI

Predictions as Surrogates: Revisiting Surrogate Outcomes in the Age of AI arXiv:2501.09731v1 Announce Type: new Abstract: We establish a formal connection between the decades-old surrogate outcome model in biostatistics and economics and the emerging field of prediction-powered inference (PPI). The connection treats predictions from pre-trained models, prevalent in the age of AI, as cost-effective surrogates…

January 17, 2025
Gradient Descent Converges Linearly to Flatter Minima than Gradient Flow in Shallow Linear Networks

Gradient Descent Converges Linearly to Flatter Minima than Gradient Flow in Shallow Linear Networks arXiv:2501.09137v1 Announce Type: cross Abstract: We study the gradient descent (GD) dynamics of a depth-2 linear neural network with a single input and output. We show that GD converges at an explicit linear rate to a global minimum of the training…

January 17, 2025
A Constant Velocity Latent Dynamics Approach for Accelerating Simulation of Stiff Nonlinear Systems

A Constant Velocity Latent Dynamics Approach for Accelerating Simulation of Stiff Nonlinear Systems arXiv:2501.08423v1 Announce Type: new Abstract: Solving stiff ordinary differential equations (StODEs) requires sophisticated numerical solvers, which are often computationally expensive. In particular, StODEs often cannot be solved with traditional explicit time integration schemes and one must resort to costly implicit methods to…

January 16, 2025
Causal vs. Anticausal merging of predictors

Causal vs. Anticausal merging of predictors arXiv:2501.08426v1 Announce Type: cross Abstract: We study the differences arising from merging predictors in the causal and anticausal directions using the same data. In particular we study the asymmetries that arise in a simple model where we merge the predictors using one binary variable as target and two continuous…

January 16, 2025
A Theory of Optimistically Universal Online Learnability for General Concept Classes

A Theory of Optimistically Universal Online Learnability for General Concept Classes arXiv:2501.08551v1 Announce Type: new Abstract: We provide a full characterization of the concept classes that are optimistically universally online learnable with $\{0, 1\}$ labels. The notion of optimistically universal online learning was defined in [Hanneke, 2021] in order to understand learnability under minimal assumptions.…

January 16, 2025
Quantum Reservoir Computing and Risk Bounds

Quantum Reservoir Computing and Risk Bounds arXiv:2501.08640v1 Announce Type: cross Abstract: We propose a way to bound the generalisation errors of several classes of quantum reservoirs using the Rademacher complexity. We give specific, parameter-dependent bounds for two particular quantum reservoir classes. We analyse how the generalisation bounds scale with growing numbers of qubits. Applying our…

January 16, 2025
Diagonal Over-parameterization in Reproducing Kernel Hilbert Spaces as an Adaptive Feature Model: Generalization and Adaptivity

Diagonal Over-parameterization in Reproducing Kernel Hilbert Spaces as an Adaptive Feature Model: Generalization and Adaptivity arXiv:2501.08679v1 Announce Type: cross Abstract: This paper introduces a diagonal adaptive kernel model that dynamically learns kernel eigenvalues and output coefficients simultaneously during training. Unlike fixed-kernel methods tied to the neural tangent kernel theory, the diagonal adaptive kernel model adapts…

January 16, 2025
Concentration of Measure for Distributions Generated via Diffusion Models

Concentration of Measure for Distributions Generated via Diffusion Models arXiv:2501.07741v1 Announce Type: new Abstract: We show via a combination of mathematical arguments and empirical evidence that data distributions sampled from diffusion models satisfy a Concentration of Measure Property saying that any Lipschitz $1$-dimensional projection of a random vector is not too far from its mean…

January 15, 2025
On the use of Statistical Learning Theory for model selection in Structural Health Monitoring

On the use of Statistical Learning Theory for model selection in Structural Health Monitoring arXiv:2501.08050v1 Announce Type: new Abstract: Whenever data-based systems are employed in engineering applications, defining an optimal statistical representation is subject to the problem of model selection. This paper focusses on how well models can generalise in Structural Health Monitoring (SHM). Although…

January 15, 2025
On the Statistical Capacity of Deep Generative Models

On the Statistical Capacity of Deep Generative Models arXiv:2501.07763v1 Announce Type: new Abstract: Deep generative models are routinely used in generating samples from complex, high-dimensional distributions. Despite their apparent successes, their statistical properties are not well understood. A common assumption is that with enough training data and sufficiently large neural networks, deep generative model samples…

January 15, 2025
Globally Convergent Variational Inference

Globally Convergent Variational Inference arXiv:2501.08201v1 Announce Type: new Abstract: In variational inference (VI), an approximation of the posterior distribution is selected from a family of distributions through numerical optimization. With the most common variational objective function, known as the evidence lower bound (ELBO), only convergence to a local optimum can be guaranteed. In this work,…

January 15, 2025
Avoiding subtraction and division of stochastic signals using normalizing flows: NFdeconvolve

Avoiding subtraction and division of stochastic signals using normalizing flows: NFdeconvolve arXiv:2501.08288v1 Announce Type: new Abstract: Across the scientific realm, we find ourselves subtracting or dividing stochastic signals. For instance, consider a stochastic realization, $x$, generated from the addition or multiplication of two stochastic signals $a$ and $b$, namely $x=a+b$ or $x = ab$. For…

January 15, 2025
Counterfactually Fair Reinforcement Learning via Sequential Data Preprocessing

Counterfactually Fair Reinforcement Learning via Sequential Data Preprocessing arXiv:2501.06366v1 Announce Type: new Abstract: When applied in healthcare, reinforcement learning (RL) seeks to dynamically match the right interventions to subjects to maximize population benefit. However, the learned policy may disproportionately allocate efficacious actions to one subpopulation, creating or exacerbating disparities in other socioeconomically-disadvantaged subgroups. These biases…

January 14, 2025
Computational and Statistical Asymptotic Analysis of the JKO Scheme for Iterative Algorithms to update distributions

Computational and Statistical Asymptotic Analysis of the JKO Scheme for Iterative Algorithms to update distributions arXiv:2501.06408v1 Announce Type: new Abstract: The seminal paper of Jordan, Kinderlehrer, and Otto introduced what is now widely known as the JKO scheme, an iterative algorithmic framework for computing distributions. This scheme can be interpreted as a Wasserstein gradient flow…

January 14, 2025
Variable Selection Methods for Multivariate, Functional, and Complex Biomedical Data in the AI Age

Variable Selection Methods for Multivariate, Functional, and Complex Biomedical Data in the AI Age arXiv:2501.06868v1 Announce Type: new Abstract: Many problems within personalized medicine and digital health rely on the analysis of continuous-time functional biomarkers and other complex data structures emerging from high-resolution patient monitoring. In this context, this work proposes new optimization-based variable selection…

January 14, 2025
Dynamic Causal Structure Discovery and Causal Effect Estimation

Dynamic Causal Structure Discovery and Causal Effect Estimation arXiv:2501.06534v1 Announce Type: new Abstract: To represent the causal relationships between variables, a directed acyclic graph (DAG) is widely utilized in many areas, such as social sciences, epidemics, and genetics. Many causal structure learning approaches are developed to learn the hidden causal structure utilizing deep-learning approaches. However,…

January 14, 2025
Automatic Double Reinforcement Learning in Semiparametric Markov Decision Processes with Applications to Long-Term Causal Inference

Automatic Double Reinforcement Learning in Semiparametric Markov Decision Processes with Applications to Long-Term Causal Inference arXiv:2501.06926v1 Announce Type: new Abstract: Double reinforcement learning (DRL) enables statistically efficient inference on the value of a policy in a nonparametric Markov Decision Process (MDP) given trajectories generated by another policy. However, this approach necessarily requires stringent overlap between…

January 14, 2025
Covariate Dependent Mixture of Bayesian Networks

Covariate Dependent Mixture of Bayesian Networks arXiv:2501.05745v1 Announce Type: new Abstract: Learning the structure of Bayesian networks from data provides insights into underlying processes and the causal relationships that generate the data, but its usefulness depends on the homogeneity of the data population, a condition often violated in real-world applications. In such cases, using a…

January 13, 2025
Outlyingness Scores with Cluster Catch Digraphs

Outlyingness Scores with Cluster Catch Digraphs arXiv:2501.05530v1 Announce Type: new Abstract: This paper introduces two novel outlyingness scores (OSs) based on Cluster Catch Digraphs (CCDs): Outbound Outlyingness Score (OOS) and Inbound Outlyingness Score (IOS). These scores enhance the interpretability of outlier detection results. Both OSs employ graph-, density-, and distribution-based techniques, tailored to high-dimensional data…

January 13, 2025
Analog Bayesian neural networks are insensitive to the shape of the weight distribution

Analog Bayesian neural networks are insensitive to the shape of the weight distribution arXiv:2501.05564v1 Announce Type: cross Abstract: Recent work has demonstrated that Bayesian neural networks (BNNs) trained with mean field variational inference (MFVI) can be implemented in analog hardware, promising orders of magnitude energy savings compared to the standard digital implementations. However, while Gaussians…

January 13, 2025