Category: stat.ML

Robust Amortized Bayesian Inference with Self-Consistency Losses on Unlabeled Data

Robust Amortized Bayesian Inference with Self-Consistency Losses on Unlabeled Data arXiv:2501.13483v1 Announce Type: new Abstract: Neural amortized Bayesian inference (ABI) can solve probabilistic inverse problems orders of magnitude faster than classical methods. However, neural ABI is not yet sufficiently robust for widespread and safe applicability. In particular, when performing inference on observations outside of the…

January 24, 2025
LITE: Efficiently Estimating Gaussian Probability of Maximality

LITE: Efficiently Estimating Gaussian Probability of Maximality arXiv:2501.13535v1 Announce Type: new Abstract: We consider the problem of computing the probability of maximality (PoM) of a Gaussian random vector, i.e., the probability for each dimension to be maximal. This is a key challenge in applications ranging from Bayesian optimization to reinforcement learning, where the PoM not…

January 24, 2025
Learning under Commission and Omission Event Outliers

Learning under Commission and Omission Event Outliers arXiv:2501.13599v1 Announce Type: new Abstract: Event streams are an important data format in real life. The events are usually expected to follow some regular patterns over time. However, the patterns could be contaminated by unexpected absences or occurrences of events. In this paper, we adopt the temporal point…

January 24, 2025
Bayesian Model Parameter Learning in Linear Inverse Problems with Application in EEG Focal Source Imaging

Bayesian Model Parameter Learning in Linear Inverse Problems with Application in EEG Focal Source Imaging arXiv:2501.13109v1 Announce Type: cross Abstract: Inverse problems can be described as limited-data problems in which the signal of interest cannot be observed directly. A physics-based forward model that relates the signal with the observations is typically needed. Unfortunately, unknown model…

January 24, 2025
A dimensionality reduction technique based on the Gromov-Wasserstein distance

A dimensionality reduction technique based on the Gromov-Wasserstein distance arXiv:2501.13732v1 Announce Type: new Abstract: Analyzing relationships between objects is a pivotal problem within data science. In this context, dimensionality reduction (DR) techniques are employed to generate smaller and more manageable data representations. This paper proposes a new method for dimensionality reduction, based on optimal transportation…

January 24, 2025
Ultralow-dimensionality reduction for identifying critical transitions by spatial-temporal PCA

Ultralow-dimensionality reduction for identifying critical transitions by spatial-temporal PCA arXiv:2501.12582v1 Announce Type: new Abstract: Discovering dominant patterns and exploring dynamic behaviors, especially critical state transitions and tipping points, in high-dimensional time-series data are challenging tasks in the study of real-world complex systems, which demand interpretable data representations to facilitate comprehension of both spatial and temporal information…

January 23, 2025
Sequential Change Point Detection via Denoising Score Matching

Sequential Change Point Detection via Denoising Score Matching arXiv:2501.12667v1 Announce Type: new Abstract: Sequential change-point detection plays a critical role in numerous real-world applications, where timely identification of distributional shifts can greatly mitigate adverse outcomes. Classical methods commonly rely on parametric density assumptions of pre- and post-change distributions, limiting their effectiveness for high-dimensional, complex data…

January 23, 2025
On Generalization and Distributional Update for Mimicking Observations with Adequate Exploration

On Generalization and Distributional Update for Mimicking Observations with Adequate Exploration arXiv:2501.12785v1 Announce Type: new Abstract: This paper tackles the efficiency and stability issues in learning from observations (LfO). We commence by investigating how reward functions and policies generalize in LfO. Subsequently, the built-in reinforcement learning (RL) approach in generative adversarial imitation from observation (GAIfO)…

January 23, 2025
Singular leaning coefficients and efficiency in learning theory

Singular leaning coefficients and efficiency in learning theory arXiv:2501.12747v1 Announce Type: new Abstract: Singular learning models with non-positive Fisher information matrices include neural networks, reduced-rank regression, Boltzmann machines, normal mixture models, and others. These models have been widely used in the development of learning machines. However, theoretical analysis is still in its early stages. In…

January 23, 2025
Fixed-Budget Change Point Identification in Piecewise Constant Bandits

Fixed-Budget Change Point Identification in Piecewise Constant Bandits arXiv:2501.12957v1 Announce Type: new Abstract: We study the piecewise constant bandit problem where the expected reward is a piecewise constant function with one change point (discontinuity) across the action space $[0,1]$ and the learner’s aim is to locate the change point. Under the assumption of a fixed…

January 23, 2025
Extension of Symmetrized Neural Network Operators with Fractional and Mixed Activation Functions

Extension of Symmetrized Neural Network Operators with Fractional and Mixed Activation Functions arXiv:2501.10496v1 Announce Type: new Abstract: We propose a novel extension to symmetrized neural network operators by incorporating fractional and mixed activation functions. This study addresses the limitations of existing models in approximating higher-order smooth functions, particularly in complex and high-dimensional spaces. Our framework…

January 22, 2025
Simulation of Random LR Fuzzy Intervals

Simulation of Random LR Fuzzy Intervals arXiv:2501.10482v1 Announce Type: new Abstract: Random fuzzy variables join the modeling of the impreciseness (due to their “fuzzy part”) and randomness. Statistical samples of such objects are widely used, and their direct, numerically effective generation is therefore necessary. Usually, these samples consist of triangular or trapezoidal fuzzy numbers. In…

January 22, 2025
Multi-Output Conformal Regression: A Unified Comparative Study with New Conformity Scores

Multi-Output Conformal Regression: A Unified Comparative Study with New Conformity Scores arXiv:2501.10533v1 Announce Type: new Abstract: Quantifying uncertainty in multivariate regression is essential in many real-world applications, yet existing methods for constructing prediction regions often face limitations such as the inability to capture complex dependencies, lack of coverage guarantees, or high computational cost. Conformal prediction…

January 22, 2025
DPERC: Direct Parameter Estimation for Mixed Data

DPERC: Direct Parameter Estimation for Mixed Data arXiv:2501.10540v1 Announce Type: new Abstract: The covariance matrix is a foundation in numerous statistical and machine-learning applications such as Principal Component Analysis, Correlation Heatmap, etc. However, missing values within datasets present a formidable obstacle to accurately estimating this matrix. While imputation methods offer one avenue for addressing this…

January 22, 2025
Model-Robust and Adaptive-Optimal Transfer Learning for Tackling Concept Shifts in Nonparametric Regression

Model-Robust and Adaptive-Optimal Transfer Learning for Tackling Concept Shifts in Nonparametric Regression arXiv:2501.10870v1 Announce Type: new Abstract: When concept shifts and sample scarcity are present in the target domain of interest, nonparametric regression learners often struggle to generalize effectively. The technique of transfer learning remedies these issues by leveraging data or pre-trained models from similar…

January 22, 2025
SBAMDT: Bayesian Additive Decision Trees with Adaptive Soft Semi-multivariate Split Rules

SBAMDT: Bayesian Additive Decision Trees with Adaptive Soft Semi-multivariate Split Rules arXiv:2501.09900v1 Announce Type: new Abstract: Bayesian Additive Regression Trees [BART, Chipman et al., 2010] have gained significant popularity due to their remarkable predictive performance and ability to quantify uncertainty. However, standard decision tree models rely on recursive data splits at each decision node, using…

January 20, 2025
Tracking student skills real-time through a continuous-variable dynamic Bayesian network

Tracking student skills real-time through a continuous-variable dynamic Bayesian network arXiv:2501.10050v1 Announce Type: new Abstract: The field of Knowledge Tracing is focused on predicting the success rate of a student for a given skill. Modern methods like Deep Knowledge Tracing provide accurate estimates given enough data, but being based on neural networks they struggle to…

January 20, 2025
Statistical Inference for Sequential Feature Selection after Domain Adaptation

Statistical Inference for Sequential Feature Selection after Domain Adaptation arXiv:2501.09933v1 Announce Type: new Abstract: In high-dimensional regression, feature selection methods, such as sequential feature selection (SeqFS), are commonly used to identify relevant features. When data is limited, domain adaptation (DA) becomes crucial for transferring knowledge from a related source domain to a target domain, improving…

January 20, 2025
Contributions to the Decision Theoretic Foundations of Machine Learning and Robust Statistics under Weakly Structured Information

Contributions to the Decision Theoretic Foundations of Machine Learning and Robust Statistics under Weakly Structured Information arXiv:2501.10195v1 Announce Type: new Abstract: This habilitation thesis is cumulative and, therefore, is collecting and connecting research that I (together with several co-authors) have conducted over the last few years. Thus, the absolute core of the work is formed…

January 20, 2025
Provably Safeguarding a Classifier from OOD and Adversarial Samples: an Extreme Value Theory Approach

Provably Safeguarding a Classifier from OOD and Adversarial Samples: an Extreme Value Theory Approach arXiv:2501.10202v1 Announce Type: new Abstract: This paper introduces a novel method, Sample-efficient Probabilistic Detection using Extreme Value Theory (SPADE), which transforms a classifier into an abstaining classifier, offering provable protection against out-of-distribution and adversarial samples. The approach is based on a…

January 20, 2025
Generative Models with ELBOs Converging to Entropy Sums

Generative Models with ELBOs Converging to Entropy Sums arXiv:2501.09022v1 Announce Type: new Abstract: The evidence lower bound (ELBO) is one of the most central objectives for probabilistic unsupervised learning. For the ELBOs of several generative models and model classes, we here prove convergence to entropy sums. As one result, we provide a list of generative…

January 17, 2025
Estimating shared subspace with AJIVE: the power and limitation of multiple data matrices

Estimating shared subspace with AJIVE: the power and limitation of multiple data matrices arXiv:2501.09336v1 Announce Type: new Abstract: Integrative data analysis often requires disentangling joint and individual variations across multiple datasets, a challenge commonly addressed by the Joint and Individual Variation Explained (JIVE) model. While numerous methods have been developed to estimate the shared subspace…

January 17, 2025
On the convergence of noisy Bayesian Optimization with Expected Improvement

On the convergence of noisy Bayesian Optimization with Expected Improvement arXiv:2501.09262v1 Announce Type: new Abstract: Expected improvement (EI) is one of the most widely-used acquisition functions in Bayesian optimization (BO). Despite its proven success in applications for decades, important open questions remain on the theoretical convergence behaviors and rates for EI. In this paper, we…

January 17, 2025
Predictions as Surrogates: Revisiting Surrogate Outcomes in the Age of AI

Predictions as Surrogates: Revisiting Surrogate Outcomes in the Age of AI arXiv:2501.09731v1 Announce Type: new Abstract: We establish a formal connection between the decades-old surrogate outcome model in biostatistics and economics and the emerging field of prediction-powered inference (PPI). The connection treats predictions from pre-trained models, prevalent in the age of AI, as cost-effective surrogates…

January 17, 2025
Gradient Descent Converges Linearly to Flatter Minima than Gradient Flow in Shallow Linear Networks

Gradient Descent Converges Linearly to Flatter Minima than Gradient Flow in Shallow Linear Networks arXiv:2501.09137v1 Announce Type: cross Abstract: We study the gradient descent (GD) dynamics of a depth-2 linear neural network with a single input and output. We show that GD converges at an explicit linear rate to a global minimum of the training…

January 17, 2025
A Constant Velocity Latent Dynamics Approach for Accelerating Simulation of Stiff Nonlinear Systems

A Constant Velocity Latent Dynamics Approach for Accelerating Simulation of Stiff Nonlinear Systems arXiv:2501.08423v1 Announce Type: new Abstract: Solving stiff ordinary differential equations (StODEs) requires sophisticated numerical solvers, which are often computationally expensive. In particular, StODEs often cannot be solved with traditional explicit time integration schemes and one must resort to costly implicit methods to…

January 16, 2025
Causal vs. Anticausal merging of predictors

Causal vs. Anticausal merging of predictors arXiv:2501.08426v1 Announce Type: cross Abstract: We study the differences arising from merging predictors in the causal and anticausal directions using the same data. In particular we study the asymmetries that arise in a simple model where we merge the predictors using one binary variable as target and two continuous…

January 16, 2025
A Theory of Optimistically Universal Online Learnability for General Concept Classes

A Theory of Optimistically Universal Online Learnability for General Concept Classes arXiv:2501.08551v1 Announce Type: new Abstract: We provide a full characterization of the concept classes that are optimistically universally online learnable with $\{0, 1\}$ labels. The notion of optimistically universal online learning was defined in [Hanneke, 2021] in order to understand learnability under minimal assumptions.…

January 16, 2025
Quantum Reservoir Computing and Risk Bounds

Quantum Reservoir Computing and Risk Bounds arXiv:2501.08640v1 Announce Type: cross Abstract: We propose a way to bound the generalisation errors of several classes of quantum reservoirs using the Rademacher complexity. We give specific, parameter-dependent bounds for two particular quantum reservoir classes. We analyse how the generalisation bounds scale with growing numbers of qubits. Applying our…

January 16, 2025
Diagonal Over-parameterization in Reproducing Kernel Hilbert Spaces as an Adaptive Feature Model: Generalization and Adaptivity

Diagonal Over-parameterization in Reproducing Kernel Hilbert Spaces as an Adaptive Feature Model: Generalization and Adaptivity arXiv:2501.08679v1 Announce Type: cross Abstract: This paper introduces a diagonal adaptive kernel model that dynamically learns kernel eigenvalues and output coefficients simultaneously during training. Unlike fixed-kernel methods tied to the neural tangent kernel theory, the diagonal adaptive kernel model adapts…

January 16, 2025
Concentration of Measure for Distributions Generated via Diffusion Models

Concentration of Measure for Distributions Generated via Diffusion Models arXiv:2501.07741v1 Announce Type: new Abstract: We show via a combination of mathematical arguments and empirical evidence that data distributions sampled from diffusion models satisfy a Concentration of Measure Property saying that any Lipschitz $1$-dimensional projection of a random vector is not too far from its mean…

January 15, 2025
On the use of Statistical Learning Theory for model selection in Structural Health Monitoring

On the use of Statistical Learning Theory for model selection in Structural Health Monitoring arXiv:2501.08050v1 Announce Type: new Abstract: Whenever data-based systems are employed in engineering applications, defining an optimal statistical representation is subject to the problem of model selection. This paper focusses on how well models can generalise in Structural Health Monitoring (SHM). Although…

January 15, 2025
On the Statistical Capacity of Deep Generative Models

On the Statistical Capacity of Deep Generative Models arXiv:2501.07763v1 Announce Type: new Abstract: Deep generative models are routinely used in generating samples from complex, high-dimensional distributions. Despite their apparent successes, their statistical properties are not well understood. A common assumption is that with enough training data and sufficiently large neural networks, deep generative model samples…

January 15, 2025
Globally Convergent Variational Inference

Globally Convergent Variational Inference arXiv:2501.08201v1 Announce Type: new Abstract: In variational inference (VI), an approximation of the posterior distribution is selected from a family of distributions through numerical optimization. With the most common variational objective function, known as the evidence lower bound (ELBO), only convergence to a local optimum can be guaranteed. In this work,…

January 15, 2025
Avoiding subtraction and division of stochastic signals using normalizing flows: NFdeconvolve

Avoiding subtraction and division of stochastic signals using normalizing flows: NFdeconvolve arXiv:2501.08288v1 Announce Type: new Abstract: Across the scientific realm, we find ourselves subtracting or dividing stochastic signals. For instance, consider a stochastic realization, $x$, generated from the addition or multiplication of two stochastic signals $a$ and $b$, namely $x=a+b$ or $x = ab$. For…

January 15, 2025
Counterfactually Fair Reinforcement Learning via Sequential Data Preprocessing

Counterfactually Fair Reinforcement Learning via Sequential Data Preprocessing arXiv:2501.06366v1 Announce Type: new Abstract: When applied in healthcare, reinforcement learning (RL) seeks to dynamically match the right interventions to subjects to maximize population benefit. However, the learned policy may disproportionately allocate efficacious actions to one subpopulation, creating or exacerbating disparities in other socioeconomically-disadvantaged subgroups. These biases…

January 14, 2025
Computational and Statistical Asymptotic Analysis of the JKO Scheme for Iterative Algorithms to update distributions

Computational and Statistical Asymptotic Analysis of the JKO Scheme for Iterative Algorithms to update distributions arXiv:2501.06408v1 Announce Type: new Abstract: The seminal paper of Jordan, Kinderlehrer, and Otto introduced what is now widely known as the JKO scheme, an iterative algorithmic framework for computing distributions. This scheme can be interpreted as a Wasserstein gradient flow…

January 14, 2025
Variable Selection Methods for Multivariate, Functional, and Complex Biomedical Data in the AI Age

Variable Selection Methods for Multivariate, Functional, and Complex Biomedical Data in the AI Age arXiv:2501.06868v1 Announce Type: new Abstract: Many problems within personalized medicine and digital health rely on the analysis of continuous-time functional biomarkers and other complex data structures emerging from high-resolution patient monitoring. In this context, this work proposes new optimization-based variable selection…

January 14, 2025
Dynamic Causal Structure Discovery and Causal Effect Estimation

Dynamic Causal Structure Discovery and Causal Effect Estimation arXiv:2501.06534v1 Announce Type: new Abstract: To represent the causal relationships between variables, a directed acyclic graph (DAG) is widely utilized in many areas, such as social sciences, epidemics, and genetics. Many causal structure learning approaches are developed to learn the hidden causal structure utilizing deep-learning approaches. However,…

January 14, 2025
Automatic Double Reinforcement Learning in Semiparametric Markov Decision Processes with Applications to Long-Term Causal Inference

Automatic Double Reinforcement Learning in Semiparametric Markov Decision Processes with Applications to Long-Term Causal Inference arXiv:2501.06926v1 Announce Type: new Abstract: Double reinforcement learning (DRL) enables statistically efficient inference on the value of a policy in a nonparametric Markov Decision Process (MDP) given trajectories generated by another policy. However, this approach necessarily requires stringent overlap between…

January 14, 2025
Covariate Dependent Mixture of Bayesian Networks

Covariate Dependent Mixture of Bayesian Networks arXiv:2501.05745v1 Announce Type: new Abstract: Learning the structure of Bayesian networks from data provides insights into underlying processes and the causal relationships that generate the data, but its usefulness depends on the homogeneity of the data population, a condition often violated in real-world applications. In such cases, using a…

January 13, 2025
Outlyingness Scores with Cluster Catch Digraphs

Outlyingness Scores with Cluster Catch Digraphs arXiv:2501.05530v1 Announce Type: new Abstract: This paper introduces two novel outlyingness scores (OSs) based on Cluster Catch Digraphs (CCDs): Outbound Outlyingness Score (OOS) and Inbound Outlyingness Score (IOS). These scores enhance the interpretability of outlier detection results. Both OSs employ graph-, density-, and distribution-based techniques, tailored to high-dimensional data…

January 13, 2025
Analog Bayesian neural networks are insensitive to the shape of the weight distribution

Analog Bayesian neural networks are insensitive to the shape of the weight distribution arXiv:2501.05564v1 Announce Type: cross Abstract: Recent work has demonstrated that Bayesian neural networks (BNNs) trained with mean field variational inference (MFVI) can be implemented in analog hardware, promising orders of magnitude energy savings compared to the standard digital implementations. However, while Gaussians…

January 13, 2025
rmlnomogram: An R package to construct an explainable nomogram for any machine learning algorithms

rmlnomogram: An R package to construct an explainable nomogram for any machine learning algorithms arXiv:2501.05772v1 Announce Type: cross Abstract: Background: Currently, nomograms can only be created for regression algorithms. Providing nomograms for any machine learning (ML) algorithm may accelerate model deployment in clinical settings or improve model availability. We developed an R package and web…

January 13, 2025
Random Sparse Lifts: Construction, Analysis and Convergence of finite sparse networks

Random Sparse Lifts: Construction, Analysis and Convergence of finite sparse networks arXiv:2501.05930v1 Announce Type: cross Abstract: We present a framework to define a large class of neural networks for which, by construction, training by gradient flow provably reaches arbitrarily low loss when the number of parameters grows. Distinct from the fixed-space global optimality of non-convex…

January 13, 2025
Deep Transfer $Q$-Learning for Offline Non-Stationary Reinforcement Learning

Deep Transfer $Q$-Learning for Offline Non-Stationary Reinforcement Learning arXiv:2501.04870v1 Announce Type: new Abstract: In dynamic decision-making scenarios across business and healthcare, leveraging sample trajectories from diverse populations can significantly enhance reinforcement learning (RL) performance for specific target populations, especially when sample sizes are limited. While existing transfer learning methods primarily focus on linear regression settings,…

January 10, 2025
RieszBoost: Gradient Boosting for Riesz Regression

RieszBoost: Gradient Boosting for Riesz Regression arXiv:2501.04871v1 Announce Type: new Abstract: Answering causal questions often involves estimating linear functionals of conditional expectations, such as the average treatment effect or the effect of a longitudinal modified treatment policy. By the Riesz representation theorem, these functionals can be expressed as the expected product of the conditional expectation…

January 10, 2025
Towards understanding the bias in decision trees

Towards understanding the bias in decision trees arXiv:2501.04903v1 Announce Type: new Abstract: There is a widespread and longstanding belief that machine learning models are biased towards the majority (or negative) class when learning from imbalanced data, leading them to neglect or ignore the minority (or positive) class. In this study, we show that this belief…

January 10, 2025
Optimality and Adaptivity of Deep Neural Features for Instrumental Variable Regression

Optimality and Adaptivity of Deep Neural Features for Instrumental Variable Regression arXiv:2501.04898v1 Announce Type: new Abstract: We provide a convergence analysis of deep feature instrumental variable (DFIV) regression (Xu et al., 2021), a nonparametric approach to IV regression using data-adaptive features learned by deep neural networks in two stages. We prove that the DFIV algorithm…

January 10, 2025
Non-asymptotic analysis of the performance of the penalized least trimmed squares in sparse models

Non-asymptotic analysis of the performance of the penalized least trimmed squares in sparse models arXiv:2501.04946v1 Announce Type: new Abstract: The least trimmed squares (LTS) estimator is a renowned robust alternative to the classic least squares estimator and is popular in location, regression, machine learning, and AI literature. Many studies exist on LTS, including its robustness,…

January 10, 2025
Mixing Times and Privacy Analysis for the Projected Langevin Algorithm under a Modulus of Continuity

Mixing Times and Privacy Analysis for the Projected Langevin Algorithm under a Modulus of Continuity arXiv:2501.04134v1 Announce Type: new Abstract: We study the mixing time of the projected Langevin algorithm (LA) and the privacy curve of noisy Stochastic Gradient Descent (SGD), beyond nonexpansive iterations. Specifically, we derive new mixing time bounds for the projected LA…

January 9, 2025
Generation from Noisy Examples

Generation from Noisy Examples arXiv:2501.04179v1 Announce Type: new Abstract: We continue to study the learning-theoretic foundations of generation by extending the results from Kleinberg and Mullainathan [2024] and Li et al. [2024] to account for noisy example streams. In the noiseless setting of Kleinberg and Mullainathan [2024] and Li et al. [2024], an adversary picks…

January 9, 2025
Statistical Uncertainty Quantification for Aggregate Performance Metrics in Machine Learning Benchmarks

Statistical Uncertainty Quantification for Aggregate Performance Metrics in Machine Learning Benchmarks arXiv:2501.04234v1 Announce Type: new Abstract: Modern artificial intelligence is supported by machine learning models (e.g., foundation models) that are pretrained on a massive data corpus and then adapted to solve a variety of downstream tasks. To summarize performance across multiple tasks, evaluation metrics are…

January 9, 2025
Circuit Complexity Bounds for Visual Autoregressive Model

Circuit Complexity Bounds for Visual Autoregressive Model arXiv:2501.04299v1 Announce Type: new Abstract: Understanding the expressive ability of a specific model is essential for grasping its capacity limitations. Recently, several studies have established circuit complexity bounds for the Transformer architecture. In addition, the Visual AutoRegressive (VAR) model has risen to be a prominent method in the field of…

January 9, 2025
On weight and variance uncertainty in neural networks for regression tasks

On weight and variance uncertainty in neural networks for regression tasks arXiv:2501.04272v1 Announce Type: new Abstract: We consider the problem of weight uncertainty proposed by [Blundell et al. (2015). Weight uncertainty in neural network. In International conference on machine learning, 1613-1622, PMLR.] in neural networks (NNs) specialized for regression tasks. We further investigate the effect…

January 9, 2025
Class-Balance Bias in Regularized Regression

Class-Balance Bias in Regularized Regression arXiv:2501.03821v1 Announce Type: new Abstract: Regularized models are often sensitive to the scales of the features in the data and it has therefore become standard practice to normalize (center and scale) the features before fitting the model. But there are many different ways to normalize the features and the choice…

January 8, 2025
Structure-Preference Enabled Graph Embedding Generation under Differential Privacy

Structure-Preference Enabled Graph Embedding Generation under Differential Privacy arXiv:2501.03451v1 Announce Type: new Abstract: Graph embedding generation techniques aim to learn low-dimensional vectors for each node in a graph and have recently gained increasing research attention. Publishing low-dimensional node vectors enables various graph analysis tasks, such as structural equivalence and link prediction. Yet, improper publication opens…

January 8, 2025
Coupled Hierarchical Structure Learning using Tree-Wasserstein Distance

Coupled Hierarchical Structure Learning using Tree-Wasserstein Distance arXiv:2501.03627v1 Announce Type: cross Abstract: In many applications, both data samples and features have underlying hierarchical structures. However, existing methods for learning these latent structures typically focus on either samples or features, ignoring possible coupling between them. In this paper, we introduce a coupled hierarchical structure learning method…

January 8, 2025
Deep Networks are Reproducing Kernel Chains

Deep Networks are Reproducing Kernel Chains arXiv:2501.03697v1 Announce Type: cross Abstract: Identifying an appropriate function space for deep neural networks remains a key open question. While shallow neural networks are naturally associated with Reproducing Kernel Banach Spaces (RKBS), deep networks present unique challenges. In this work, we extend RKBS to chain RKBS (cRKBS), a new…

January 8, 2025
Symmetry and Generalisation in Machine Learning

Symmetry and Generalisation in Machine Learning arXiv:2501.03858v1 Announce Type: cross Abstract: This work is about understanding the impact of invariance and equivariance on generalisation in supervised learning. We use the perspective afforded by an averaging operator to show that for any predictor that is not equivariant, there is an equivariant predictor with strictly lower test…

January 8, 2025
Modeling COVID-19 spread in the USA using metapopulation SIR models coupled with graph convolutional neural networks

Modeling COVID-19 spread in the USA using metapopulation SIR models coupled with graph convolutional neural networks arXiv:2501.02043v1 Announce Type: new Abstract: Graph convolutional neural networks (GCNs) have shown tremendous promise in addressing data-intensive challenges in recent years. In particular, some attempts have been made to improve predictions of Susceptible-Infected-Recovered (SIR) models by incorporating human mobility…

January 7, 2025
Majorization-Minimization Dual Stagewise Algorithm for Generalized Lasso

Majorization-Minimization Dual Stagewise Algorithm for Generalized Lasso arXiv:2501.02197v1 Announce Type: new Abstract: The generalized lasso is a natural generalization of the celebrated lasso approach to handle structural regularization problems. Many important methods and applications fall into this framework, including fused lasso, clustered lasso, and constrained lasso. To elevate its effectiveness in large-scale problems, extensive research…

January 7, 2025
Beyond Log-Concavity and Score Regularity: Improved Convergence Bounds for Score-Based Generative Models in W2-distance

Beyond Log-Concavity and Score Regularity: Improved Convergence Bounds for Score-Based Generative Models in W2-distance arXiv:2501.02298v1 Announce Type: new Abstract: Score-based Generative Models (SGMs) aim to sample from a target distribution by learning score functions using samples perturbed by Gaussian noise. Existing convergence bounds for SGMs in the $\mathcal{W}_2$-distance rely on stringent assumptions about the data…

January 7, 2025
Robust Multi-Dimensional Scaling via Accelerated Alternating Projections

Robust Multi-Dimensional Scaling via Accelerated Alternating Projections arXiv:2501.02208v1 Announce Type: new Abstract: We consider the robust multi-dimensional scaling (RMDS) problem in this paper. The goal is to localize point locations from pairwise distances that may be corrupted by outliers. Inspired by classic MDS theories, and nonconvex works for the robust principal component analysis (RPCA) problem,…

January 7, 2025
Who Wrote This? Zero-Shot Statistical Tests for LLM-Generated Text Detection using Finite Sample Concentration Inequalities

Who Wrote This? Zero-Shot Statistical Tests for LLM-Generated Text Detection using Finite Sample Concentration Inequalities arXiv:2501.02406v1 Announce Type: new Abstract: Verifying the provenance of content is crucial to the function of many organizations, e.g., educational institutions, social media platforms, firms, etc. This problem is becoming increasingly difficult as text generated by Large Language Models (LLMs)…

January 7, 2025
Guaranteed Nonconvex Low-Rank Tensor Estimation via Scaled Gradient Descent

Guaranteed Nonconvex Low-Rank Tensor Estimation via Scaled Gradient Descent arXiv:2501.01696v1 Announce Type: new Abstract: Tensors, which give a faithful and effective representation to deliver the intrinsic structure of multi-dimensional data, play a crucial role in an increasing number of signal processing and machine learning problems. However, tensor data are often accompanied by arbitrary signal corruptions,…

January 6, 2025
Signal Recovery Using a Spiked Mixture Model

Signal Recovery Using a Spiked Mixture Model arXiv:2501.01840v1 Announce Type: new Abstract: We introduce the spiked mixture model (SMM) to address the problem of estimating a set of signals from many randomly scaled and noisy observations. Subsequently, we design a novel expectation-maximization (EM) algorithm to recover all parameters of the SMM. Numerical experiments show that…

January 6, 2025
Unified Native Spaces in Kernel Methods

Unified Native Spaces in Kernel Methods arXiv:2501.01825v1 Announce Type: new Abstract: There exists a plethora of parametric models for positive definite kernels, and their use is ubiquitous in disciplines as diverse as statistics, machine learning, numerical analysis, and approximation theory. Usually, the kernel parameters index certain features of an associated process. Amongst those features, smoothness…

January 6, 2025
Transfer Neyman-Pearson Algorithm for Outlier Detection

Transfer Neyman-Pearson Algorithm for Outlier Detection arXiv:2501.01525v1 Announce Type: cross Abstract: We consider the problem of transfer learning in outlier detection where target abnormal data is rare. While transfer learning has been considered extensively in traditional balanced classification, the problem of transfer in outlier detection and more generally in imbalanced classification settings has received less…

January 6, 2025
Many of Your DPOs are Secretly One: Attempting Unification Through Mutual Information

Many of Your DPOs are Secretly One: Attempting Unification Through Mutual Information arXiv:2501.01544v1 Announce Type: cross Abstract: Post-alignment of large language models (LLMs) is critical in improving their utility, safety, and alignment with human intentions. Direct preference optimisation (DPO) has become one of the most widely used algorithms for achieving this alignment, given its ability…

January 6, 2025
Post Launch Evaluation of Policies in a High-Dimensional Setting

Post Launch Evaluation of Policies in a High-Dimensional Setting arXiv:2501.00119v1 Announce Type: new Abstract: A/B tests, also known as randomized controlled experiments (RCTs), are the gold standard for evaluating the impact of new policies, products, or decisions. However, these tests can be costly in terms of time and resources, potentially exposing users, customers, or other…

January 3, 2025
Efficient Human-in-the-Loop Active Learning: A Novel Framework for Data Labeling in AI Systems

Efficient Human-in-the-Loop Active Learning: A Novel Framework for Data Labeling in AI Systems arXiv:2501.00277v1 Announce Type: new Abstract: Modern AI algorithms require labeled data. In the real world, the majority of data are unlabeled. Labeling the data is costly. This is particularly true for some areas requiring special skills, such as reading radiology images by physicians. To…

January 3, 2025
Different thresholding methods on Nearest Shrunken Centroid algorithm

Different thresholding methods on Nearest Shrunken Centroid algorithm arXiv:2501.00632v1 Announce Type: new Abstract: This article considers the impact of different thresholding methods on the Nearest Shrunken Centroid algorithm, which is popularly referred to as the Prediction Analysis of Microarrays (PAM) for high-dimensional classification. PAM uses soft thresholding to achieve high computational efficiency and high classification accuracy…

January 3, 2025
A Distributional Evaluation of Generative Image Models

A Distributional Evaluation of Generative Image Models arXiv:2501.00744v1 Announce Type: new Abstract: Generative models are ubiquitous in modern artificial intelligence (AI) applications. Recent advances have led to a variety of generative modeling approaches that are capable of synthesizing highly realistic samples. Despite these developments, evaluating the distributional match between the synthetic samples and the target…

January 3, 2025
Ensuring superior learning outcomes and data security for authorized learner

Ensuring superior learning outcomes and data security for authorized learner arXiv:2501.00754v1 Announce Type: new Abstract: The learner’s ability to generate a hypothesis that closely approximates the target function is crucial in machine learning. Achieving this requires sufficient data; however, unauthorized access by an eavesdropping learner can lead to security risks. Thus, it is important to…

January 3, 2025
Surrogate Modeling for Explainable Predictive Time Series Corrections

Surrogate Modeling for Explainable Predictive Time Series Corrections arXiv:2412.19897v1 Announce Type: new Abstract: We introduce a local surrogate approach for explainable time-series forecasting. An initially non-interpretable predictive model to improve the forecast of a classical time-series ‘base model’ is used. ‘Explainability’ of the correction is provided by fitting the base model again to the data…

December 31, 2024
Deep Generalized Schrödinger Bridges: From Image Generation to Solving Mean-Field Games

Deep Generalized Schrödinger Bridges: From Image Generation to Solving Mean-Field Games arXiv:2412.20279v1 Announce Type: new Abstract: Generalized Schrödinger Bridges (GSBs) are a fundamental mathematical framework used to analyze the most likely particle evolution based on the principle of least action including kinetic and potential energy. In parallel to their well-established presence in the theoretical realms…

December 31, 2024
Confidence Interval Construction and Conditional Variance Estimation with Dense ReLU Networks

Confidence Interval Construction and Conditional Variance Estimation with Dense ReLU Networks arXiv:2412.20355v1 Announce Type: new Abstract: This paper addresses the problems of conditional variance estimation and confidence interval construction in nonparametric regression using dense networks with the Rectified Linear Unit (ReLU) activation function. We present a residual-based framework for conditional variance estimation, deriving nonasymptotic bounds…

December 31, 2024
Distributionally Robust Optimization via Iterative Algorithms in Continuous Probability Spaces

Distributionally Robust Optimization via Iterative Algorithms in Continuous Probability Spaces arXiv:2412.20556v1 Announce Type: new Abstract: We consider a minimax problem motivated by distributionally robust optimization (DRO) when the worst-case distribution is continuous, leading to significant computational challenges due to the infinite-dimensional nature of the optimization problem. Recent research has explored learning the worst-case distribution using…

December 31, 2024
Testing and Improving the Robustness of Amortized Bayesian Inference for Cognitive Models

Testing and Improving the Robustness of Amortized Bayesian Inference for Cognitive Models arXiv:2412.20586v1 Announce Type: new Abstract: Contaminant observations and outliers often cause problems when estimating the parameters of cognitive models, which are statistical models representing cognitive processes. In this study, we test and improve the robustness of parameter estimation using amortized Bayesian inference (ABI)…

December 31, 2024
Neural Networks Perform Sufficient Dimension Reduction

Neural Networks Perform Sufficient Dimension Reduction arXiv:2412.19033v1 Announce Type: new Abstract: This paper investigates the connection between neural networks and sufficient dimension reduction (SDR), demonstrating that neural networks inherently perform SDR in regression tasks under appropriate rank regularizations. Specifically, the weights in the first layer span the central mean subspace. We establish the statistical consistency…

December 30, 2024
Adaptive Conformal Inference by Betting

Adaptive Conformal Inference by Betting arXiv:2412.19318v1 Announce Type: new Abstract: Conformal prediction is a valuable tool for quantifying predictive uncertainty of machine learning models. However, its applicability relies on the assumption of data exchangeability, a condition which is often not met in real-world scenarios. In this paper, we consider the problem of adaptive conformal inference…

December 30, 2024
Localized exploration in contextual dynamic pricing achieves dimension-free regret

Localized exploration in contextual dynamic pricing achieves dimension-free regret arXiv:2412.19252v1 Announce Type: new Abstract: We study the problem of contextual dynamic pricing with a linear demand model. We propose a novel localized exploration-then-commit (LetC) algorithm which starts with a pure exploration stage, followed by a refinement stage that explores near the learned optimal pricing policy,…

December 30, 2024
Asymptotically Optimal Search for a Change Point Anomaly under a Composite Hypothesis Model

Asymptotically Optimal Search for a Change Point Anomaly under a Composite Hypothesis Model arXiv:2412.19392v1 Announce Type: new Abstract: We address the problem of searching for a change point in an anomalous process among a finite set of M processes. Specifically, we address a composite hypothesis model in which each process generates measurements following a common…

December 30, 2024
Low-Rank Contextual Reinforcement Learning from Heterogeneous Human Feedback

Low-Rank Contextual Reinforcement Learning from Heterogeneous Human Feedback arXiv:2412.19436v1 Announce Type: new Abstract: Reinforcement learning from human feedback (RLHF) has become a cornerstone for aligning large language models with human preferences. However, the heterogeneity of human feedback, driven by diverse individual contexts and preferences, poses significant challenges for reward learning. To address this, we propose…

December 30, 2024
Data-Driven Priors in the Maximum Entropy on the Mean Method for Linear Inverse Problems

Data-Driven Priors in the Maximum Entropy on the Mean Method for Linear Inverse Problems arXiv:2412.17916v1 Announce Type: new Abstract: We establish the theoretical framework for implementing the maximum entropy on the mean (MEM) method for linear inverse problems in the setting of approximate (data-driven) priors. We prove a.s. convergence for empirical means and further develop…

December 25, 2024
An information theoretic limit to data amplification

An information theoretic limit to data amplification arXiv:2412.18041v1 Announce Type: new Abstract: In recent years generative artificial intelligence has been used to create data to support science analysis. For example, Generative Adversarial Networks (GANs) have been trained using Monte Carlo simulated input and then used to generate data for the same problem. This has the…

December 25, 2024
Fréchet regression for multi-label feature selection with implicit regularization

Fréchet regression for multi-label feature selection with implicit regularization arXiv:2412.18247v1 Announce Type: new Abstract: Fréchet regression extends linear regression to model complex responses in metric spaces, making it particularly relevant for multi-label regression, where each instance can have multiple associated labels. However, variable selection within this framework remains underexplored. In this paper, we propose…

December 25, 2024
Heterogeneous transfer learning for high dimensional regression with feature mismatch

Heterogeneous transfer learning for high dimensional regression with feature mismatch arXiv:2412.18081v1 Announce Type: new Abstract: We consider the problem of transferring knowledge from a source, or proxy, domain to a new target domain for learning a high-dimensional regression model with possibly different features. Recently, the statistical properties of homogeneous transfer learning have been investigated. However,…

December 25, 2024
A Statistical Framework for Ranking LLM-Based Chatbots

A Statistical Framework for Ranking LLM-Based Chatbots arXiv:2412.18407v1 Announce Type: new Abstract: Large language models (LLMs) have transformed natural language processing, with frameworks like Chatbot Arena providing pioneering platforms for evaluating these models. By facilitating millions of pairwise comparisons based on human judgments, Chatbot Arena has become a cornerstone in LLM evaluation, offering rich datasets…

December 25, 2024
Robust random graph matching in dense graphs via vector approximate message passing

Robust random graph matching in dense graphs via vector approximate message passing arXiv:2412.16457v1 Announce Type: new Abstract: In this paper, we focus on the matching recovery problem between a pair of correlated Gaussian Wigner matrices with a latent vertex correspondence. We are particularly interested in a robust version of this problem such that our observation…

December 24, 2024
Fast Multi-Group Gaussian Process Factor Models

Fast Multi-Group Gaussian Process Factor Models arXiv:2412.16773v1 Announce Type: new Abstract: Gaussian processes are now commonly used in dimensionality reduction approaches tailored to neuroscience, especially to describe changes in high-dimensional neural activity over time. As recording capabilities expand to include neuronal populations across multiple brain areas, cortical layers, and cell types, interest in extending Gaussian…

December 24, 2024
Gradient-Based Non-Linear Inverse Learning

Gradient-Based Non-Linear Inverse Learning arXiv:2412.16794v1 Announce Type: new Abstract: We study statistical inverse learning in the context of nonlinear inverse problems under random design. Specifically, we address a class of nonlinear problems by employing gradient descent (GD) and stochastic gradient descent (SGD) with mini-batching, both using constant step sizes. Our analysis derives convergence rates for…

December 24, 2024
Integrating Random Effects in Variational Autoencoders for Dimensionality Reduction of Correlated Data

Integrating Random Effects in Variational Autoencoders for Dimensionality Reduction of Correlated Data arXiv:2412.16899v1 Announce Type: new Abstract: Variational Autoencoders (VAE) are widely used for dimensionality reduction of large-scale tabular and image datasets, under the assumption of independence between data observations. In practice, however, datasets are often correlated, with typical sources of correlation including spatial, temporal…

December 24, 2024
Learning from Summarized Data: Gaussian Process Regression with Sample Quasi-Likelihood

Learning from Summarized Data: Gaussian Process Regression with Sample Quasi-Likelihood arXiv:2412.17455v1 Announce Type: new Abstract: Gaussian process regression is a powerful Bayesian nonlinear regression method. Recent research has enabled the capture of many types of observations using non-Gaussian likelihoods. To deal with various tasks in spatial modeling, we benefit from this development. Difficulties still arise…

December 24, 2024
Enhancing Masked Time-Series Modeling via Dropping Patches

Enhancing Masked Time-Series Modeling via Dropping Patches arXiv:2412.15315v1 Announce Type: new Abstract: This paper explores how to enhance existing masked time-series modeling by randomly dropping sub-sequence level patches of time series. On this basis, a simple yet effective method named DropPatch is proposed, which has two remarkable advantages: 1) It improves the pre-training efficiency by…

December 23, 2024
Deep learning joint extremes of metocean variables using the SPAR model

Deep learning joint extremes of metocean variables using the SPAR model arXiv:2412.15808v1 Announce Type: new Abstract: This paper presents a novel deep learning framework for estimating multivariate joint extremes of metocean variables, based on the Semi-Parametric Angular-Radial (SPAR) model. When considered in polar coordinates, the problem of modelling multivariate extremes is transformed to one of…

December 23, 2024
Using matrix-product states for time-series machine learning

Using matrix-product states for time-series machine learning arXiv:2412.15826v1 Announce Type: new Abstract: Matrix-product states (MPS) have proven to be a versatile ansatz for modeling quantum many-body physics. For many applications, and particularly in one-dimension, they capture relevant quantum correlations in many-body wavefunctions while remaining tractable to store and manipulate on a classical computer. This has…

December 23, 2024
On Robust Cross Domain Alignment

On Robust Cross Domain Alignment arXiv:2412.15861v1 Announce Type: new Abstract: The Gromov-Wasserstein (GW) distance is an effective measure of alignment between distributions supported on distinct ambient spaces. Calculating essentially the mutual departure from isometry, it has found vast usage in domain translation and network analysis. It has long been shown to be vulnerable to contamination…

December 23, 2024
Learning sparsity-promoting regularizers for linear inverse problems

Learning sparsity-promoting regularizers for linear inverse problems arXiv:2412.16031v1 Announce Type: new Abstract: This paper introduces a novel approach to learning sparsity-promoting regularizers for solving linear inverse problems. We develop a bilevel optimization framework to select an optimal synthesis operator, denoted as $B$, which regularizes the inverse problem while promoting sparsity in the solution. The method…

December 23, 2024