Category: stat.ML

  • Statistical Undersampling with Mutual Information and Support Points

    Statistical Undersampling with Mutual Information and Support Points arXiv:2412.14527v1 Announce Type: new Abstract: Class imbalance and distributional differences in large datasets present significant challenges for classification tasks in machine learning, often leading to biased models and poor predictive performance for minority classes. This work introduces two novel undersampling approaches: mutual information-based stratified simple random sampling and…

  • On the Robustness of Spectral Algorithms for Semirandom Stochastic Block Models

    On the Robustness of Spectral Algorithms for Semirandom Stochastic Block Models arXiv:2412.14315v1 Announce Type: new Abstract: In a graph bisection problem, we are given a graph $G$ with two equally-sized unlabeled communities, and the goal is to recover the vertices in these communities. A popular heuristic, known as spectral clustering, is to output an estimated…

  • From Point to probabilistic gradient boosting for claim frequency and severity prediction

    From Point to probabilistic gradient boosting for claim frequency and severity prediction arXiv:2412.14916v1 Announce Type: new Abstract: Gradient boosting for decision tree algorithms are increasingly used in actuarial applications as they show superior predictive performance over traditional generalized linear models. Many improvements and sophistications to the first gradient boosting machine algorithm exist. We present in…

  • FedSTaS: Client Stratification and Client Level Sampling for Efficient Federated Learning

    FedSTaS: Client Stratification and Client Level Sampling for Efficient Federated Learning arXiv:2412.14226v1 Announce Type: cross Abstract: Federated learning (FL) is a machine learning methodology that involves the collaborative training of a global model across multiple decentralized clients in a privacy-preserving way. Several FL methods are introduced to tackle communication inefficiencies but do not address how…

  • Projected gradient methods for nonconvex and stochastic optimization: new complexities and auto-conditioned stepsizes

    Projected gradient methods for nonconvex and stochastic optimization: new complexities and auto-conditioned stepsizes arXiv:2412.14291v1 Announce Type: cross Abstract: We present a novel class of projected gradient (PG) methods for minimizing a smooth but not necessarily convex function over a convex compact set. We first provide a novel analysis of the “vanilla” PG method, achieving the…

  • Time-Reversible Bridges of Data with Machine Learning

    Time-Reversible Bridges of Data with Machine Learning arXiv:2412.13665v1 Announce Type: new Abstract: The analysis of dynamical systems is a fundamental tool in the natural sciences and engineering. It is used to understand the evolution of systems as large as entire galaxies and as small as individual molecules. With predefined conditions on the evolution of dynamical…

  • jinns: a JAX Library for Physics-Informed Neural Networks

    jinns: a JAX Library for Physics-Informed Neural Networks arXiv:2412.14132v1 Announce Type: new Abstract: jinns is an open-source Python library for physics-informed neural networks, built to tackle both forward and inverse problems, as well as meta-model learning. Rooted in the JAX ecosystem, it provides a versatile framework for efficiently prototyping real-problems, while easily allowing extensions to…

  • Preconditioned Subspace Langevin Monte Carlo

    Preconditioned Subspace Langevin Monte Carlo arXiv:2412.13928v1 Announce Type: new Abstract: We develop a new efficient method for high-dimensional sampling called Subspace Langevin Monte Carlo. The primary application of these methods is to efficiently implement Preconditioned Langevin Monte Carlo. To demonstrate the usefulness of this new method, we extend ideas from subspace descent methods in Euclidean…

  • Adaptive Nonparametric Perturbations of Parametric Bayesian Models

    Adaptive Nonparametric Perturbations of Parametric Bayesian Models arXiv:2412.10683v2 Announce Type: cross Abstract: Parametric Bayesian modeling offers a powerful and flexible toolbox for scientific data analysis. Yet the model, however detailed, may still be wrong, and this can make inferences untrustworthy. In this paper we study nonparametrically perturbed parametric (NPP) Bayesian models, in which a parametric…

  • Deep Learning for Hydroelectric Optimization: Generating Long-Term River Discharge Scenarios with Ensemble Forecasts from Global Circulation Models

    Deep Learning for Hydroelectric Optimization: Generating Long-Term River Discharge Scenarios with Ensemble Forecasts from Global Circulation Models arXiv:2412.12234v1 Announce Type: cross Abstract: Hydroelectric power generation is a critical component of the global energy matrix, particularly in countries like Brazil, where it represents the majority of the energy supply. However, its strong dependence on river discharges,…

  • How to Choose a Threshold for an Evaluation Metric for Large Language Models

    How to Choose a Threshold for an Evaluation Metric for Large Language Models arXiv:2412.12148v1 Announce Type: new Abstract: To ensure and monitor large language models (LLMs) reliably, various evaluation metrics have been proposed in the literature. However, there is little research on prescribing a methodology to identify a robust threshold on these metrics even though…

  • Adversarially robust generalization theory via Jacobian regularization for deep neural networks

    Adversarially robust generalization theory via Jacobian regularization for deep neural networks arXiv:2412.12449v1 Announce Type: new Abstract: Powerful deep neural networks are vulnerable to adversarial attacks. To obtain adversarially robust models, researchers have separately developed adversarial training and Jacobian regularization techniques. There are abundant theoretical and empirical studies for adversarial training, but theoretical foundations for Jacobian…

  • Sequential Harmful Shift Detection Without Labels

    Sequential Harmful Shift Detection Without Labels arXiv:2412.12910v1 Announce Type: new Abstract: We introduce a novel approach for detecting distribution shifts that negatively impact the performance of machine learning models in continuous production environments, which requires no access to ground truth data labels. It builds upon the work of Podkopaev and Ramdas [2022], who address scenarios…

  • BOIDS: High-dimensional Bayesian Optimization via Incumbent-guided Direction Lines and Subspace Embeddings

    BOIDS: High-dimensional Bayesian Optimization via Incumbent-guided Direction Lines and Subspace Embeddings arXiv:2412.12918v1 Announce Type: new Abstract: When it comes to expensive black-box optimization problems, Bayesian Optimization (BO) is a well-known and powerful solution. Many real-world applications involve a large number of dimensions, hence scaling BO to high dimension is of much interest. However, state-of-the-art high-dimensional…

  • On Model Extrapolation in Marginal Shapley Values

    On Model Extrapolation in Marginal Shapley Values arXiv:2412.13158v1 Announce Type: new Abstract: As the use of complex machine learning models continues to grow, so does the need for reliable explainability methods. One of the most popular methods for model explainability is based on Shapley values. There are two most commonly used approaches to calculating Shapley…

  • Generative Modeling with Diffusion

    Generative Modeling with Diffusion arXiv:2412.10948v1 Announce Type: new Abstract: We introduce the diffusion model as a method to generate new samples. Generative models have been recently adopted for tasks such as art generation (Stable Diffusion, Dall-E) and text generation (ChatGPT). Diffusion models in particular apply noise to sample data and then “reverse” this noising process…

  • Representation learning of dynamic networks

    Representation learning of dynamic networks arXiv:2412.11065v1 Announce Type: new Abstract: This study presents a novel representation learning model tailored for dynamic networks, which describes the continuously evolving relationships among individuals within a population. The problem is encapsulated in the dimension reduction topic of functional data analysis. With dynamic networks represented as matrix-valued functions, our objective…

  • Deep Learning-based Approaches for State Space Models: A Selective Review

    Deep Learning-based Approaches for State Space Models: A Selective Review arXiv:2412.11211v1 Announce Type: new Abstract: State-space models (SSMs) offer a powerful framework for dynamical system analysis, wherein the temporal dynamics of the system are assumed to be captured through the evolution of the latent states, which govern the values of the observations. This paper provides…

  • datadriftR: An R Package for Concept Drift Detection in Predictive Models

    datadriftR: An R Package for Concept Drift Detection in Predictive Models arXiv:2412.11308v1 Announce Type: new Abstract: Predictive models often face performance degradation due to evolving data distributions, a phenomenon known as data drift. Among its forms, concept drift, where the relationship between explanatory variables and the response variable changes, is particularly challenging to detect and…

  • Prediction-Enhanced Monte Carlo: A Machine Learning View on Control Variate

    Prediction-Enhanced Monte Carlo: A Machine Learning View on Control Variate arXiv:2412.11257v1 Announce Type: new Abstract: Despite being an essential tool across engineering and finance, Monte Carlo simulation can be computationally intensive, especially in large-scale, path-dependent problems that hinder straightforward parallelization. A natural alternative is to replace simulation with machine learning or surrogate prediction, though this…

  • Langevin Monte Carlo Beyond Lipschitz Gradient Continuity

    Langevin Monte Carlo Beyond Lipschitz Gradient Continuity arXiv:2412.09698v1 Announce Type: new Abstract: We present a significant advancement in the field of Langevin Monte Carlo (LMC) methods by introducing the Inexact Proximal Langevin Algorithm (IPLA). This novel algorithm broadens the scope of problems that LMC can effectively address while maintaining controlled computational costs. IPLA extends LMC’s…

  • Investigating the Impact of Balancing, Filtering, and Complexity on Predictive Multiplicity: A Data-Centric Perspective

    Investigating the Impact of Balancing, Filtering, and Complexity on Predictive Multiplicity: A Data-Centric Perspective arXiv:2412.09712v1 Announce Type: new Abstract: The Rashomon effect presents a significant challenge in model selection. It occurs when multiple models achieve similar performance on a dataset but produce different predictions, resulting in predictive multiplicity. This is especially problematic in high-stakes environments,…

  • A Statistical Analysis for Supervised Deep Learning with Exponential Families for Intrinsically Low-dimensional Data

    A Statistical Analysis for Supervised Deep Learning with Exponential Families for Intrinsically Low-dimensional Data arXiv:2412.09779v1 Announce Type: new Abstract: Recent advances have revealed that the rate of convergence of the expected test error in deep supervised learning decays as a function of the intrinsic dimension and not the dimension $d$ of the input space. Existing…

  • DQA: An Efficient Method for Deep Quantization of Deep Neural Network Activations

    DQA: An Efficient Method for Deep Quantization of Deep Neural Network Activations arXiv:2412.09687v1 Announce Type: cross Abstract: Quantization of Deep Neural Network (DNN) activations is a commonly used technique to reduce compute and memory demands during DNN inference, which can be particularly beneficial on resource-constrained devices. To achieve high accuracy, existing methods for quantizing activations…

  • Matrix Completion via Residual Spectral Matching

    Matrix Completion via Residual Spectral Matching arXiv:2412.10005v1 Announce Type: new Abstract: Noisy matrix completion has attracted significant attention due to its applications in recommendation systems, signal processing and image restoration. Most existing works rely on (weighted) least squares methods under various low-rank constraints. However, minimizing the sum of squared residuals is not always efficient, as…

  • GeoConformal prediction: a model-agnostic framework of measuring the uncertainty of spatial prediction

    GeoConformal prediction: a model-agnostic framework of measuring the uncertainty of spatial prediction arXiv:2412.08661v1 Announce Type: new Abstract: Spatial prediction is a fundamental task in geography. In recent years, with advances in geospatial artificial intelligence (GeoAI), numerous models have been developed to improve the accuracy of geographic variable predictions. Beyond achieving higher accuracy, it is equally…

  • On the Precise Asymptotics and Refined Regret of the Variance-Aware UCB Algorithm

    On the Precise Asymptotics and Refined Regret of the Variance-Aware UCB Algorithm arXiv:2412.08843v1 Announce Type: new Abstract: In this paper, we study the behavior of the Upper Confidence Bound-Variance (UCB-V) algorithm for Multi-Armed Bandit (MAB) problems, a variant of the canonical Upper Confidence Bound (UCB) algorithm that incorporates variance estimates into its decision-making process. More…

  • $(\epsilon, \delta)$-Differentially Private Partial Least Squares Regression

    $(\epsilon, \delta)$-Differentially Private Partial Least Squares Regression arXiv:2412.09164v1 Announce Type: new Abstract: As data-privacy requirements are becoming increasingly stringent and statistical models based on sensitive data are being deployed and used more routinely, protecting data-privacy becomes pivotal. Partial Least Squares (PLS) regression is the premier tool for building such models in analytical chemistry, yet it…

  • Belted and Ensembled Neural Network for Linear and Nonlinear Sufficient Dimension Reduction

    Belted and Ensembled Neural Network for Linear and Nonlinear Sufficient Dimension Reduction arXiv:2412.08961v1 Announce Type: new Abstract: We introduce a unified, flexible, and easy-to-implement framework of sufficient dimension reduction that can accommodate both linear and nonlinear dimension reduction, and both the conditional distribution and the conditional mean as the targets of estimation. This unified framework…

  • Distribution free uncertainty quantification in neuroscience-inspired deep operators

    Distribution free uncertainty quantification in neuroscience-inspired deep operators arXiv:2412.09369v1 Announce Type: new Abstract: Energy-efficient deep learning algorithms are essential for a sustainable future and feasible edge computing setups. Spiking neural networks (SNNs), inspired from neuroscience, are a positive step in the direction of achieving the required energy efficiency. However, in a bid to lower the…

  • Score-Optimal Diffusion Schedules

    Score-Optimal Diffusion Schedules arXiv:2412.07877v1 Announce Type: new Abstract: Denoising diffusion models (DDMs) offer a flexible framework for sampling from high dimensional data distributions. DDMs generate a path of probability distributions interpolating between a reference Gaussian distribution and a data distribution by incrementally injecting noise into the data. To numerically simulate the sampling process, a discretisation…

  • Low-Rank Correction for Quantized LLMs

    Low-Rank Correction for Quantized LLMs arXiv:2412.07902v1 Announce Type: new Abstract: We consider the problem of model compression for Large Language Models (LLMs) at post-training time, where the task is to compress a well-trained model using only a small set of calibration input data. In this work, we introduce a new low-rank approach to correct for…

  • An Optimistic Algorithm for Online Convex Optimization with Adversarial Constraints

    An Optimistic Algorithm for Online Convex Optimization with Adversarial Constraints arXiv:2412.08060v1 Announce Type: new Abstract: We study Online Convex Optimization (OCO) with adversarial constraints, where an online algorithm must make repeated decisions to minimize both convex loss functions and cumulative constraint violations. We focus on a setting where the algorithm has access to predictions of…

  • Phase-aware Training Schedule Simplifies Learning in Flow-Based Generative Models

    Phase-aware Training Schedule Simplifies Learning in Flow-Based Generative Models arXiv:2412.07972v1 Announce Type: cross Abstract: We analyze the training of a two-layer autoencoder used to parameterize a flow-based generative model for sampling from a high-dimensional Gaussian mixture. Previous work shows that the phase where the relative probability between the modes is learned disappears as the dimension…

  • Spectral Differential Network Analysis for High-Dimensional Time Series

    Spectral Differential Network Analysis for High-Dimensional Time Series arXiv:2412.07905v1 Announce Type: cross Abstract: Spectral networks derived from multivariate time series data arise in many domains, from brain science to Earth science. Often, it is of interest to study how these networks change under different conditions. For instance, to better understand epilepsy, it would be interesting…

  • Generalized Least Squares Kernelized Tensor Factorization

    Generalized Least Squares Kernelized Tensor Factorization arXiv:2412.07041v1 Announce Type: new Abstract: Real-world datasets often contain missing or corrupted values. Completing multidimensional tensor-structured data with missing entries is essential for numerous applications. Smoothness-constrained low-rank factorization models have shown superior performance with reduced computational costs. While effective at capturing global and long-range correlations, these models struggle to…

  • Sequential Controlled Langevin Diffusions

    Sequential Controlled Langevin Diffusions arXiv:2412.07081v1 Announce Type: new Abstract: An effective approach for sampling from unnormalized densities is based on the idea of gradually transporting samples from an easy prior to the complicated target distribution. Two popular methods are (1) Sequential Monte Carlo (SMC), where the transport is performed through successive annealed densities via prescribed…

  • A Note on Sample Complexity of Interactive Imitation Learning with Log Loss

    A Note on Sample Complexity of Interactive Imitation Learning with Log Loss arXiv:2412.07057v1 Announce Type: new Abstract: Imitation learning (IL) is a general paradigm for learning from experts in sequential decision-making problems. Recent advancements in IL have shown that offline imitation learning, specifically Behavior Cloning (BC) with log loss, is minimax optimal. Meanwhile, its interactive…

  • Optimization Can Learn Johnson Lindenstrauss Embeddings

    Optimization Can Learn Johnson Lindenstrauss Embeddings arXiv:2412.07242v1 Announce Type: new Abstract: Embeddings play a pivotal role across various disciplines, offering compact representations of complex data structures. Randomized methods like Johnson-Lindenstrauss (JL) provide state-of-the-art and essentially unimprovable theoretical guarantees for achieving such representations. These guarantees are worst-case and in particular, neither the analysis, nor the algorithm,…

  • Modeling High-Resolution Spatio-Temporal Wind with Deep Echo State Networks and Stochastic Partial Differential Equations

    Modeling High-Resolution Spatio-Temporal Wind with Deep Echo State Networks and Stochastic Partial Differential Equations arXiv:2412.07265v1 Announce Type: new Abstract: In the past decades, clean and renewable energy has gained increasing attention due to a global effort on carbon footprint reduction. In particular, Saudi Arabia is gradually shifting its energy portfolio from an exclusive use of…

  • Ranking of Large Language Model with Nonparametric Prompts

    Ranking of Large Language Model with Nonparametric Prompts arXiv:2412.05506v1 Announce Type: new Abstract: We consider the inference for the ranking of large language models (LLMs). Alignment arises as a big challenge to mitigate hallucinations in the use of LLMs. Ranking LLMs has been shown as a well-performing tool to improve alignment based on the best-of-$N$…

  • Training-Free Bayesianization for Low-Rank Adapters of Large Language Models

    Training-Free Bayesianization for Low-Rank Adapters of Large Language Models arXiv:2412.05723v1 Announce Type: new Abstract: Estimating the uncertainty of responses of Large Language Models (LLMs) remains a critical challenge. While recent Bayesian methods have demonstrated effectiveness in quantifying uncertainty through low-rank weight updates, they typically require complex fine-tuning or post-training procedures. In this paper, we propose Training-Free…

  • Proximal Iteration for Nonlinear Adaptive Lasso

    Proximal Iteration for Nonlinear Adaptive Lasso arXiv:2412.05726v1 Announce Type: new Abstract: Augmenting a smooth cost function with an $\ell_1$ penalty allows analysts to efficiently conduct estimation and variable selection simultaneously in sophisticated models and can be efficiently implemented using proximal gradient methods. However, one drawback of the $\ell_1$ penalty is bias: nonzero parameters are underestimated…

  • Leveraging Black-box Models to Assess Feature Importance in Unconditional Distribution

    Leveraging Black-box Models to Assess Feature Importance in Unconditional Distribution arXiv:2412.05759v1 Announce Type: new Abstract: Understanding how changes in explanatory features affect the unconditional distribution of the outcome is important in many applications. However, existing black-box predictive models are not readily suited for analyzing such questions. In this work, we develop an approximation method to…

  • Reinforcement Learning for a Discrete-Time Linear-Quadratic Control Problem with an Application

    Reinforcement Learning for a Discrete-Time Linear-Quadratic Control Problem with an Application arXiv:2412.05906v1 Announce Type: new Abstract: We study the discrete-time linear-quadratic (LQ) control model using reinforcement learning (RL). Using entropy to measure the cost of exploration, we prove that the optimal feedback policy for the problem must be Gaussian type. Then, we apply the results…

  • The Polynomial Stein Discrepancy for Assessing Moment Convergence

    The Polynomial Stein Discrepancy for Assessing Moment Convergence arXiv:2412.05135v1 Announce Type: new Abstract: We propose a novel method for measuring the discrepancy between a set of samples and a desired posterior distribution for Bayesian inference. Classical methods for assessing sample quality like the effective sample size are not appropriate for scalable Bayesian sampling algorithms, such…

  • Semiparametric Bayesian Difference-in-Differences

    Semiparametric Bayesian Difference-in-Differences arXiv:2412.04605v1 Announce Type: cross Abstract: This paper studies semiparametric Bayesian inference for the average treatment effect on the treated (ATT) within the difference-in-differences research design. We propose two new Bayesian methods with frequentist validity. The first one places a standard Gaussian process prior on the conditional mean function of the control group.…

  • Disentangled Representation Learning for Causal Inference with Instruments

    Disentangled Representation Learning for Causal Inference with Instruments arXiv:2412.04641v1 Announce Type: cross Abstract: Latent confounders are a fundamental challenge for inferring causal effects from observational data. The instrumental variable (IV) approach is a practical way to address this challenge. Existing IV based estimators need a known IV or other strong assumptions, such as the existence…

  • Generalized Recorrupted-to-Recorrupted: Self-Supervised Learning Beyond Gaussian Noise

    Generalized Recorrupted-to-Recorrupted: Self-Supervised Learning Beyond Gaussian Noise arXiv:2412.04648v1 Announce Type: cross Abstract: Recorrupted-to-Recorrupted (R2R) has emerged as a methodology for training deep networks for image restoration in a self-supervised manner from noisy measurement data alone, demonstrating equivalence in expectation to the supervised squared loss in the case of Gaussian noise. However, its effectiveness with non-Gaussian…

  • Modeling High-Dimensional Dependent Data in the Presence of Many Explanatory Variables and Weak Signals

    Modeling High-Dimensional Dependent Data in the Presence of Many Explanatory Variables and Weak Signals arXiv:2412.04736v1 Announce Type: cross Abstract: This article considers a novel and widely applicable approach to modeling high-dimensional dependent data when a large number of explanatory variables are available and the signal-to-noise ratio is low. We postulate that a $p$-dimensional response series…

  • Asymptotics of Linear Regression with Linearly Dependent Data

    Asymptotics of Linear Regression with Linearly Dependent Data arXiv:2412.03702v1 Announce Type: new Abstract: In this paper we study the asymptotics of linear regression in settings where the covariates exhibit a linear dependency structure, departing from the standard assumption of independence. We model the covariates using stochastic processes with spatio-temporal covariance and analyze the performance of…

  • Community Detection with Heterogeneous Block Covariance Model

    Community Detection with Heterogeneous Block Covariance Model arXiv:2412.03780v1 Announce Type: new Abstract: Community detection is the task of clustering objects based on their pairwise relationships. Most of the model-based community detection methods, such as the stochastic block model and its variants, are designed for networks with binary (yes/no) edges. In many practical scenarios, edges often…

  • Learning Networks from Wide-Sense Stationary Stochastic Processes

    Learning Networks from Wide-Sense Stationary Stochastic Processes arXiv:2412.03768v1 Announce Type: new Abstract: Complex networked systems driven by latent inputs are common in fields like neuroscience, finance, and engineering. A key inference problem here is to learn edge connectivity from node outputs (potentials). We focus on systems governed by steady-state linear conservation laws: $X_t = L^{\ast} Y_t$,…

  • How well behaved is finite dimensional Diffusion Maps?

    How well behaved is finite dimensional Diffusion Maps? arXiv:2412.03992v1 Announce Type: new Abstract: Under a set of assumptions on a family of submanifolds $\subset \mathbb{R}^D$, we derive a series of geometric properties that remain valid after finite-dimensional and almost isometric Diffusion Maps (DM), including almost uniform density, finite polynomial approximation and local reach. Leveraging…

  • Pathwise optimization for bridge-type estimators and its applications

    Pathwise optimization for bridge-type estimators and its applications arXiv:2412.04047v1 Announce Type: new Abstract: Sparse parametric models are of great interest in statistical learning and are often analyzed by means of regularized estimators. Pathwise methods allow one to efficiently compute the full solution path for penalized estimators, for any possible value of the penalization parameter $\lambda$. In…

  • Universal Rates of Empirical Risk Minimization

    Universal Rates of Empirical Risk Minimization arXiv:2412.02810v1 Announce Type: new Abstract: The well-known empirical risk minimization (ERM) principle is the basis of many widely used machine learning algorithms, and plays an essential role in the classical PAC theory. A common description of a learning algorithm’s performance is its so-called “learning curve”, that is, the decay…

  • An Information-Theoretic Analysis of Thompson Sampling for Logistic Bandits

    An Information-Theoretic Analysis of Thompson Sampling for Logistic Bandits arXiv:2412.02861v1 Announce Type: new Abstract: We study the performance of the Thompson Sampling algorithm for logistic bandit problems, where the agent receives binary rewards with probabilities determined by a logistic function $\exp(\beta \langle a, \theta \rangle)/(1+\exp(\beta \langle a, \theta \rangle))$. We focus on the setting where…

  • Preference-based Pure Exploration

    Preference-based Pure Exploration arXiv:2412.02988v1 Announce Type: new Abstract: We study the preference-based pure exploration problem for bandits with vector-valued rewards. The rewards are ordered using a (given) preference cone $\mathcal{C}$ and our goal is to identify the set of Pareto optimal arms. First, to quantify the impact of preferences, we derive a novel lower…

  • Generalized Diffusion Model with Adjusted Offset Noise

    Generalized Diffusion Model with Adjusted Offset Noise arXiv:2412.03134v1 Announce Type: new Abstract: Diffusion models have become fundamental tools for modeling data distributions in machine learning and have applications in image generation, drug discovery, and audio synthesis. Despite their success, these models face challenges when generating data with extreme brightness values, as evidenced by limitations in…

  • Nonparametric Filtering, Estimation and Classification using Neural Jump ODEs

    Nonparametric Filtering, Estimation and Classification using Neural Jump ODEs arXiv:2412.03271v1 Announce Type: new Abstract: Neural Jump ODEs model the conditional expectation between observations by neural ODEs and jump at arrival of new observations. They have demonstrated effectiveness for fully data-driven online forecasting in settings with irregular and partial observations, operating under weak regularity assumptions. This…

  • MEP-Net: Generating Solutions to Scientific Problems with Limited Knowledge by Maximum Entropy Principle

    MEP-Net: Generating Solutions to Scientific Problems with Limited Knowledge by Maximum Entropy Principle arXiv:2412.02090v1 Announce Type: new Abstract: Maximum entropy principle (MEP) offers an effective and unbiased approach to inferring unknown probability distributions when faced with incomplete information, while neural networks provide the flexibility to learn complex distributions from data. This paper proposes a novel…

  • Selective Reviews of Bandit Problems in AI via a Statistical View

    Selective Reviews of Bandit Problems in AI via a Statistical View arXiv:2412.02251v1 Announce Type: new Abstract: Reinforcement Learning (RL) is a widely researched area in artificial intelligence that focuses on teaching agents decision-making through interactions with their environment. A key subset includes stochastic multi-armed bandit (MAB) and continuum-armed bandit (SCAB) problems, which model sequential decision-making…

  • The Broader Landscape of Robustness in Algorithmic Statistics

    The Broader Landscape of Robustness in Algorithmic Statistics arXiv:2412.02670v1 Announce Type: new Abstract: The last decade has seen a number of advances in computationally efficient algorithms for statistical methods subject to robustness constraints. An estimator may be robust in a number of different ways: to contamination of the dataset, to heavy-tailed data, or in the…

  • Deep Matrix Factorization with Adaptive Weights for Multi-View Clustering

    Deep Matrix Factorization with Adaptive Weights for Multi-View Clustering arXiv:2412.02292v1 Announce Type: new Abstract: Recently, deep matrix factorization has been established as a powerful model for unsupervised tasks, achieving promising results, especially for multi-view clustering. However, existing methods often lack effective feature selection mechanisms and rely on empirical hyperparameter selection. To address these issues, we…

  • Composition of Experts: A Modular Compound AI System Leveraging Large Language Models

    Composition of Experts: A Modular Compound AI System Leveraging Large Language Models arXiv:2412.01868v1 Announce Type: cross Abstract: Large Language Models (LLMs) have achieved remarkable advancements, but their monolithic nature presents challenges in terms of scalability, cost, and customization. This paper introduces the Composition of Experts (CoE), a modular compound AI system leveraging multiple expert LLMs.…

  • Nonlinearity and Uncertainty Informed Moment-Matching Gaussian Mixture Splitting

    Nonlinearity and Uncertainty Informed Moment-Matching Gaussian Mixture Splitting arXiv:2412.00343v1 Announce Type: new Abstract: Many problems in navigation and tracking require increasingly accurate characterizations of the evolution of uncertainty in nonlinear systems. Nonlinear uncertainty propagation approaches based on Gaussian mixture density approximations offer distinct advantages over sampling based methods in their computational cost and continuous representation.…

  • Optimal Particle-based Approximation of Discrete Distributions (OPAD)

    Optimal Particle-based Approximation of Discrete Distributions (OPAD) arXiv:2412.00545v1 Announce Type: new Abstract: Particle-based methods include a variety of techniques, such as Markov Chain Monte Carlo (MCMC) and Sequential Monte Carlo (SMC), for approximating a probabilistic target distribution with a set of weighted particles. In this paper, we prove that for any set of particles, there…

  • Explicit and data-Efficient Encoding via Gradient Flow

    Explicit and data-Efficient Encoding via Gradient Flow arXiv:2412.00864v1 Announce Type: new Abstract: The autoencoder model typically uses an encoder to map data to a lower dimensional latent space and a decoder to reconstruct it. However, relying on an encoder for inversion can lead to suboptimal representations, particularly limiting in physical sciences where precision is key.…

  • A Note on Estimation Error Bound and Grouping Effect of Transfer Elastic Net

    A Note on Estimation Error Bound and Grouping Effect of Transfer Elastic Net arXiv:2412.01010v1 Announce Type: new Abstract: The Transfer Elastic Net is an estimation method for linear regression models that combines $\ell_1$ and $\ell_2$ norm penalties to facilitate knowledge transfer. In this study, we derive a non-asymptotic $\ell_2$ norm estimation error bound for the…

  • Energy-Based Modelling for Discrete and Mixed Data via Heat Equations on Structured Spaces

    Energy-Based Modelling for Discrete and Mixed Data via Heat Equations on Structured Spaces arXiv:2412.01019v1 Announce Type: new Abstract: Energy-based models (EBMs) offer a flexible framework for probabilistic modelling across various data domains. However, training EBMs on data in discrete or mixed state spaces poses significant challenges due to the lack of robust and fast sampling…

  • The Return of Pseudosciences in Artificial Intelligence: Have Machine Learning and Deep Learning Forgotten Lessons from Statistics and History?

    The Return of Pseudosciences in Artificial Intelligence: Have Machine Learning and Deep Learning Forgotten Lessons from Statistics and History? arXiv:2411.18656v1 Announce Type: new Abstract: In today’s world, AI programs powered by Machine Learning are ubiquitous, and have achieved seemingly exceptional performance across a broad range of tasks, from medical diagnosis and credit rating in banking,…

  • Graph Max Shift: A Hill-Climbing Method for Graph Clustering

    Graph Max Shift: A Hill-Climbing Method for Graph Clustering arXiv:2411.18794v1 Announce Type: new Abstract: We present a method for graph clustering that is analogous with gradient ascent methods previously proposed for clustering points in space. We show that, when applied to a random geometric graph with data iid from some density with Morse regularity, the…

  • Intrinsic Wrapped Gaussian Process Regression Modeling for Manifold-valued Response Variable

    Intrinsic Wrapped Gaussian Process Regression Modeling for Manifold-valued Response Variable arXiv:2411.18989v1 Announce Type: new Abstract: In this paper, we propose a novel intrinsic wrapped Gaussian process regression model for a response variable measured on a Riemannian manifold. We apply the parallel transport operator to define an intrinsic covariance structure addressing a critical aspect of constructing a well…

  • ABROCA Distributions For Algorithmic Bias Assessment: Considerations Around Interpretation

    ABROCA Distributions For Algorithmic Bias Assessment: Considerations Around Interpretation arXiv:2411.19090v1 Announce Type: new Abstract: Algorithmic bias continues to be a key concern of learning analytics. We study the statistical properties of the Absolute Between-ROC Area (ABROCA) metric. This fairness measure quantifies group-level differences in classifier performance through the absolute difference in ROC curves. ABROCA is…

  • Contrastive representations of high-dimensional, structured treatments

    Contrastive representations of high-dimensional, structured treatments arXiv:2411.19245v1 Announce Type: new Abstract: Estimating causal effects is vital for decision making. In standard causal effect estimation, treatments are usually binary- or continuous-valued. However, in many important real-world settings, treatments can be structured, high-dimensional objects, such as text, video, or audio. This provides a challenge to traditional causal…

  • On the ERM Principle in Meta-Learning

    On the ERM Principle in Meta-Learning arXiv:2411.17898v1 Announce Type: new Abstract: Classic supervised learning involves algorithms trained on $n$ labeled examples to produce a hypothesis $h \in \mathcal{H}$ aimed at performing well on unseen examples. Meta-learning extends this by training across $n$ tasks, with $m$ examples per task, producing a hypothesis class $\mathcal{H}$ within some…

  • Isometry pursuit

    Isometry pursuit arXiv:2411.18502v1 Announce Type: new Abstract: Isometry pursuit is a convex algorithm for identifying orthonormal column-submatrices of wide matrices. It consists of a novel normalization method followed by multitask basis pursuit. Applied to Jacobians of putative coordinate functions, it helps identify isometric embeddings from within interpretable dictionaries. We provide theoretical and experimental results justifying…

  • A Flexible Defense Against the Winner’s Curse

    A Flexible Defense Against the Winner’s Curse arXiv:2411.18569v1 Announce Type: new Abstract: Across science and policy, decision-makers often need to draw conclusions about the best candidate among competing alternatives. For instance, researchers may seek to infer the effectiveness of the most successful treatment or determine which demographic group benefits most from a specific treatment. Similarly,…

  • Functional relevance based on the continuous Shapley value

    Functional relevance based on the continuous Shapley value arXiv:2411.18575v1 Announce Type: new Abstract: The presence of Artificial Intelligence (AI) in our society is increasing, which brings with it the need to understand the behaviour of AI mechanisms, including machine learning predictive algorithms fed with tabular data, text, or images, among other types of data. This…

  • When Is Heterogeneity Actionable for Personalization?

    When Is Heterogeneity Actionable for Personalization? arXiv:2411.16552v1 Announce Type: cross Abstract: Targeting and personalization policies can be used to improve outcomes beyond the uniform policy that assigns the best performing treatment in an A/B test to everyone. Personalization relies on the presence of heterogeneity of treatment effects, yet, as we show in this paper, heterogeneity…

  • Conformalised Conditional Normalising Flows for Joint Prediction Regions in time series

    Conformalised Conditional Normalising Flows for Joint Prediction Regions in time series arXiv:2411.17042v1 Announce Type: new Abstract: Conformal Prediction offers a powerful framework for quantifying uncertainty in machine learning models, enabling the construction of prediction sets with finite-sample validity guarantees. While easily adaptable to non-probabilistic models, applying conformal prediction to probabilistic generative models, such as Normalising…

  • Fast, Precise Thompson Sampling for Bayesian Optimization

    Fast, Precise Thompson Sampling for Bayesian Optimization arXiv:2411.17071v1 Announce Type: new Abstract: Thompson sampling (TS) has optimal regret and excellent empirical performance in multi-armed bandit problems. Yet, in Bayesian optimization, TS underperforms popular acquisition functions (e.g., EI, UCB). TS samples arms according to the probability that they are optimal. A recent algorithm, P-Star Sampler (PSS),…

  • Spatio-Temporal Conformal Prediction for Power Outage Data

    Spatio-Temporal Conformal Prediction for Power Outage Data arXiv:2411.17099v1 Announce Type: new Abstract: In recent years, increasingly unpredictable and severe global weather patterns have frequently caused long-lasting power outages. Building resilience, the ability to withstand, adapt to, and recover from major disruptions, has become crucial for the power industry. To enable rapid recovery, accurately predicting future…

  • Training a neural network for data reduction and better generalization

    Training a neural network for data reduction and better generalization arXiv:2411.17180v1 Announce Type: new Abstract: The motivation for sparse learners is to compress the inputs (features) by selecting only the ones needed for good generalization. Linear models with LASSO-type regularization achieve this by setting the weights of irrelevant features to zero, effectively identifying and ignoring…

  • A Generalized Unified Skew-Normal Process with Neural Bayes Inference

    A Generalized Unified Skew-Normal Process with Neural Bayes Inference arXiv:2411.17400v1 Announce Type: new Abstract: In recent decades, statisticians have been increasingly encountering spatial data that exhibit non-Gaussian behaviors such as asymmetry and heavy-tailedness. As a result, the assumptions of symmetry and fixed tail weight in Gaussian processes have become restrictive and may fail to capture…