Category: cs.LG
-
StealthRank: LLM Ranking Manipulation via Stealthy Prompt Optimization
StealthRank: LLM Ranking Manipulation via Stealthy Prompt Optimization arXiv:2504.05804v1 Announce Type: cross Abstract: The integration of large language models (LLMs) into information retrieval systems introduces new attack surfaces, particularly for adversarial ranking manipulations. We present StealthRank, a novel adversarial ranking attack that manipulates LLM-driven product recommendation systems while maintaining textual fluency and stealth. Unlike existing…
-
Hyperflows: Pruning Reveals the Importance of Weights
Hyperflows: Pruning Reveals the Importance of Weights arXiv:2504.05349v1 Announce Type: new Abstract: Network pruning is used to reduce inference latency and power consumption in large neural networks. However, most existing methods struggle to accurately assess the importance of individual weights due to their inherent interrelatedness, leading to poor performance, especially at extreme sparsity levels. We…
-
Survey on Algorithms for multi-index models
Survey on Algorithms for multi-index models arXiv:2504.05426v1 Announce Type: new Abstract: We review the literature on algorithms for estimating the index space in a multi-index model. The primary focus is on computationally efficient (polynomial-time) algorithms in Gaussian space, the assumptions under which consistency is guaranteed by these methods, and their sample complexity. In many cases,…
-
Actuarial Learning for Pension Fund Mortality Forecasting
Actuarial Learning for Pension Fund Mortality Forecasting arXiv:2504.05881v1 Announce Type: new Abstract: For the assessment of the financial soundness of a pension fund, it is necessary to take into account mortality forecasting so that longevity risk is consistently incorporated into future cash flows. In this article, we employ machine learning models applied to actuarial science…
-
Improved Inference of Inverse Ising Problems under Missing Observations in Restricted Boltzmann Machines
Improved Inference of Inverse Ising Problems under Missing Observations in Restricted Boltzmann Machines arXiv:2504.05643v1 Announce Type: new Abstract: Restricted Boltzmann machines (RBMs) are energy-based models analogous to the Ising model and are widely applied in statistical machine learning. The standard inverse Ising problem with a complete dataset requires computing both data and model expectations and…
-
Batch Bayesian Optimization for High-Dimensional Experimental Design: Simulation and Visualization
Batch Bayesian Optimization for High-Dimensional Experimental Design: Simulation and Visualization arXiv:2504.03943v1 Announce Type: new Abstract: Bayesian Optimization (BO) is increasingly used to guide experimental optimization tasks. To elucidate BO behavior in noisy and high-dimensional settings typical for materials science applications, we perform batch BO of two six-dimensional test functions: an Ackley function representing a needle-in-a-haystack…
-
Robust Reinforcement Learning from Human Feedback for Large Language Models Fine-Tuning
Robust Reinforcement Learning from Human Feedback for Large Language Models Fine-Tuning arXiv:2504.03784v1 Announce Type: new Abstract: Reinforcement learning from human feedback (RLHF) has emerged as a key technique for aligning the output of large language models (LLMs) with human preferences. To learn the reward function, most existing RLHF algorithms use the Bradley-Terry model, which relies…
-
Spatially-Heterogeneous Causal Bayesian Networks for Seismic Multi-Hazard Estimation: A Variational Approach with Gaussian Processes and Normalizing Flows
Spatially-Heterogeneous Causal Bayesian Networks for Seismic Multi-Hazard Estimation: A Variational Approach with Gaussian Processes and Normalizing Flows arXiv:2504.04013v1 Announce Type: new Abstract: Post-earthquake hazard and impact estimation are critical for effective disaster response, yet current approaches face significant limitations. Traditional models employ fixed parameters regardless of geographical context, misrepresenting how seismic effects vary across diverse…
-
Computational Efficient Informative Nonignorable Matrix Completion: A Row- and Column-Wise Matrix U-Statistic Pseudo-Likelihood Approach
Computational Efficient Informative Nonignorable Matrix Completion: A Row- and Column-Wise Matrix U-Statistic Pseudo-Likelihood Approach arXiv:2504.04016v1 Announce Type: new Abstract: In this study, we establish a unified framework to deal with the high dimensional matrix completion problem under flexible nonignorable missing mechanisms. Although the matrix completion problem has attracted much attention over the years, there are…
-
Minimax Optimal Convergence of Gradient Descent in Logistic Regression via Large and Adaptive Stepsizes
Minimax Optimal Convergence of Gradient Descent in Logistic Regression via Large and Adaptive Stepsizes arXiv:2504.04105v1 Announce Type: new Abstract: We study \textit{gradient descent} (GD) for logistic regression on linearly separable data with stepsizes that adapt to the current risk, scaled by a constant hyperparameter $\eta$. We show that after at most $1/\gamma^2$ burn-in steps, GD…
-
ConfEviSurrogate: A Conformalized Evidential Surrogate Model for Uncertainty Quantification
ConfEviSurrogate: A Conformalized Evidential Surrogate Model for Uncertainty Quantification arXiv:2504.02919v1 Announce Type: new Abstract: Surrogate models, crucial for approximating complex simulation data across sciences, inherently carry uncertainties that range from simulation noise to model prediction errors. Without rigorous uncertainty quantification, predictions become unreliable and hence hinder analysis. While methods like Monte Carlo dropout and ensemble…
-
High-dimensional ridge regression with random features for non-identically distributed data with a variance profile
High-dimensional ridge regression with random features for non-identically distributed data with a variance profile arXiv:2504.03035v1 Announce Type: new Abstract: The behavior of the random feature model in the high-dimensional regression framework has become a popular issue of interest in the machine learning literature. This model is generally considered for feature vectors $x_i = \Sigma^{1/2} x_i'$,…
-
A computational transition for detecting multivariate shuffled linear regression by low-degree polynomials
A computational transition for detecting multivariate shuffled linear regression by low-degree polynomials arXiv:2504.03097v1 Announce Type: new Abstract: In this paper, we study the problem of multivariate shuffled linear regression, where the correspondence between predictors and responses in a linear model is obfuscated by a latent permutation. Specifically, we investigate the model $Y=\tfrac{1}{\sqrt{1+\sigma^2}}(\Pi_* X Q_* +…
-
Accelerating Particle-based Energetic Variational Inference
Accelerating Particle-based Energetic Variational Inference arXiv:2504.03158v1 Announce Type: new Abstract: In this work, we propose a novel particle-based variational inference (ParVI) method that accelerates the EVI-Im. Inspired by energy quadratization (EQ) and operator splitting techniques for gradient flows, our approach efficiently drives particles towards the target distribution. Unlike EVI-Im, which employs the implicit Euler method…
-
Bayesian Optimization of Robustness Measures Using Randomized GP-UCB-based Algorithms under Input Uncertainty
Bayesian Optimization of Robustness Measures Using Randomized GP-UCB-based Algorithms under Input Uncertainty arXiv:2504.03172v1 Announce Type: new Abstract: Bayesian optimization based on Gaussian process upper confidence bound (GP-UCB) has a theoretical guarantee for optimizing black-box functions. Black-box functions often have input uncertainty, but even in this case, GP-UCB can be extended to optimize evaluation measures called…
-
Analytical Discovery of Manifold with Machine Learning
Analytical Discovery of Manifold with Machine Learning arXiv:2504.02511v1 Announce Type: new Abstract: Understanding low-dimensional structures within high-dimensional data is crucial for visualization, interpretation, and denoising in complex datasets. Despite the advancements in manifold learning techniques, key challenges, such as limited global insight and the lack of interpretable analytical descriptions, remain unresolved. In this work, we introduce a…
-
Dynamic Assortment Selection and Pricing with Censored Preference Feedback
Dynamic Assortment Selection and Pricing with Censored Preference Feedback arXiv:2504.02324v1 Announce Type: new Abstract: In this study, we investigate the problem of dynamic multi-product selection and pricing by introducing a novel framework based on a \textit{censored multinomial logit} (C-MNL) choice model. In this model, sellers present a set of products with prices, and buyers filter…
-
On Model Protection in Federated Learning against Eavesdropping Attacks
On Model Protection in Federated Learning against Eavesdropping Attacks arXiv:2504.02114v1 Announce Type: cross Abstract: In this study, we investigate the protection offered by federated learning algorithms against eavesdropping adversaries. In our model, the adversary is capable of intercepting model updates transmitted from clients to the server, enabling it to create its own estimate of the…
-
Towards Interpretable Soft Prompts
Towards Interpretable Soft Prompts arXiv:2504.02144v1 Announce Type: cross Abstract: Soft prompts have been popularized as a cheap and easy way to improve task-specific LLM performance beyond few-shot prompts. Despite their origin as an automated prompting method, however, soft prompts and other trainable prompts remain a black-box method with no immediately interpretable connections to prompting. We…
-
Fair Sufficient Representation Learning
Fair Sufficient Representation Learning arXiv:2504.01030v1 Announce Type: new Abstract: The main objective of fair statistical modeling and machine learning is to minimize or eliminate biases that may arise from the data or the model itself, ensuring that predictions and decisions are not unjustly influenced by sensitive attributes such as race, gender, age, or other protected…
-
Estimating Unbounded Density Ratios: Applications in Error Control under Covariate Shift
Estimating Unbounded Density Ratios: Applications in Error Control under Covariate Shift arXiv:2504.01031v1 Announce Type: new Abstract: The density ratio is an important metric for evaluating the relative likelihood of two probability distributions, with extensive applications in statistics and machine learning. However, existing estimation theories for density ratios often depend on stringent regularity conditions, mainly focusing…
-
Density estimation via mixture discrepancy and moments
Density estimation via mixture discrepancy and moments arXiv:2504.01570v1 Announce Type: new Abstract: With the aim of generalizing histogram statistics to higher dimensional cases, density estimation via discrepancy based sequential partition (DSP) has been proposed [D. Li, K. Yang, W. Wong, Advances in Neural Information Processing Systems (2016) 1099-1107] to learn an adaptive piecewise constant approximation…
-
Denoising guarantees for optimized sampling schemes in compressed sensing
Denoising guarantees for optimized sampling schemes in compressed sensing arXiv:2504.01046v1 Announce Type: new Abstract: Compressed sensing with subsampled unitary matrices benefits from \emph{optimized} sampling schemes, which feature improved theoretical guarantees and empirical performance relative to uniform subsampling. We provide, in a first of its kind in compressed sensing, theoretical guarantees showing that the error caused…
-
Sparse Gaussian Neural Processes
Sparse Gaussian Neural Processes arXiv:2504.01650v1 Announce Type: new Abstract: Despite significant recent advances in probabilistic meta-learning, it is common for practitioners to avoid using deep learning models due to a comparative lack of interpretability. Instead, many practitioners simply use non-meta-models such as Gaussian processes with interpretable priors, and conduct the tedious procedure of training their…
-
Privacy-Preserving Transfer Learning for Community Detection using Locally Distributed Multiple Networks
Privacy-Preserving Transfer Learning for Community Detection using Locally Distributed Multiple Networks arXiv:2504.00890v1 Announce Type: new Abstract: This paper develops a new spectral clustering-based method called TransNet for transfer learning in community detection of network data. Our goal is to improve the clustering performance of the target network using auxiliary source networks, which are heterogeneous, privacy-preserved,…
-
Communication-Efficient l_0 Penalized Least Square
Communication-Efficient l_0 Penalized Least Square arXiv:2504.00722v1 Announce Type: new Abstract: In this paper, we propose a communication-efficient penalized regression algorithm for high-dimensional sparse linear regression models with massive data. This approach incorporates an optimized distributed system communication algorithm, named CESDAR algorithm, based on the Enhanced Support Detection and Root finding algorithm. The CESDAR algorithm leverages…
-
Nuclear Microreactor Control with Deep Reinforcement Learning
Nuclear Microreactor Control with Deep Reinforcement Learning arXiv:2504.00156v1 Announce Type: cross Abstract: The economic feasibility of nuclear microreactors will depend on minimizing operating costs through advancements in autonomous control, especially when these microreactors are operating alongside other types of energy systems (e.g., renewable energy). This study explores the application of deep reinforcement learning (RL) for…
-
Backdoor Detection through Replicated Execution of Outsourced Training
Backdoor Detection through Replicated Execution of Outsourced Training arXiv:2504.00170v1 Announce Type: cross Abstract: It is common practice to outsource the training of machine learning models to cloud providers. Clients who do so gain from the cloud’s economies of scale, but implicitly assume trust: the server should not deviate from the client’s training procedure. A malicious…
-
DGSAM: Domain Generalization via Individual Sharpness-Aware Minimization
DGSAM: Domain Generalization via Individual Sharpness-Aware Minimization arXiv:2503.23430v1 Announce Type: new Abstract: Domain generalization (DG) aims to learn models that can generalize well to unseen domains by training only on a set of source domains. Sharpness-Aware Minimization (SAM) has been a popular approach for this, aiming to find flat minima in the total loss landscape.…
-
Accelerated Stein Variational Gradient Flow
Accelerated Stein Variational Gradient Flow arXiv:2503.23462v1 Announce Type: new Abstract: Stein variational gradient descent (SVGD) is a kernel-based particle method for sampling from a target distribution, e.g., in generative modeling and Bayesian inference. SVGD does not require estimating the gradient of the log-density, which is called score estimation. In practice, SVGD can be slow compared…
-
Scalable Geometric Learning with Correlation-Based Functional Brain Networks
Scalable Geometric Learning with Correlation-Based Functional Brain Networks arXiv:2503.23653v1 Announce Type: new Abstract: The correlation matrix is a central representation of functional brain networks in neuroimaging. Traditional analyses often treat pairwise interactions independently in a Euclidean setting, overlooking the intrinsic geometry of correlation matrices. While earlier attempts have embraced the quotient geometry of the correlation…
-
Learning a Single Index Model from Anisotropic Data with vanilla Stochastic Gradient Descent
Learning a Single Index Model from Anisotropic Data with vanilla Stochastic Gradient Descent arXiv:2503.23642v1 Announce Type: new Abstract: We investigate the problem of learning a Single Index Model (SIM), a popular model for studying the ability of neural networks to learn features, from anisotropic Gaussian inputs by training a neuron using vanilla Stochastic Gradient…
-
Feature learning from non-Gaussian inputs: the case of Independent Component Analysis in high dimensions
Feature learning from non-Gaussian inputs: the case of Independent Component Analysis in high dimensions arXiv:2503.23896v1 Announce Type: new Abstract: Deep neural networks learn structured features from complex, non-Gaussian inputs, but the mechanisms behind this process remain poorly understood. Our work is motivated by the observation that the first-layer filters learnt by deep convolutional neural networks…
-
Structured and sparse partial least squares coherence for multivariate cortico-muscular analysis
Structured and sparse partial least squares coherence for multivariate cortico-muscular analysis arXiv:2503.21802v1 Announce Type: cross Abstract: Multivariate cortico-muscular analysis has recently emerged as a promising approach for evaluating the corticospinal neural pathway. However, current multivariate approaches encounter challenges such as high dimensionality and limited sample sizes, thus restricting their further applications. In this paper, we…
-
Is Best-of-N the Best of Them? Coverage, Scaling, and Optimality in Inference-Time Alignment
Is Best-of-N the Best of Them? Coverage, Scaling, and Optimality in Inference-Time Alignment arXiv:2503.21878v1 Announce Type: cross Abstract: Inference-time computation provides an important axis for scaling language model performance, but naively scaling compute through techniques like Best-of-$N$ sampling can cause performance to degrade due to reward hacking. Toward a theoretical understanding of how to best…
-
Improving Equivariant Networks with Probabilistic Symmetry Breaking
Improving Equivariant Networks with Probabilistic Symmetry Breaking arXiv:2503.21985v1 Announce Type: cross Abstract: Equivariance encodes known symmetries into neural networks, often enhancing generalization. However, equivariant networks cannot break symmetries: the output of an equivariant network must, by definition, have at least the same self-symmetries as the input. This poses an important problem, both (1) for prediction…
-
Fundamental Safety-Capability Trade-offs in Fine-tuning Large Language Models
Fundamental Safety-Capability Trade-offs in Fine-tuning Large Language Models arXiv:2503.20807v1 Announce Type: new Abstract: Fine-tuning Large Language Models (LLMs) on some task-specific datasets has been a primary use of LLMs. However, it has been empirically observed that this approach to enhancing capability inevitably compromises safety, a phenomenon also known as the safety-capability trade-off in LLM fine-tuning.…
-
Squared families: Searching beyond regular probability models
Squared families: Searching beyond regular probability models arXiv:2503.21128v1 Announce Type: new Abstract: We introduce squared families, which are families of probability densities obtained by squaring a linear transformation of a statistic. Squared families are singular, however their singularity can easily be handled so that they form regular models. After handling the singularity, squared families possess…
-
DeepRV: pre-trained spatial priors for accelerated disease mapping
DeepRV: pre-trained spatial priors for accelerated disease mapping arXiv:2503.21473v1 Announce Type: new Abstract: Recently introduced prior-encoding deep generative models (e.g., PriorVAE, $\pi$VAE, and PriorCVAE) have emerged as powerful tools for scalable Bayesian inference by emulating complex stochastic processes like Gaussian processes (GPs). However, these methods remain largely a proof-of-concept and inaccessible to practitioners. We propose…
-
Constraint-based causal discovery with tiered background knowledge and latent variables in single or overlapping datasets
Constraint-based causal discovery with tiered background knowledge and latent variables in single or overlapping datasets arXiv:2503.21526v1 Announce Type: new Abstract: In this paper we consider the use of tiered background knowledge within constraint based causal discovery. Our focus is on settings relaxing causal sufficiency, i.e. allowing for latent variables which may arise because relevant information…
-
A stochastic gradient descent algorithm with random search directions
A stochastic gradient descent algorithm with random search directions arXiv:2503.19942v1 Announce Type: new Abstract: Stochastic coordinate descent algorithms are efficient methods in which each iterate is obtained by fixing most coordinates at their values from the current iteration, and approximately minimizing the objective with respect to the remaining coordinates. However, this approach is usually restricted…
-
On the Robustness of Kernel Ridge Regression Using the Cauchy Loss Function
On the Robustness of Kernel Ridge Regression Using the Cauchy Loss Function arXiv:2503.20120v1 Announce Type: new Abstract: Robust regression aims to develop methods for estimating an unknown regression function in the presence of outliers, heavy-tailed distributions, or contaminated data, which can severely impact performance. Most existing theoretical results in robust regression assume that the noise…
-
Learning Data-Driven Uncertainty Set Partitions for Robust and Adaptive Energy Forecasting with Missing Data
Learning Data-Driven Uncertainty Set Partitions for Robust and Adaptive Energy Forecasting with Missing Data arXiv:2503.20410v1 Announce Type: new Abstract: Short-term forecasting models typically assume the availability of input data (features) when they are deployed and in use. However, equipment failures, disruptions, or cyberattacks may lead to missing features when such models are used operationally, which could…
-
An $(\epsilon,\delta)$-accurate level set estimation with a stopping criterion
An $(\epsilon,\delta)$-accurate level set estimation with a stopping criterion arXiv:2503.20272v1 Announce Type: new Abstract: The level set estimation problem seeks to identify regions within a set of candidate points where an unknown and costly to evaluate function’s value exceeds a specified threshold, providing an efficient alternative to exhaustive evaluations of function values. Traditional methods often…
-
Regression-Based Estimation of Causal Effects in the Presence of Selection Bias and Confounding
Regression-Based Estimation of Causal Effects in the Presence of Selection Bias and Confounding arXiv:2503.20546v1 Announce Type: new Abstract: We consider the problem of estimating the expected causal effect $E[Y|do(X)]$ for a target variable $Y$ when treatment $X$ is set by intervention, focusing on continuous random variables. In settings without selection bias or confounding, $E[Y|do(X)] =…
-
CAE: Repurposing the Critic as an Explorer in Deep Reinforcement Learning
CAE: Repurposing the Critic as an Explorer in Deep Reinforcement Learning arXiv:2503.18980v1 Announce Type: new Abstract: Exploration remains a critical challenge in reinforcement learning, as many existing methods either lack theoretical guarantees or fall short of practical effectiveness. In this paper, we introduce CAE, a lightweight algorithm that repurposes the value networks in standard deep…
-
Minimum Volume Conformal Sets for Multivariate Regression
Minimum Volume Conformal Sets for Multivariate Regression arXiv:2503.19068v1 Announce Type: new Abstract: Conformal prediction provides a principled framework for constructing predictive sets with finite-sample validity. While much of the focus has been on univariate response variables, existing multivariate methods either impose rigid geometric assumptions or rely on flexible but computationally expensive approaches that do not…
-
Centroid Decision Forest
Centroid Decision Forest arXiv:2503.19306v1 Announce Type: new Abstract: This paper introduces the centroid decision forest (CDF), a novel ensemble learning framework that redefines the splitting strategy and tree building in the ordinary decision trees for high-dimensional classification. The splitting approach in CDF differs from the traditional decision trees in that the class separability score (CSS)…
-
Universal Architectures for the Learning of Polyhedral Norms and Convex Regularization Functionals
Universal Architectures for the Learning of Polyhedral Norms and Convex Regularization Functionals arXiv:2503.19190v1 Announce Type: new Abstract: This paper addresses the task of learning convex regularizers to guide the reconstruction of images from limited data. By imposing that the reconstruction be amplitude-equivariant, we narrow down the class of admissible functionals to those that can be…
-
Causal Bayesian Optimization with Unknown Graphs
Causal Bayesian Optimization with Unknown Graphs arXiv:2503.19554v1 Announce Type: new Abstract: Causal Bayesian Optimization (CBO) is a methodology designed to optimize an outcome variable by leveraging known causal relationships through targeted interventions. Traditional CBO methods require a fully and accurately specified causal graph, which is a limitation in many real-world scenarios where such graphs are…
-
Communities in the Kuramoto Model: Dynamics and Detection via Path Signatures
Communities in the Kuramoto Model: Dynamics and Detection via Path Signatures arXiv:2503.17546v1 Announce Type: new Abstract: The behavior of multivariate dynamical processes is often governed by underlying structural connections that relate the components of the system. For example, brain activity which is often measured via time series is determined by an underlying structural graph, where…
-
A Statistical Theory of Contrastive Learning via Approximate Sufficient Statistics
A Statistical Theory of Contrastive Learning via Approximate Sufficient Statistics arXiv:2503.17538v1 Announce Type: new Abstract: Contrastive learning — a modern approach to extract useful representations from unlabeled data by training models to distinguish similar samples from dissimilar ones — has driven significant progress in foundation models. In this work, we develop a new theoretical framework…
-
Poisson-Process Topic Model for Integrating Knowledge from Pre-trained Language Models
Poisson-Process Topic Model for Integrating Knowledge from Pre-trained Language Models arXiv:2503.17809v1 Announce Type: new Abstract: Topic modeling is traditionally applied to word counts without accounting for the context in which words appear. Recent advancements in large language models (LLMs) offer contextualized word embeddings, which capture deeper meaning and relationships between words. We aim to leverage…
-
Understanding Inverse Reinforcement Learning under Overparameterization: Non-Asymptotic Analysis and Global Optimality
Understanding Inverse Reinforcement Learning under Overparameterization: Non-Asymptotic Analysis and Global Optimality arXiv:2503.17865v1 Announce Type: new Abstract: The goal of the Inverse reinforcement learning (IRL) task is to identify the underlying reward function and the corresponding optimal policy from a set of expert demonstrations. While most IRL algorithms’ theoretical guarantees rely on a linear reward structure,…
-
Quantile-Based Randomized Kaczmarz for Corrupted Tensor Linear Systems
Quantile-Based Randomized Kaczmarz for Corrupted Tensor Linear Systems arXiv:2503.18190v1 Announce Type: new Abstract: The reconstruction of tensor-valued signals from corrupted measurements, known as tensor regression, has become essential in many multi-modal applications such as hyperspectral image reconstruction and medical imaging. In this work, we address the tensor linear system problem $\mathcal{A} \mathcal{X}=\mathcal{B}$, where $\mathcal{A}$ is…
-
Procrustes Wasserstein Metric: A Modified Benamou-Brenier Approach with Applications to Latent Gaussian Distributions
Procrustes Wasserstein Metric: A Modified Benamou-Brenier Approach with Applications to Latent Gaussian Distributions arXiv:2503.16580v1 Announce Type: new Abstract: We introduce a modified Benamou-Brenier type approach leading to a Wasserstein type distance that allows global invariance, specifically, isometries, and we show that the problem can be summarized to orthogonal transformations. This distance is defined by penalizing…
-
EarlyStopping: Implicit Regularization for Iterative Learning Procedures in Python
EarlyStopping: Implicit Regularization for Iterative Learning Procedures in Python arXiv:2503.16753v1 Announce Type: new Abstract: Iterative learning procedures are ubiquitous in machine learning and modern statistics. Regularisation is typically required to prevent inflating the expected loss of a procedure in later iterations via the propagation of noise inherent in the data. Significant emphasis has been placed…
-
Optimal Nonlinear Online Learning under Sequential Price Competition via s-Concavity
Optimal Nonlinear Online Learning under Sequential Price Competition via s-Concavity arXiv:2503.16737v1 Announce Type: new Abstract: We consider price competition among multiple sellers over a selling horizon of $T$ periods. In each period, sellers simultaneously offer their prices and subsequently observe their respective demand that is unobservable to competitors. The demand function for each seller depends…
-
Online Selective Conformal Prediction: Errors and Solutions
Online Selective Conformal Prediction: Errors and Solutions arXiv:2503.16809v1 Announce Type: new Abstract: In online selective conformal inference, data arrives sequentially, and prediction intervals are constructed only when an online selection rule is met. Since online selections may break the exchangeability between the selected test datum and the rest of the data, one must correct for…
-
Sparse Additive Contextual Bandits: A Nonparametric Approach for Online Decision-making with High-dimensional Covariates
Sparse Additive Contextual Bandits: A Nonparametric Approach for Online Decision-making with High-dimensional Covariates arXiv:2503.16941v1 Announce Type: new Abstract: Personalized services are central to today’s digital landscape, where online decision-making is commonly formulated as contextual bandit problems. Two key challenges emerge in modern applications: high-dimensional covariates and the need for nonparametric models to capture complex reward-covariate…
-
Hierarchical clustering with maximum density paths and mixture models
Hierarchical clustering with maximum density paths and mixture models arXiv:2503.15582v1 Announce Type: new Abstract: Hierarchical clustering is an effective and interpretable technique for analyzing structure in data, offering a nuanced understanding by revealing insights at multiple scales and resolutions. It is particularly helpful in settings where the exact number of clusters is unknown, and provides…
-
Interpretable Neural Causal Models with TRAM-DAGs
Interpretable Neural Causal Models with TRAM-DAGs arXiv:2503.16206v1 Announce Type: new Abstract: The ultimate goal of most scientific studies is to understand the underlying causal mechanism between the involved variables. Structural causal models (SCMs) are widely used to represent such causal mechanisms. Given an SCM, causal queries on all three levels of Pearl’s causal hierarchy can…
-
Tuning Sequential Monte Carlo Samplers via Greedy Incremental Divergence Minimization
Tuning Sequential Monte Carlo Samplers via Greedy Incremental Divergence Minimization arXiv:2503.15704v1 Announce Type: new Abstract: The performance of sequential Monte Carlo (SMC) samplers heavily depends on the tuning of the Markov kernels used in the path proposal. For SMC samplers with unadjusted Markov kernels, standard tuning objectives, such as the Metropolis-Hastings acceptance rate or the…
-
Sparse Nonparametric Contextual Bandits
Sparse Nonparametric Contextual Bandits arXiv:2503.16382v1 Announce Type: new Abstract: This paper studies the problem of simultaneously learning relevant features and minimising regret in contextual bandit problems. We introduce and analyse a new class of contextual bandit problems, called sparse nonparametric contextual bandits, in which the expected reward function lies in the linear span of a…
-
Data-Driven Approximation of Binary-State Network Reliability Function: Algorithm Selection and Reliability Thresholds for Large-Scale Systems
Data-Driven Approximation of Binary-State Network Reliability Function: Algorithm Selection and Reliability Thresholds for Large-Scale Systems arXiv:2503.15545v1 Announce Type: cross Abstract: Network reliability assessment is pivotal for ensuring the robustness of modern infrastructure systems, from power grids to communication networks. While exact reliability computation for binary-state networks is NP-hard, existing approximation methods face critical tradeoffs between…
-
Variational Autoencoded Multivariate Spatial Fay-Herriot Models
Variational Autoencoded Multivariate Spatial Fay-Herriot Models arXiv:2503.14710v1 Announce Type: new Abstract: Small area estimation models are essential for estimating population characteristics in regions with limited sample sizes, thereby supporting policy decisions, demographic studies, and resource allocation, among other use cases. The spatial Fay-Herriot model is one such approach that incorporates spatial dependence to improve estimation…
-
The Hardness of Validating Observational Studies with Experimental Data
The Hardness of Validating Observational Studies with Experimental Data arXiv:2503.14795v1 Announce Type: new Abstract: Observational data is often readily available in large quantities, but can lead to biased causal effect estimates due to the presence of unobserved confounding. Recent works attempt to remove this bias by supplementing observational data with experimental data, which, when available,…
-
Interpretability of Graph Neural Networks to Assert Effects of Global Change Drivers on Ecological Networks
Interpretability of Graph Neural Networks to Assert Effects of Global Change Drivers on Ecological Networks arXiv:2503.15107v1 Announce Type: new Abstract: Pollinators play a crucial role for plant reproduction, either in natural ecosystem or in human-modified landscape. Global change drivers, including climate change or land use modifications, can alter the plant-pollinator interactions. To assert the potential influence…
-
Online federated learning framework for classification
Online federated learning framework for classification arXiv:2503.15210v1 Announce Type: new Abstract: In this paper, we develop a novel online federated learning framework for classification, designed to handle streaming data from multiple clients while ensuring data privacy and computational efficiency. Our method leverages the generalized distance-weighted discriminant technique, making it robust to both homogeneous and heterogeneous…
-
Positivity sets of hinge functions
Positivity sets of hinge functions arXiv:2503.13512v1 Announce Type: new Abstract: In this paper we investigate which subsets of the real plane are realisable as the set of points on which a one-layer ReLU neural network takes a positive value. In the case of cones we give a full characterisation of such sets. Furthermore, we give…
-
Micro Text Classification Based on Balanced Positive-Unlabeled Learning
Micro Text Classification Based on Balanced Positive-Unlabeled Learning arXiv:2503.13562v1 Announce Type: new Abstract: In real-world text classification tasks, negative texts often contain a minimal proportion of negative content, which is especially problematic in areas like text quality control, legal risk screening, and sensitive information interception. This challenge manifests at two levels: at the macro level,…
-
Bayesian Kernel Regression for Functional Data
Bayesian Kernel Regression for Functional Data arXiv:2503.13676v1 Announce Type: new Abstract: In supervised learning, the output variable to be predicted is often represented as a function, such as a spectrum or probability distribution. Despite its importance, functional output regression remains relatively unexplored. In this study, we propose a novel functional output regression model based on…
-
ROCK: A variational formulation for occupation kernel methods in Reproducing Kernel Hilbert Spaces
ROCK: A variational formulation for occupation kernel methods in Reproducing Kernel Hilbert Spaces arXiv:2503.13791v1 Announce Type: new Abstract: We present a Representer Theorem result for a large class of weak formulation problems. We provide examples of applications of our formulation both in traditional machine learning and numerical methods as well as in new and emerging…
-
Ranking and Selection with Simultaneous Input Data Collection
Ranking and Selection with Simultaneous Input Data Collection arXiv:2503.11773v1 Announce Type: new Abstract: In this paper, we propose a general and novel formulation of ranking and selection with the existence of streaming input data. The collection of multiple streams of such data may consume different types of resources, and hence can be conducted simultaneously. To…
-
Bayes and Biased Estimators Without Hyper-parameter Estimation: Comparable Performance to the Empirical-Bayes-Based Regularized Estimator
Bayes and Biased Estimators Without Hyper-parameter Estimation: Comparable Performance to the Empirical-Bayes-Based Regularized Estimator arXiv:2503.11854v1 Announce Type: new Abstract: Regularized system identification has become a significant complement to more classical system identification. It has been numerically shown that kernel-based regularized estimators often perform better than the maximum likelihood estimator in terms of minimizing mean squared…
-
Support Collapse of Deep Gaussian Processes with Polynomial Kernels for a Wide Regime of Hyperparameters
Support Collapse of Deep Gaussian Processes with Polynomial Kernels for a Wide Regime of Hyperparameters arXiv:2503.12266v1 Announce Type: new Abstract: We analyze the prior that a Deep Gaussian Process with polynomial kernels induces. We observe that, even for relatively small depths, averaging effects occur within such a Deep Gaussian Process and that the prior can…
-
SNPL: Simultaneous Policy Learning and Evaluation for Safe Multi-Objective Policy Improvement
SNPL: Simultaneous Policy Learning and Evaluation for Safe Multi-Objective Policy Improvement arXiv:2503.12760v1 Announce Type: new Abstract: To design effective digital interventions, experimenters face the challenge of learning decision policies that balance multiple objectives using offline data. Often, they aim to develop policies that maximize goal outcomes, while ensuring there are no undesirable changes in guardrail…
-
Nonlinear Principal Component Analysis with Random Bernoulli Features for Process Monitoring
Nonlinear Principal Component Analysis with Random Bernoulli Features for Process Monitoring arXiv:2503.12456v1 Announce Type: new Abstract: The process generates substantial amounts of data with highly complex structures, leading to the development of numerous nonlinear statistical methods. However, most of these methods rely on computations involving large-scale dense kernel matrices. This dependence poses significant challenges in…
-
Learn then Decide: A Learning Approach for Designing Data Marketplaces
Learn then Decide: A Learning Approach for Designing Data Marketplaces arXiv:2503.10773v1 Announce Type: new Abstract: As data marketplaces become increasingly central to the digital economy, it is crucial to design efficient pricing mechanisms that optimize revenue while ensuring fair and adaptive pricing. We introduce the Maximum Auction-to-Posted Price (MAPP) mechanism, a novel two-stage approach that…
-
Exploiting Concavity Information in Gaussian Process Contextual Bandit Optimization
Exploiting Concavity Information in Gaussian Process Contextual Bandit Optimization arXiv:2503.10836v1 Announce Type: new Abstract: The contextual bandit framework is widely used to solve sequential optimization problems where the reward of each decision depends on auxiliary context variables. In settings such as medicine, business, and engineering, the decision maker often possesses additional structural information on the…
-
On the Identifiability of Causal Abstractions
On the Identifiability of Causal Abstractions arXiv:2503.10834v1 Announce Type: new Abstract: Causal representation learning (CRL) enhances machine learning models’ robustness and generalizability by learning structural causal models associated with data-generating processes. We focus on a family of CRL methods that uses contrastive data pairs in the observable space, generated before and after a random, unknown…
-
Mamba time series forecasting with uncertainty propagation
Mamba time series forecasting with uncertainty propagation arXiv:2503.10873v1 Announce Type: new Abstract: State space models, such as Mamba, have recently garnered attention in time series forecasting due to their ability to capture sequence patterns. However, in electricity consumption benchmarks, Mamba forecasts exhibit a mean error of approximately 8%. Similarly, in traffic occupancy benchmarks, the mean…
-
Clustering Items through Bandit Feedback: Finding the Right Feature out of Many
Clustering Items through Bandit Feedback: Finding the Right Feature out of Many arXiv:2503.11209v1 Announce Type: new Abstract: We study the problem of clustering a set of items based on bandit feedback. Each of the $n$ items is characterized by a feature vector, with a possibly large dimension $d$. The items are partitioned into two unknown…
-
Power Spectrum Signatures of Graphs
Power Spectrum Signatures of Graphs arXiv:2503.09660v1 Announce Type: new Abstract: Point signatures based on the Laplacian operators on graphs, point clouds, and manifolds have become popular tools in machine learning for graphs, clustering, and shape analysis. In this work, we propose a novel point signature, the power spectrum signature, a measure on $\mathbb{R}$ defined as…
-
Explainable Bayesian deep learning through input-skip Latent Binary Bayesian Neural Networks
Explainable Bayesian deep learning through input-skip Latent Binary Bayesian Neural Networks arXiv:2503.10496v1 Announce Type: new Abstract: Modeling natural phenomena with artificial neural networks (ANNs) often provides highly accurate predictions. However, ANNs often suffer from over-parameterization, complicating interpretation and raising uncertainty issues. Bayesian neural networks (BNNs) address the latter by representing weights as probability distributions, allowing…
-
Sample and Map from a Single Convex Potential: Generation using Conjugate Moment Measures
Sample and Map from a Single Convex Potential: Generation using Conjugate Moment Measures arXiv:2503.10576v1 Announce Type: new Abstract: A common approach to generative modeling is to split model-fitting into two blocks: define first how to sample noise (e.g. Gaussian) and choose next what to do with it (e.g. using a single map or flows). We…
-
Technical Insights and Legal Considerations for Advancing Federated Learning in Bioinformatics
Technical Insights and Legal Considerations for Advancing Federated Learning in Bioinformatics arXiv:2503.09649v1 Announce Type: cross Abstract: Federated learning leverages data across institutions to improve clinical discovery while complying with data-sharing restrictions and protecting patient privacy. As the evolution of biobanks in genetics and systems biology has proved, accessing more extensive and varied data pools leads…
-
Bags of Projected Nearest Neighbours: Competitors to Random Forests?
Bags of Projected Nearest Neighbours: Competitors to Random Forests? arXiv:2503.09651v1 Announce Type: cross Abstract: In this paper we introduce a simple and intuitive adaptive k nearest neighbours classifier, and explore its utility within the context of bootstrap aggregating (“bagging”). The approach is based on finding discriminant subspaces which are computationally efficient to compute, and are…
-
Learning Pareto manifolds in high dimensions: How can regularization help?
Learning Pareto manifolds in high dimensions: How can regularization help? arXiv:2503.08849v1 Announce Type: new Abstract: Simultaneously addressing multiple objectives is becoming increasingly important in modern machine learning. At the same time, data is often high-dimensional and costly to label. For a single objective such as prediction risk, conventional regularization techniques are known to improve generalization…
-
A Deep Bayesian Nonparametric Framework for Robust Mutual Information Estimation
A Deep Bayesian Nonparametric Framework for Robust Mutual Information Estimation arXiv:2503.08902v1 Announce Type: new Abstract: Mutual Information (MI) is a crucial measure for capturing dependencies between variables, but exact computation is challenging in high dimensions with intractable likelihoods, impacting accuracy and robustness. One idea is to use an auxiliary neural network to train an MI…
-
Risk-sensitive Bandits: Arm Mixture Optimality and Regret-efficient Algorithms
Risk-sensitive Bandits: Arm Mixture Optimality and Regret-efficient Algorithms arXiv:2503.08896v1 Announce Type: new Abstract: This paper introduces a general framework for risk-sensitive bandits that integrates the notions of risk-sensitive objectives by adopting a rich class of distortion riskmetrics. The introduced framework subsumes the various existing risk-sensitive models. An important and hitherto unknown observation is that for…
-
Self-Consistent Equation-guided Neural Networks for Censored Time-to-Event Data
Self-Consistent Equation-guided Neural Networks for Censored Time-to-Event Data arXiv:2503.09097v1 Announce Type: new Abstract: In survival analysis, estimating the conditional survival function given predictors is often of interest. There is a growing trend in the development of deep learning methods for analyzing censored time-to-event data, especially when dealing with high-dimensional predictors that are complexly interrelated. Many…
-
Addressing pitfalls in implicit unobserved confounding synthesis using explicit block hierarchical ancestral sampling
Addressing pitfalls in implicit unobserved confounding synthesis using explicit block hierarchical ancestral sampling arXiv:2503.09194v1 Announce Type: new Abstract: Unbiased data synthesis is crucial for evaluating causal discovery algorithms in the presence of unobserved confounding, given the scarcity of real-world datasets. A common approach, implicit parameterization, encodes unobserved confounding by modifying the off-diagonal entries of the…
-
Probabilistic Shielding for Safe Reinforcement Learning
Probabilistic Shielding for Safe Reinforcement Learning arXiv:2503.07671v1 Announce Type: new Abstract: In real-life scenarios, a Reinforcement Learning (RL) agent aiming to maximise their reward must often also behave in a safe manner, including at training time. Thus, much attention in recent years has been given to Safe RL, where an agent aims to learn an…
-
Personalized Convolutional Dictionary Learning of Physiological Time Series
Personalized Convolutional Dictionary Learning of Physiological Time Series arXiv:2503.07687v1 Announce Type: new Abstract: Human physiological signals tend to exhibit both global and local structures: the former are shared across a population, while the latter reflect inter-individual variability. For instance, kinetic measurements of the gait cycle during locomotion present common characteristics, although idiosyncrasies may be observed…
-
Uncertainty quantification and posterior sampling for network reconstruction
Uncertainty quantification and posterior sampling for network reconstruction arXiv:2503.07736v1 Announce Type: new Abstract: Network reconstruction is the task of inferring the unseen interactions between elements of a system, based only on their behavior or dynamics. This inverse problem is in general ill-posed, and admits many solutions for the same observation. Nevertheless, the vast majority of…
-
Cost-Aware Optimal Pairwise Pure Exploration
Cost-Aware Optimal Pairwise Pure Exploration arXiv:2503.07877v1 Announce Type: new Abstract: Pure exploration is one of the fundamental problems in multi-armed bandits (MAB). However, existing works mostly focus on specific pure exploration tasks, without a holistic view of the general pure exploration problem. This work fills this gap by introducing a versatile framework to study pure…
-
Pure Exploration with Feedback Graphs
Pure Exploration with Feedback Graphs arXiv:2503.07824v1 Announce Type: new Abstract: We study the sample complexity of pure exploration in an online learning problem with a feedback graph. This graph dictates the feedback available to the learner, covering scenarios between full-information, pure bandit feedback, and settings with no feedback on the chosen action. While variants of…