Category: cs.AI

Towards Interpretable Deep Generative Models via Causal Representation Learning

arXiv:2504.11609v1 Announce Type: new Abstract: Recent developments in generative artificial intelligence (AI) rely on machine learning techniques such as deep learning and generative modeling to achieve state-of-the-art performance across wide-ranging domains. These methods’ surprising performance is due in part to their ability to learn implicit “representations”…

April 17, 2025
AB-Cache: Training-Free Acceleration of Diffusion Models via Adams-Bashforth Cached Feature Reuse

arXiv:2504.10540v1 Announce Type: new Abstract: Diffusion models have demonstrated remarkable success in generative tasks, yet their iterative denoising process results in slow inference, limiting their practicality. While existing acceleration methods exploit the well-known U-shaped similarity pattern between adjacent steps through caching mechanisms, they lack…

April 16, 2025
Energy Matching: Unifying Flow Matching and Energy-Based Models for Generative Modeling

arXiv:2504.10612v1 Announce Type: cross Abstract: Generative models often map noise to data by matching flows or scores, but these approaches become cumbersome for incorporating partial observations or additional priors. Inspired by recent advances in Wasserstein gradient flows, we propose Energy Matching, a framework that…

April 16, 2025
StealthRank: LLM Ranking Manipulation via Stealthy Prompt Optimization

arXiv:2504.05804v1 Announce Type: cross Abstract: The integration of large language models (LLMs) into information retrieval systems introduces new attack surfaces, particularly for adversarial ranking manipulations. We present StealthRank, a novel adversarial ranking attack that manipulates LLM-driven product recommendation systems while maintaining textual fluency and stealth. Unlike existing…

April 10, 2025
Hyperflows: Pruning Reveals the Importance of Weights

arXiv:2504.05349v1 Announce Type: new Abstract: Network pruning is used to reduce inference latency and power consumption in large neural networks. However, most existing methods struggle to accurately assess the importance of individual weights due to their inherent interrelatedness, leading to poor performance, especially at extreme sparsity levels. We…

April 9, 2025
Robust Reinforcement Learning from Human Feedback for Large Language Models Fine-Tuning

arXiv:2504.03784v1 Announce Type: new Abstract: Reinforcement learning from human feedback (RLHF) has emerged as a key technique for aligning the output of large language models (LLMs) with human preferences. To learn the reward function, most existing RLHF algorithms use the Bradley-Terry model, which relies…

April 8, 2025
On Model Protection in Federated Learning against Eavesdropping Attacks

arXiv:2504.02114v1 Announce Type: cross Abstract: In this study, we investigate the protection offered by federated learning algorithms against eavesdropping adversaries. In our model, the adversary is capable of intercepting model updates transmitted from clients to the server, enabling it to create its own estimate of the…

April 4, 2025
Towards Interpretable Soft Prompts

arXiv:2504.02144v1 Announce Type: cross Abstract: Soft prompts have been popularized as a cheap and easy way to improve task-specific LLM performance beyond few-shot prompts. Despite their origin as an automated prompting method, however, soft prompts and other trainable prompts remain a black-box method with no immediately interpretable connections to prompting. We…

April 4, 2025
Backdoor Detection through Replicated Execution of Outsourced Training

arXiv:2504.00170v1 Announce Type: cross Abstract: It is common practice to outsource the training of machine learning models to cloud providers. Clients who do so gain from the cloud’s economies of scale, but implicitly assume trust: the server should not deviate from the client’s training procedure. A malicious…

April 2, 2025
Is Best-of-N the Best of Them? Coverage, Scaling, and Optimality in Inference-Time Alignment

arXiv:2503.21878v1 Announce Type: cross Abstract: Inference-time computation provides an important axis for scaling language model performance, but naively scaling compute through techniques like Best-of-$N$ sampling can cause performance to degrade due to reward hacking. Toward a theoretical understanding of how to best…

March 31, 2025
Fundamental Safety-Capability Trade-offs in Fine-tuning Large Language Models

arXiv:2503.20807v1 Announce Type: new Abstract: Fine-tuning Large Language Models (LLMs) on some task-specific datasets has been a primary use of LLMs. However, it has been empirically observed that this approach to enhancing capability inevitably compromises safety, a phenomenon also known as the safety-capability trade-off in LLM fine-tuning.…

March 28, 2025
CAE: Repurposing the Critic as an Explorer in Deep Reinforcement Learning

arXiv:2503.18980v1 Announce Type: new Abstract: Exploration remains a critical challenge in reinforcement learning, as many existing methods either lack theoretical guarantees or fall short of practical effectiveness. In this paper, we introduce CAE, a lightweight algorithm that repurposes the value networks in standard deep…

March 26, 2025
Minimum Volume Conformal Sets for Multivariate Regression

arXiv:2503.19068v1 Announce Type: new Abstract: Conformal prediction provides a principled framework for constructing predictive sets with finite-sample validity. While much of the focus has been on univariate response variables, existing multivariate methods either impose rigid geometric assumptions or rely on flexible but computationally expensive approaches that do not…

March 26, 2025
Micro Text Classification Based on Balanced Positive-Unlabeled Learning

arXiv:2503.13562v1 Announce Type: new Abstract: In real-world text classification tasks, negative texts often contain a minimal proportion of negative content, which is especially problematic in areas like text quality control, legal risk screening, and sensitive information interception. This challenge manifests at two levels: at the macro level,…

March 19, 2025
Optimizing ML Training with Metagradient Descent

arXiv:2503.13751v1 Announce Type: new Abstract: A major challenge in training large-scale machine learning models is configuring the training process to maximize model performance, i.e., finding the best training setup from a vast design space. In this work, we unlock a gradient-based approach to this problem. We first introduce an…

March 19, 2025
Explainable Bayesian deep learning through input-skip Latent Binary Bayesian Neural Networks

arXiv:2503.10496v1 Announce Type: new Abstract: Modeling natural phenomena with artificial neural networks (ANNs) often provides highly accurate predictions. However, ANNs often suffer from over-parameterization, complicating interpretation and raising uncertainty issues. Bayesian neural networks (BNNs) address the latter by representing weights as probability distributions, allowing…

March 14, 2025
Probabilistic Shielding for Safe Reinforcement Learning

arXiv:2503.07671v1 Announce Type: new Abstract: In real-life scenarios, a Reinforcement Learning (RL) agent aiming to maximise its reward must often also behave in a safe manner, including at training time. Thus, much attention in recent years has been given to Safe RL, where an agent aims to learn an…

March 12, 2025
Exploring specialization and sensitivity of convolutional neural networks in the context of simultaneous image augmentations

arXiv:2503.03283v1 Announce Type: new Abstract: Drawing parallels with the way biological networks are studied, we adapt the treatment–control paradigm to explainable artificial intelligence research and enrich it through multi-parametric input alterations. In this study, we propose a framework for investigating…

March 6, 2025
Mathematical Foundation of Interpretable Equivariant Surrogate Models

arXiv:2503.01942v1 Announce Type: new Abstract: This paper introduces a rigorous mathematical framework for neural network explainability, and more broadly for the explainability of equivariant operators called Group Equivariant Operators (GEOs) based on Group Equivariant Non-Expansive Operators (GENEOs) transformations. The central concept involves quantifying the distance between GEOs by…

March 5, 2025
LNUCB-TA: Linear-nonlinear Hybrid Bandit Learning with Temporal Attention

arXiv:2503.00387v1 Announce Type: new Abstract: Existing contextual multi-armed bandit (MAB) algorithms fail to effectively capture both long-term trends and local patterns across all arms, leading to suboptimal performance in environments with rapidly changing reward structures. They also rely on static exploration rates, which do not dynamically adjust…

March 4, 2025
Efficient Risk-sensitive Planning via Entropic Risk Measures

arXiv:2502.20423v1 Announce Type: new Abstract: Risk-sensitive planning aims to identify policies maximizing some tail-focused metrics in Markov Decision Processes (MDPs). Such an optimization task can be very costly for the most widely used and interpretable metrics such as threshold probabilities or (Conditional) Values at Risk. Indeed, previous work…

March 3, 2025
Practical Evaluation of Copula-based Survival Metrics: Beyond the Independent Censoring Assumption

arXiv:2502.19460v1 Announce Type: new Abstract: Conventional survival metrics, such as Harrell’s concordance index and the Brier Score, rely on the independent censoring assumption for valid inference in the presence of right-censored data. However, when instances are censored for reasons related to the event of…

February 28, 2025
Applications of Statistical Field Theory in Deep Learning

arXiv:2502.18553v1 Announce Type: new Abstract: Deep learning algorithms have made incredible strides in the past decade, yet due to the complexity of these algorithms, the science of deep learning remains in its early stages. Being an experimentally driven field, it is natural to seek a theory of…

February 27, 2025
An Overview of Large Language Models for Statisticians

arXiv:2502.17814v1 Announce Type: new Abstract: Large Language Models (LLMs) have emerged as transformative tools in artificial intelligence (AI), exhibiting remarkable capabilities across diverse tasks such as text generation, reasoning, and decision-making. While their success has primarily been driven by advances in computational power and deep learning architectures,…

February 26, 2025
Towards a perturbation-based explanation for medical AI as differentiable programs

arXiv:2502.14001v1 Announce Type: new Abstract: Recent advances in machine learning algorithms have reached a point where medical devices can be equipped with artificial intelligence (AI) models for diagnostic support and routine automation in clinical settings. In medicine and healthcare, there is a particular demand for sufficient…

February 21, 2025
Multi-Objective Bayesian Optimization for Networked Black-Box Systems: A Path to Greener Profits and Smarter Designs

arXiv:2502.14121v1 Announce Type: new Abstract: Designing modern industrial systems requires balancing several competing objectives, such as profitability, resilience, and sustainability, while accounting for complex interactions between technological, economic, and environmental factors. Multi-objective optimization (MOO) methods are commonly used to navigate…

February 21, 2025
Suboptimal Shapley Value Explanations

arXiv:2502.12209v1 Announce Type: new Abstract: Deep Neural Networks (DNNs) have demonstrated strong capacity in supporting a wide variety of applications. Shapley value has emerged as a prominent tool to analyze feature importance to help people understand the inference process of deep neural models. Computing Shapley value function requires choosing a baseline…

February 19, 2025
The Majority Vote Paradigm Shift: When Popular Meets Optimal

arXiv:2502.12581v1 Announce Type: new Abstract: Reliably labelling data typically requires annotations from multiple human workers. However, humans are far from being perfect. Hence, it is a common practice to aggregate labels gathered from multiple annotators to make a more confident estimate of the true label. Among…

February 19, 2025
Forecasting time series with constraints

arXiv:2502.10485v1 Announce Type: new Abstract: Time series forecasting presents unique challenges that limit the effectiveness of traditional machine learning algorithms. To address these limitations, various approaches have incorporated linear constraints into learning algorithms, such as generalized additive models and hierarchical forecasting. In this paper, we propose a unified framework for…

February 18, 2025
Dynamic Influence Tracker: Measuring Time-Varying Sample Influence During Training

arXiv:2502.10793v1 Announce Type: new Abstract: Existing methods for measuring training sample influence on models only provide static, overall measurements, overlooking how sample influence changes during training. We propose Dynamic Influence Tracker (DIT), which captures the time-varying sample influence across arbitrary time windows during training. DIT offers…

February 18, 2025
SNAP: Sequential Non-Ancestor Pruning for Targeted Causal Effect Estimation With an Unknown Graph

arXiv:2502.07857v1 Announce Type: new Abstract: Causal discovery can be computationally demanding for large numbers of variables. If we only wish to estimate the causal effects on a small subset of target variables, we might not need to learn the causal graph for…

February 13, 2025
Generative Distribution Prediction: A Unified Approach to Multimodal Learning

arXiv:2502.07090v1 Announce Type: new Abstract: Accurate prediction with multimodal data, encompassing tabular, textual, and visual inputs or outputs, is fundamental to advancing analytics in diverse application domains. Traditional approaches often struggle to integrate heterogeneous data types while maintaining high predictive accuracy. We introduce Generative Distribution Prediction (GDP), a…

February 12, 2025
On the Convergence and Stability of Upside-Down Reinforcement Learning, Goal-Conditioned Supervised Learning, and Online Decision Transformers

arXiv:2502.05672v1 Announce Type: new Abstract: This article provides a rigorous analysis of convergence and stability of Episodic Upside-Down Reinforcement Learning, Goal-Conditioned Supervised Learning and Online Decision Transformers. These algorithms performed competitively across various benchmarks, from games to robotic tasks,…

February 11, 2025
Two in context learning tasks with complex functions

arXiv:2502.03503v1 Announce Type: new Abstract: We examine two in context learning (ICL) tasks with mathematical functions in several train and test settings for transformer models. Our study generalizes work on linear functions by showing that small transformers, even models with attention layers only, can approximate arbitrary polynomial…

February 7, 2025
Doubly Robust Monte Carlo Tree Search

arXiv:2502.01672v1 Announce Type: new Abstract: We present Doubly Robust Monte Carlo Tree Search (DR-MCTS), a novel algorithm that integrates Doubly Robust (DR) off-policy estimation into Monte Carlo Tree Search (MCTS) to enhance sample efficiency and decision quality in complex environments. Our approach introduces a hybrid estimator that combines MCTS…

February 5, 2025
Theoretical and Practical Analysis of Fréchet Regression via Comparison Geometry

arXiv:2502.01995v1 Announce Type: new Abstract: Fréchet regression extends classical regression methods to non-Euclidean metric spaces, enabling the analysis of data relationships on complex structures such as manifolds and graphs. This work establishes a rigorous theoretical analysis for Fréchet regression through the lens of comparison geometry…

February 5, 2025
Learning to Fuse Temporal Proximity Networks: A Case Study in Chimpanzee Social Interactions

arXiv:2502.00302v1 Announce Type: new Abstract: How can we identify groups of primate individuals which could be conjectured to drive social structure? To address this question, one of us has collected a time series of data for social interactions between chimpanzees. Here we…

February 4, 2025
Knoop: Practical Enhancement of Knockoff with Over-Parameterization for Variable Selection

arXiv:2501.17889v1 Announce Type: new Abstract: Variable selection plays a crucial role in enhancing modeling effectiveness across diverse fields, addressing the challenges posed by high-dimensional datasets of correlated variables. This work introduces a novel approach, namely Knockoff with over-parameterization (Knoop), to enhance Knockoff filters for variable…

January 31, 2025
Exact characterization of $\epsilon$-Safe Decision Regions for exponential family distributions and Multi Cost SVM approximation

arXiv:2501.17731v1 Announce Type: new Abstract: Probabilistic guarantees on the prediction of data-driven classifiers are necessary to define models that can be considered reliable. This is a key requirement for modern machine learning in which the goodness of a system is…

January 30, 2025
ED-Filter: Dynamic Feature Filtering for Eating Disorder Classification

arXiv:2501.14785v1 Announce Type: new Abstract: Eating disorders (ED) are critical psychiatric problems that have alarmed the mental health community. Mental health professionals are increasingly recognizing the utility of data derived from social media platforms such as Twitter. However, high dimensionality and extensive feature sets of Twitter data…

January 28, 2025
Explaining Categorical Feature Interactions Using Graph Covariance and LLMs

arXiv:2501.14932v1 Announce Type: new Abstract: Modern datasets often consist of numerous samples with abundant features and associated timestamps. Analyzing such datasets to uncover underlying events typically requires complex statistical methods and substantial domain expertise. A notable example, and the primary data focus of this paper, is…

January 28, 2025
Causal vs. Anticausal merging of predictors

arXiv:2501.08426v1 Announce Type: cross Abstract: We study the differences arising from merging predictors in the causal and anticausal directions using the same data. In particular we study the asymmetries that arise in a simple model where we merge the predictors using one binary variable as target and two continuous…

January 16, 2025
On the Statistical Capacity of Deep Generative Models

arXiv:2501.07763v1 Announce Type: new Abstract: Deep generative models are routinely used in generating samples from complex, high-dimensional distributions. Despite their apparent successes, their statistical properties are not well understood. A common assumption is that with enough training data and sufficiently large neural networks, deep generative model samples…

January 15, 2025
Circuit Complexity Bounds for Visual Autoregressive Model

arXiv:2501.04299v1 Announce Type: new Abstract: Understanding the expressive ability of a specific model is essential for grasping its capacity limitations. Recently, several studies have established circuit complexity bounds for the Transformer architecture. In addition, the Visual AutoRegressive (VAR) model has risen to be a prominent method in the field of…

January 9, 2025
Who Wrote This? Zero-Shot Statistical Tests for LLM-Generated Text Detection using Finite Sample Concentration Inequalities

arXiv:2501.02406v1 Announce Type: new Abstract: Verifying the provenance of content is crucial to the function of many organizations, e.g., educational institutions, social media platforms, firms, etc. This problem is becoming increasingly difficult as text generated by Large Language Models (LLMs)…

January 7, 2025
Efficient Human-in-the-Loop Active Learning: A Novel Framework for Data Labeling in AI Systems

arXiv:2501.00277v1 Announce Type: new Abstract: Modern AI algorithms require labeled data. In the real world, the majority of data are unlabeled. Labeling the data is costly. This is particularly true for some areas requiring special skills, such as reading radiology images by physicians. To…

January 3, 2025
Fréchet regression for multi-label feature selection with implicit regularization

arXiv:2412.18247v1 Announce Type: new Abstract: Fréchet regression extends linear regression to model complex responses in metric spaces, making it particularly relevant for multi-label regression, where each instance can have multiple associated labels. However, variable selection within this framework remains underexplored. In this paper, we propose…

December 25, 2024
A Statistical Framework for Ranking LLM-Based Chatbots

arXiv:2412.18407v1 Announce Type: new Abstract: Large language models (LLMs) have transformed natural language processing, with frameworks like Chatbot Arena providing pioneering platforms for evaluating these models. By facilitating millions of pairwise comparisons based on human judgments, Chatbot Arena has become a cornerstone in LLM evaluation, offering rich datasets…

December 25, 2024
Sequential Controlled Langevin Diffusions

arXiv:2412.07081v1 Announce Type: new Abstract: An effective approach for sampling from unnormalized densities is based on the idea of gradually transporting samples from an easy prior to the complicated target distribution. Two popular methods are (1) Sequential Monte Carlo (SMC), where the transport is performed through successive annealed densities via prescribed…

December 11, 2024
Training-Free Bayesianization for Low-Rank Adapters of Large Language Models

arXiv:2412.05723v1 Announce Type: new Abstract: Estimating the uncertainty of responses of Large Language Models (LLMs) remains a critical challenge. While recent Bayesian methods have demonstrated effectiveness in quantifying uncertainty through low-rank weight updates, they typically require complex fine-tuning or post-training procedures. In this paper, we propose Training-Free…

December 10, 2024
Disentangled Representation Learning for Causal Inference with Instruments

arXiv:2412.04641v1 Announce Type: cross Abstract: Latent confounders are a fundamental challenge for inferring causal effects from observational data. The instrumental variable (IV) approach is a practical way to address this challenge. Existing IV based estimators need a known IV or other strong assumptions, such as the existence…

December 9, 2024
Selective Reviews of Bandit Problems in AI via a Statistical View

arXiv:2412.02251v1 Announce Type: new Abstract: Reinforcement Learning (RL) is a widely researched area in artificial intelligence that focuses on teaching agents decision-making through interactions with their environment. A key subset includes stochastic multi-armed bandit (MAB) and continuum-armed bandit (SCAB) problems, which model sequential decision-making…

December 4, 2024
Deep Matrix Factorization with Adaptive Weights for Multi-View Clustering

arXiv:2412.02292v1 Announce Type: new Abstract: Recently, deep matrix factorization has been established as a powerful model for unsupervised tasks, achieving promising results, especially for multi-view clustering. However, existing methods often lack effective feature selection mechanisms and rely on empirical hyperparameter selection. To address these issues, we…

December 4, 2024
Composition of Experts: A Modular Compound AI System Leveraging Large Language Models

arXiv:2412.01868v1 Announce Type: cross Abstract: Large Language Models (LLMs) have achieved remarkable advancements, but their monolithic nature presents challenges in terms of scalability, cost, and customization. This paper introduces the Composition of Experts (CoE), a modular compound AI system leveraging multiple expert LLMs.…

December 4, 2024
Explicit and data-Efficient Encoding via Gradient Flow

arXiv:2412.00864v1 Announce Type: new Abstract: The autoencoder model typically uses an encoder to map data to a lower dimensional latent space and a decoder to reconstruct it. However, relying on an encoder for inversion can lead to suboptimal representations, particularly limiting in physical sciences where precision is key.…

December 3, 2024
The Return of Pseudosciences in Artificial Intelligence: Have Machine Learning and Deep Learning Forgotten Lessons from Statistics and History?

arXiv:2411.18656v1 Announce Type: new Abstract: In today’s world, AI programs powered by Machine Learning are ubiquitous, and have achieved seemingly exceptional performance across a broad range of tasks, from medical diagnosis and credit rating in banking,…

December 2, 2024
Contrastive representations of high-dimensional, structured treatments

arXiv:2411.19245v1 Announce Type: new Abstract: Estimating causal effects is vital for decision making. In standard causal effect estimation, treatments are usually binary- or continuous-valued. However, in many important real-world settings, treatments can be structured, high-dimensional objects, such as text, video, or audio. This provides a challenge to traditional causal…

December 2, 2024
Isometry pursuit

arXiv:2411.18502v1 Announce Type: new Abstract: Isometry pursuit is a convex algorithm for identifying orthonormal column-submatrices of wide matrices. It consists of a novel normalization method followed by multitask basis pursuit. Applied to Jacobians of putative coordinate functions, it helps identify isometric embeddings from within interpretable dictionaries. We provide theoretical and experimental results justifying…

November 28, 2024
Functional relevance based on the continuous Shapley value

arXiv:2411.18575v1 Announce Type: new Abstract: The presence of Artificial Intelligence (AI) in our society is increasing, which brings with it the need to understand the behaviour of AI mechanisms, including machine learning predictive algorithms fed with tabular data, text, or images, among other types of data. This…

November 28, 2024