Category: aimlds
-
Density estimation via mixture discrepancy and moments
Density estimation via mixture discrepancy and moments arXiv:2504.01570v1 Announce Type: new Abstract: With the aim of generalizing histogram statistics to higher dimensional cases, density estimation via discrepancy based sequential partition (DSP) has been proposed [D. Li, K. Yang, W. Wong, Advances in Neural Information Processing Systems (2016) 1099-1107] to learn an adaptive piecewise constant approximation…
-
Denoising guarantees for optimized sampling schemes in compressed sensing
Denoising guarantees for optimized sampling schemes in compressed sensing arXiv:2504.01046v1 Announce Type: new Abstract: Compressed sensing with subsampled unitary matrices benefits from optimized sampling schemes, which feature improved theoretical guarantees and empirical performance relative to uniform subsampling. We provide, in a first of its kind in compressed sensing, theoretical guarantees showing that the error caused…
-
Sparse Gaussian Neural Processes
Sparse Gaussian Neural Processes arXiv:2504.01650v1 Announce Type: new Abstract: Despite significant recent advances in probabilistic meta-learning, it is common for practitioners to avoid using deep learning models due to a comparative lack of interpretability. Instead, many practitioners simply use non-meta-models such as Gaussian processes with interpretable priors, and conduct the tedious procedure of training their…
-
Agentic GraphRAG for Commercial Contracts
Agentic GraphRAG for Commercial Contracts In every business, legal contracts are foundational documents that define the relationships, obligations, and responsibilities between parties. Whether it’s a partnership agreement, an NDA, or a supplier contract, these documents often contain critical information that drives decision-making, risk management, and compliance. However, navigating and extracting insights from these contracts can…
-
The Art of Noise
The Art of Noise Introduction In my last several articles I talked about generative deep learning algorithms, which are mostly related to text generation tasks. So, I think it would be interesting to switch to generative algorithms for image generation now. We know that nowadays there are plenty of deep learning models specialized for…
-
PyScript vs. JavaScript: A Battle of Web Titans
PyScript vs. JavaScript: A Battle of Web Titans We’re delving into frontend web development today, and you might be thinking: what does this have to do with Data Science? Why is Towards Data Science publishing a post related to web dev? Well, because data science isn’t only about building powerful models, engaging in advanced analytics,…
-
Privacy-Preserving Transfer Learning for Community Detection using Locally Distributed Multiple Networks
Privacy-Preserving Transfer Learning for Community Detection using Locally Distributed Multiple Networks arXiv:2504.00890v1 Announce Type: new Abstract: This paper develops a new spectral clustering-based method called TransNet for transfer learning in community detection of network data. Our goal is to improve the clustering performance of the target network using auxiliary source networks, which are heterogeneous, privacy-preserved,…
-
Communication-Efficient l_0 Penalized Least Square
Communication-Efficient l_0 Penalized Least Square arXiv:2504.00722v1 Announce Type: new Abstract: In this paper, we propose a communication-efficient penalized regression algorithm for high-dimensional sparse linear regression models with massive data. This approach incorporates an optimized distributed system communication algorithm, named CESDAR algorithm, based on the Enhanced Support Detection and Root finding algorithm. The CESDAR algorithm leverages…
-
A formula for the area of a triangle: Useless, but explicitly in Deep Sets form
A formula for the area of a triangle: Useless, but explicitly in Deep Sets form arXiv:2503.22786v1 Announce Type: cross Abstract: Any permutation-invariant function of data points $\vec{r}_i$ can be written in the form $\rho(\sum_i \phi(\vec{r}_i))$ for suitable functions $\rho$ and $\phi$. This form – known in the machine-learning literature as Deep Sets – also generates a…
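As a rough illustration of the Deep Sets form mentioned in this abstract, the sketch below writes the population variance of a list of scalars as $\rho(\sum_i \phi(x_i))$. It is not the paper's triangle-area formula: the choice of variance, the use of scalars instead of vectors, and the names `phi`, `rho`, and `deep_sets_variance` are all assumptions made for the example.

```python
from typing import Sequence, Tuple

Pooled = Tuple[float, float, float]


def phi(x: float) -> Pooled:
    """Per-element feature map (illustrative choice): count, value, squared value."""
    return (1.0, x, x * x)


def rho(pooled: Pooled) -> float:
    """Read the population variance off the pooled sums (n, sum x, sum x^2)."""
    n, s1, s2 = pooled
    mean = s1 / n
    return s2 / n - mean * mean


def deep_sets_variance(xs: Sequence[float]) -> float:
    """Evaluate rho(sum_i phi(x_i)); summing the features discards the input order."""
    pooled = (0.0, 0.0, 0.0)
    for x in xs:
        f = phi(x)
        pooled = (pooled[0] + f[0], pooled[1] + f[1], pooled[2] + f[2])
    return rho(pooled)
```

Calling `deep_sets_variance([1.0, 2.0, 3.0, 4.0])` and `deep_sets_variance([4.0, 1.0, 3.0, 2.0])` gives the same value, because the inputs only enter through the pooled sum.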
-
Nuclear Microreactor Control with Deep Reinforcement Learning
Nuclear Microreactor Control with Deep Reinforcement Learning arXiv:2504.00156v1 Announce Type: cross Abstract: The economic feasibility of nuclear microreactors will depend on minimizing operating costs through advancements in autonomous control, especially when these microreactors are operating alongside other types of energy systems (e.g., renewable energy). This study explores the application of deep reinforcement learning (RL) for…
-
Backdoor Detection through Replicated Execution of Outsourced Training
Backdoor Detection through Replicated Execution of Outsourced Training arXiv:2504.00170v1 Announce Type: cross Abstract: It is common practice to outsource the training of machine learning models to cloud providers. Clients who do so gain from the cloud’s economies of scale, but implicitly assume trust: the server should not deviate from the client’s training procedure. A malicious…
-
The Case for Centralized AI Model Inference Serving
The Case for Centralized AI Model Inference Serving As AI models continue to increase in scope and accuracy, even tasks once dominated by traditional algorithms are gradually being replaced by Deep Learning models. Algorithmic pipelines — workflows that take an input, process it through a series of algorithms, and produce an output — increasingly rely…
-
4 Levels of GitHub Actions: A Guide to Data Workflow Automation
4 Levels of GitHub Actions: A Guide to Data Workflow Automation Automation has become an indispensable element for ensuring operational efficiency and reliability in modern software development. GitHub Actions, an integrated Continuous Integration and Continuous Deployment (CI/CD) tool within GitHub, has established its position in the software development industry by providing a comprehensive platform for…
-
Agentic AI: Single vs Multi-Agent Systems
Agentic AI: Single vs Multi-Agent Systems We’ve seen this shift over the last few years from building rigid programming systems to natural language-driven workflows, all made possible with more advanced large language models. One of the interesting areas within these agentic AI systems is the difference between building a single versus multi-agent workflow, or perhaps the…
-
DGSAM: Domain Generalization via Individual Sharpness-Aware Minimization
DGSAM: Domain Generalization via Individual Sharpness-Aware Minimization arXiv:2503.23430v1 Announce Type: new Abstract: Domain generalization (DG) aims to learn models that can generalize well to unseen domains by training only on a set of source domains. Sharpness-Aware Minimization (SAM) has been a popular approach for this, aiming to find flat minima in the total loss landscape.…
-
Accelerated Stein Variational Gradient Flow
Accelerated Stein Variational Gradient Flow arXiv:2503.23462v1 Announce Type: new Abstract: Stein variational gradient descent (SVGD) is a kernel-based particle method for sampling from a target distribution, e.g., in generative modeling and Bayesian inference. SVGD does not require estimating the gradient of the log-density, which is called score estimation. In practice, SVGD can be slow compared…
-
Scalable Geometric Learning with Correlation-Based Functional Brain Networks
Scalable Geometric Learning with Correlation-Based Functional Brain Networks arXiv:2503.23653v1 Announce Type: new Abstract: The correlation matrix is a central representation of functional brain networks in neuroimaging. Traditional analyses often treat pairwise interactions independently in a Euclidean setting, overlooking the intrinsic geometry of correlation matrices. While earlier attempts have embraced the quotient geometry of the correlation…
-
Learning a Single Index Model from Anisotropic Data with vanilla Stochastic Gradient Descent
Learning a Single Index Model from Anisotropic Data with vanilla Stochastic Gradient Descent arXiv:2503.23642v1 Announce Type: new Abstract: We investigate the problem of learning a Single Index Model (SIM), a popular model for studying the ability of neural networks to learn features, from anisotropic Gaussian inputs by training a neuron using vanilla Stochastic Gradient…
-
Feature learning from non-Gaussian inputs: the case of Independent Component Analysis in high dimensions
Feature learning from non-Gaussian inputs: the case of Independent Component Analysis in high dimensions arXiv:2503.23896v1 Announce Type: new Abstract: Deep neural networks learn structured features from complex, non-Gaussian inputs, but the mechanisms behind this process remain poorly understood. Our work is motivated by the observation that the first-layer filters learnt by deep convolutional neural networks…
-
Graph Neural Networks Part 3: How GraphSAGE Handles Changing Graph Structure
Graph Neural Networks Part 3: How GraphSAGE Handles Changing Graph Structure In the previous parts of this series, we looked at Graph Convolutional Networks (GCNs) and Graph Attention Networks (GATs). Both architectures work fine, but they also have some limitations! A big one is that for large graphs, calculating the node representations with GCNs and…
-
A Simple Implementation of the Attention Mechanism from Scratch
A Simple Implementation of the Attention Mechanism from Scratch Introduction The Attention Mechanism is often associated with the transformer architecture, but it was already used in RNNs. In Machine Translation or MT (e.g., English-Italian) tasks, when you want to predict the next Italian word, you need your model to focus, or pay attention, on the…
-
Create Your Supply Chain Analytics Portfolio to Land Your Dream Job
Create Your Supply Chain Analytics Portfolio to Land Your Dream Job Supply chains are under pressure like never before. From climate-driven disruptions to geopolitical shifts, businesses must adapt to rising costs, new trade barriers and growing sustainability demands. In this new world where supply chains face uncertainty, Supply Chain Analytics is essential to keep resilient operations. Samir, can…
-
Understanding the Tech Stack Behind Generative AI
Understanding the Tech Stack Behind Generative AI When ChatGPT reached the one million user mark within five days and took off faster than any other technology in history, the world began to pay attention to artificial intelligence and AI applications. And so it continued apace. Since then, many…
-
My Learning to Be Hired Again After a Year… Part 2
My Learning to Be Hired Again After a Year… Part 2 This is the second part of “My learning to being hired again after a year… Part I”. Hard to believe, but it’s been a full year since I published the first part on TDS. And in that time, something beautiful happened. Every so often,…
-
Structured and sparse partial least squares coherence for multivariate cortico-muscular analysis
Structured and sparse partial least squares coherence for multivariate cortico-muscular analysis arXiv:2503.21802v1 Announce Type: cross Abstract: Multivariate cortico-muscular analysis has recently emerged as a promising approach for evaluating the corticospinal neural pathway. However, current multivariate approaches encounter challenges such as high dimensionality and limited sample sizes, thus restricting their further applications. In this paper, we…
-
Is Best-of-N the Best of Them? Coverage, Scaling, and Optimality in Inference-Time Alignment
Is Best-of-N the Best of Them? Coverage, Scaling, and Optimality in Inference-Time Alignment arXiv:2503.21878v1 Announce Type: cross Abstract: Inference-time computation provides an important axis for scaling language model performance, but naively scaling compute through techniques like Best-of-$N$ sampling can cause performance to degrade due to reward hacking. Toward a theoretical understanding of how to best…
-
An Artificial Trend Index for Private Consumption Using Google Trends
An Artificial Trend Index for Private Consumption Using Google Trends arXiv:2503.21981v1 Announce Type: cross Abstract: In recent years, the use of databases that analyze trends, sentiments or news to make economic projections or create indicators has gained significant popularity, particularly with the Google Trends platform. This article explores the potential of Google search data to…
-
Rolled Gaussian process models for curves on manifolds
Rolled Gaussian process models for curves on manifolds arXiv:2503.21980v1 Announce Type: cross Abstract: Given a planar curve, imagine rolling a sphere along that curve without slipping or twisting, and by this means tracing out a curve on the sphere. It is well known that such a rolling operation induces a local isometry between the sphere…
-
Improving Equivariant Networks with Probabilistic Symmetry Breaking
Improving Equivariant Networks with Probabilistic Symmetry Breaking arXiv:2503.21985v1 Announce Type: cross Abstract: Equivariance encodes known symmetries into neural networks, often enhancing generalization. However, equivariant networks cannot break symmetries: the output of an equivariant network must, by definition, have at least the same self-symmetries as the input. This poses an important problem, both (1) for prediction…
-
Weekly Entering & Transitioning – Thread 31 Mar, 2025 – 07 Apr, 2025
Weekly Entering & Transitioning – Thread 31 Mar, 2025 – 07 Apr, 2025 Welcome to this week’s entering & transitioning thread! This thread is for any questions about getting started, studying, or transitioning into the data science field. Topics include: Learning resources (e.g. books, tutorials, videos) Traditional education (e.g. schools, degrees, electives) Alternative education (e.g.…
-
The Art of Hybrid Architectures
The Art of Hybrid Architectures In my previous article, I discussed how morphological feature extractors mimic the way biological experts visually assess images. This time, I want to go a step further and explore a new question: Can different architectures complement each other to build an AI that “sees” like an expert? Introduction: Rethinking Model Architecture…
-
A Little More Conversation, A Little Less Action — A Case Against Premature Data Integration
A Little More Conversation, A Little Less Action — A Case Against Premature Data Integration When I talk to [large] organisations that have not yet properly started with Data Science (DS) and Machine Learning (ML), they often tell me that they have to run a data integration project first, because “…all the data is scattered…
-
Master the 3D Reconstruction Process: A Step-by-Step Guide
Master the 3D Reconstruction Process: A Step-by-Step Guide The 3D reconstruction journey from 2D photographs to 3D models follows a structured path. This path consists of distinct steps that build upon each other to transform flat images into spatial information. Understanding this pipeline is crucial for anyone looking to create high-quality 3D reconstructions. Let me…
-
AI Agents from Zero to Hero — Part 3
AI Agents from Zero to Hero — Part 3 Intro In Part 1 of this tutorial series, we introduced AI Agents, autonomous programs that perform tasks, make decisions, and communicate with others. In Part 2 of this tutorial series, we understood how to make the Agent try and retry until the task is completed through…
-
From Physics to Probability: Hamiltonian Mechanics for Generative Modeling and MCMC
From Physics to Probability: Hamiltonian Mechanics for Generative Modeling and MCMC Phase space of a nonlinear pendulum. Photo by the author. Hamiltonian mechanics is a way to describe how physical systems, like planets or pendulums, move over time, focusing on energy rather than just forces. By reframing complex dynamics through energy lenses, this 19th-century physics…
-
Fundamental Safety-Capability Trade-offs in Fine-tuning Large Language Models
Fundamental Safety-Capability Trade-offs in Fine-tuning Large Language Models arXiv:2503.20807v1 Announce Type: new Abstract: Fine-tuning Large Language Models (LLMs) on some task-specific datasets has been a primary use of LLMs. However, it has been empirically observed that this approach to enhancing capability inevitably compromises safety, a phenomenon also known as the safety-capability trade-off in LLM fine-tuning.…
-
Squared families: Searching beyond regular probability models
Squared families: Searching beyond regular probability models arXiv:2503.21128v1 Announce Type: new Abstract: We introduce squared families, which are families of probability densities obtained by squaring a linear transformation of a statistic. Squared families are singular, however their singularity can easily be handled so that they form regular models. After handling the singularity, squared families possess…
-
Debiasing Kernel-Based Generative Models
Debiasing Kernel-Based Generative Models arXiv:2503.20825v1 Announce Type: new Abstract: We propose a novel two-stage framework of generative models named Debiasing Kernel-Based Generative Models (DKGM) with the insights from kernel density estimation (KDE) and stochastic approximation. In the first stage of DKGM, we employ KDE to bypass the obstacles in estimating the density of data without…
-
DeepRV: pre-trained spatial priors for accelerated disease mapping
DeepRV: pre-trained spatial priors for accelerated disease mapping arXiv:2503.21473v1 Announce Type: new Abstract: Recently introduced prior-encoding deep generative models (e.g., PriorVAE, $\pi$VAE, and PriorCVAE) have emerged as powerful tools for scalable Bayesian inference by emulating complex stochastic processes like Gaussian processes (GPs). However, these methods remain largely a proof-of-concept and inaccessible to practitioners. We propose…
-
Constraint-based causal discovery with tiered background knowledge and latent variables in single or overlapping datasets
Constraint-based causal discovery with tiered background knowledge and latent variables in single or overlapping datasets arXiv:2503.21526v1 Announce Type: new Abstract: In this paper we consider the use of tiered background knowledge within constraint based causal discovery. Our focus is on settings relaxing causal sufficiency, i.e. allowing for latent variables which may arise because relevant information…
-
Data Science: From School to Work, Part III
Data Science: From School to Work, Part III Introduction Writing code is about solving problems, but not every problem is predictable. In the real world, your software will encounter unexpected situations: missing files, invalid user inputs, network timeouts, or even hardware failures. This is why handling errors isn’t just a nice-to-have; it’s a critical part…
-
Japanese-Chinese Translation with GenAI: What Works and What Doesn’t
Japanese-Chinese Translation with GenAI: What Works and What Doesn’t Authors Alex (Qian) Wan: Alex (Qian) is a designer specializing in AI for B2B products. She is currently working at Microsoft, focusing on machine learning and Copilot for data analysis. Previously, she was the Gen AI design lead at VMware. Eli Ruoyong Hong: Eli is a…
-
Talk to Videos
Talk to Videos Large language models (LLMs) are improving in efficiency and are now able to understand different data formats, offering possibilities for myriads of applications in different domains. Initially, LLMs were inherently able to process only text. The image understanding feature was integrated by coupling an LLM with another image encoding model. However, gpt-4o…
-
A stochastic gradient descent algorithm with random search directions
A stochastic gradient descent algorithm with random search directions arXiv:2503.19942v1 Announce Type: new Abstract: Stochastic coordinate descent algorithms are efficient methods in which each iterate is obtained by fixing most coordinates at their values from the current iteration, and approximately minimizing the objective with respect to the remaining coordinates. However, this approach is usually restricted…
-
On the Robustness of Kernel Ridge Regression Using the Cauchy Loss Function
On the Robustness of Kernel Ridge Regression Using the Cauchy Loss Function arXiv:2503.20120v1 Announce Type: new Abstract: Robust regression aims to develop methods for estimating an unknown regression function in the presence of outliers, heavy-tailed distributions, or contaminated data, which can severely impact performance. Most existing theoretical results in robust regression assume that the noise…
-
Learning Data-Driven Uncertainty Set Partitions for Robust and Adaptive Energy Forecasting with Missing Data
Learning Data-Driven Uncertainty Set Partitions for Robust and Adaptive Energy Forecasting with Missing Data arXiv:2503.20410v1 Announce Type: new Abstract: Short-term forecasting models typically assume the availability of input data (features) when they are deployed and in use. However, equipment failures, disruptions, or cyberattacks may lead to missing features when such models are used operationally, which could…
-
An $(\epsilon,\delta)$-accurate level set estimation with a stopping criterion
An $(\epsilon,\delta)$-accurate level set estimation with a stopping criterion arXiv:2503.20272v1 Announce Type: new Abstract: The level set estimation problem seeks to identify regions within a set of candidate points where an unknown and costly to evaluate function’s value exceeds a specified threshold, providing an efficient alternative to exhaustive evaluations of function values. Traditional methods often…
-
Regression-Based Estimation of Causal Effects in the Presence of Selection Bias and Confounding
Regression-Based Estimation of Causal Effects in the Presence of Selection Bias and Confounding arXiv:2503.20546v1 Announce Type: new Abstract: We consider the problem of estimating the expected causal effect $E[Y|do(X)]$ for a target variable $Y$ when treatment $X$ is set by intervention, focusing on continuous random variables. In settings without selection bias or confounding, $E[Y|do(X)] =…
-
AI Agents from Zero to Hero — Part 2
AI Agents from Zero to Hero — Part 2 Intro In Part 1 of this tutorial series, we introduced AI Agents, autonomous programs that perform tasks, make decisions, and communicate with others. Agents perform actions through Tools. It might happen that a Tool doesn’t work on the first try, or that multiple Tools must be…
-
Automate Supply Chain Analytics Workflows with AI Agents using n8n
Automate Supply Chain Analytics Workflows with AI Agents using n8n Why build things the hard way when you can design them the smart way? As a Supply Chain Data Scientist, I’ve explored various frameworks like LangChain and LangGraph to build AI agents using Python. Leveraging LLMs with LangChain for Supply Chain Analytics — A Control Tower Powered by…
-
Uncertainty Quantification in Machine Learning with an Easy Python Interface
Uncertainty Quantification in Machine Learning with an Easy Python Interface Uncertainty quantification (UQ) in a Machine Learning (ML) model allows one to estimate the precision of its predictions. This is extremely important for utilizing its predictions in real-world tasks. For instance, if a machine learning model is trained to predict a property of a material,…
-
CAE: Repurposing the Critic as an Explorer in Deep Reinforcement Learning
CAE: Repurposing the Critic as an Explorer in Deep Reinforcement Learning arXiv:2503.18980v1 Announce Type: new Abstract: Exploration remains a critical challenge in reinforcement learning, as many existing methods either lack theoretical guarantees or fall short of practical effectiveness. In this paper, we introduce CAE, a lightweight algorithm that repurposes the value networks in standard deep…
-
Minimum Volume Conformal Sets for Multivariate Regression
Minimum Volume Conformal Sets for Multivariate Regression arXiv:2503.19068v1 Announce Type: new Abstract: Conformal prediction provides a principled framework for constructing predictive sets with finite-sample validity. While much of the focus has been on univariate response variables, existing multivariate methods either impose rigid geometric assumptions or rely on flexible but computationally expensive approaches that do not…
-
Centroid Decision Forest
Centroid Decision Forest arXiv:2503.19306v1 Announce Type: new Abstract: This paper introduces the centroid decision forest (CDF), a novel ensemble learning framework that redefines the splitting strategy and tree building in the ordinary decision trees for high-dimensional classification. The splitting approach in CDF differs from the traditional decision trees in that the class separability score (CSS)…
-
Universal Architectures for the Learning of Polyhedral Norms and Convex Regularization Functionals
Universal Architectures for the Learning of Polyhedral Norms and Convex Regularization Functionals arXiv:2503.19190v1 Announce Type: new Abstract: This paper addresses the task of learning convex regularizers to guide the reconstruction of images from limited data. By imposing that the reconstruction be amplitude-equivariant, we narrow down the class of admissible functionals to those that can be…
-
Causal Bayesian Optimization with Unknown Graphs
Causal Bayesian Optimization with Unknown Graphs arXiv:2503.19554v1 Announce Type: new Abstract: Causal Bayesian Optimization (CBO) is a methodology designed to optimize an outcome variable by leveraging known causal relationships through targeted interventions. Traditional CBO methods require a fully and accurately specified causal graph, which is a limitation in many real-world scenarios where such graphs are…
-
The Ultimate AI/ML Roadmap For Beginners
The Ultimate AI/ML Roadmap For Beginners AI is transforming the way businesses operate, and nearly every company is exploring how to leverage this technology. As a result, the demand for AI and machine learning skills has skyrocketed in recent years. With nearly four years of experience in AI/ML, I’ve decided to create the ultimate guide…
-
Attractors in Neural Network Circuits: Beauty and Chaos
Attractors in Neural Network Circuits: Beauty and Chaos The state space of the first two neuron activations over time follows an attractor. What is one thing in common between memories, oscillating chemical reactions and double pendulums? All these systems have a basin of attraction for possible states, like a magnet that draws the system towards certain…
-
Data-Driven March Madness Predictions
Data-Driven March Madness Predictions March Madness is infamously unpredictable, a perfect storm where favorites tumble and underdogs rise to do the impossible. Every March, 64 men’s and 64 women’s College Basketball teams battle for glory, while millions of fans, analysts, and betting markets scramble to predict the outcomes. But the odds of picking a perfect…
-
Testing the Power of Multimodal AI Systems in Reading and Interpreting Photographs, Maps, Charts and More
Testing the Power of Multimodal AI Systems in Reading and Interpreting Photographs, Maps, Charts and More Introduction It’s no news that artificial intelligence has made huge strides in recent years, particularly with the advent of multimodal models that can process and create both text and images, and some very new ones that also process and produce…
-
A Clear Intro to MCP (Model Context Protocol) with Code Examples
A Clear Intro to MCP (Model Context Protocol) with Code Examples As the race to move AI agents from prototype to production heats up, the need for a standardized way for agents to call tools across different providers is pressing. This transition to a standardized approach to agent tool calling is similar to what we…
-
Communities in the Kuramoto Model: Dynamics and Detection via Path Signatures
Communities in the Kuramoto Model: Dynamics and Detection via Path Signatures arXiv:2503.17546v1 Announce Type: new Abstract: The behavior of multivariate dynamical processes is often governed by underlying structural connections that relate the components of the system. For example, brain activity which is often measured via time series is determined by an underlying structural graph, where…
-
A Statistical Theory of Contrastive Learning via Approximate Sufficient Statistics
A Statistical Theory of Contrastive Learning via Approximate Sufficient Statistics arXiv:2503.17538v1 Announce Type: new Abstract: Contrastive learning — a modern approach to extract useful representations from unlabeled data by training models to distinguish similar samples from dissimilar ones — has driven significant progress in foundation models. In this work, we develop a new theoretical framework…
-
Poisson-Process Topic Model for Integrating Knowledge from Pre-trained Language Models
Poisson-Process Topic Model for Integrating Knowledge from Pre-trained Language Models arXiv:2503.17809v1 Announce Type: new Abstract: Topic modeling is traditionally applied to word counts without accounting for the context in which words appear. Recent advancements in large language models (LLMs) offer contextualized word embeddings, which capture deeper meaning and relationships between words. We aim to leverage…
-
Understanding Inverse Reinforcement Learning under Overparameterization: Non-Asymptotic Analysis and Global Optimality
Understanding Inverse Reinforcement Learning under Overparameterization: Non-Asymptotic Analysis and Global Optimality arXiv:2503.17865v1 Announce Type: new Abstract: The goal of the Inverse reinforcement learning (IRL) task is to identify the underlying reward function and the corresponding optimal policy from a set of expert demonstrations. While most IRL algorithms’ theoretical guarantees rely on a linear reward structure,…
-
Quantile-Based Randomized Kaczmarz for Corrupted Tensor Linear Systems
Quantile-Based Randomized Kaczmarz for Corrupted Tensor Linear Systems arXiv:2503.18190v1 Announce Type: new Abstract: The reconstruction of tensor-valued signals from corrupted measurements, known as tensor regression, has become essential in many multi-modal applications such as hyperspectral image reconstruction and medical imaging. In this work, we address the tensor linear system problem $\mathcal{A} \mathcal{X}=\mathcal{B}$, where $\mathcal{A}$ is…
-
From Fuzzy to Precise: How a Morphological Feature Extractor Enhances AI’s Recognition Capabilities
From Fuzzy to Precise: How a Morphological Feature Extractor Enhances AI’s Recognition Capabilities Introduction: Can AI really distinguish dog breeds like human experts? One day while taking a walk, I saw a fluffy white puppy and wondered, Is that a Bichon Frise or a Maltese? No matter how closely I looked, they seemed almost identical.…
-
Build Your Own AI Coding Assistant in JupyterLab with Ollama and Hugging Face
Build Your Own AI Coding Assistant in JupyterLab with Ollama and Hugging Face Jupyter AI brings generative AI capabilities right into the Jupyter interface. Having a local AI assistant ensures privacy, reduces latency, and provides offline functionality, making it a powerful tool for developers. In this article, we’ll learn how to set up a local…
-
Procrustes Wasserstein Metric: A Modified Benamou-Brenier Approach with Applications to Latent Gaussian Distributions
Procrustes Wasserstein Metric: A Modified Benamou-Brenier Approach with Applications to Latent Gaussian Distributions arXiv:2503.16580v1 Announce Type: new Abstract: We introduce a modified Benamou-Brenier type approach leading to a Wasserstein type distance that allows global invariance, specifically, isometries, and we show that the problem can be summarized to orthogonal transformations. This distance is defined by penalizing…
-
EarlyStopping: Implicit Regularization for Iterative Learning Procedures in Python
EarlyStopping: Implicit Regularization for Iterative Learning Procedures in Python arXiv:2503.16753v1 Announce Type: new Abstract: Iterative learning procedures are ubiquitous in machine learning and modern statistics. Regularisation is typically required to prevent inflating the expected loss of a procedure in later iterations via the propagation of noise inherent in the data. Significant emphasis has been placed…
-
Optimal Nonlinear Online Learning under Sequential Price Competition via s-Concavity
Optimal Nonlinear Online Learning under Sequential Price Competition via s-Concavity arXiv:2503.16737v1 Announce Type: new Abstract: We consider price competition among multiple sellers over a selling horizon of $T$ periods. In each period, sellers simultaneously offer their prices and subsequently observe their respective demand that is unobservable to competitors. The demand function for each seller depends…
-
Online Selective Conformal Prediction: Errors and Solutions
Online Selective Conformal Prediction: Errors and Solutions arXiv:2503.16809v1 Announce Type: new Abstract: In online selective conformal inference, data arrives sequentially, and prediction intervals are constructed only when an online selection rule is met. Since online selections may break the exchangeability between the selected test datum and the rest of the data, one must correct for…
-
Sparse Additive Contextual Bandits: A Nonparametric Approach for Online Decision-making with High-dimensional Covariates
Sparse Additive Contextual Bandits: A Nonparametric Approach for Online Decision-making with High-dimensional Covariates arXiv:2503.16941v1 Announce Type: new Abstract: Personalized services are central to today’s digital landscape, where online decision-making is commonly formulated as contextual bandit problems. Two key challenges emerge in modern applications: high-dimensional covariates and the need for nonparametric models to capture complex reward-covariate…
-
Weekly Entering & Transitioning – Thread 24 Mar, 2025 – 31 Mar, 2025
Weekly Entering & Transitioning – Thread 24 Mar, 2025 – 31 Mar, 2025 Welcome to this week’s entering & transitioning thread! This thread is for any questions about getting started, studying, or transitioning into the data science field. Topics include: Learning resources (e.g. books, tutorials, videos); Traditional education (e.g. schools, degrees, electives); Alternative education (e.g.…
-
What Germany Currently Is Up To, Debt-Wise
What Germany Currently Is Up To, Debt-Wise €1,600 per second. That’s how much interest Germany has to pay for its debts. In total, the German state has debts ranging into the trillions — more than a thousand billion Euros. And the government is planning to make even more, up to one trillion additional debt is…
-
Google’s Data Science Agent: Can It Really Do Your Job?
Google’s Data Science Agent: Can It Really Do Your Job? On March 3rd, Google officially rolled out its Data Science Agent to most Colab users for free. This is not something brand new — it was first announced in December last year, but it is now integrated into Colab and made widely accessible. Google says…
-
Hierarchical clustering with maximum density paths and mixture models
Hierarchical clustering with maximum density paths and mixture models arXiv:2503.15582v1 Announce Type: new Abstract: Hierarchical clustering is an effective and interpretable technique for analyzing structure in data, offering a nuanced understanding by revealing insights at multiple scales and resolutions. It is particularly helpful in settings where the exact number of clusters is unknown, and provides…
-
Interpretable Neural Causal Models with TRAM-DAGs
Interpretable Neural Causal Models with TRAM-DAGs arXiv:2503.16206v1 Announce Type: new Abstract: The ultimate goal of most scientific studies is to understand the underlying causal mechanism between the involved variables. Structural causal models (SCMs) are widely used to represent such causal mechanisms. Given an SCM, causal queries on all three levels of Pearl’s causal hierarchy can…
-
Tuning Sequential Monte Carlo Samplers via Greedy Incremental Divergence Minimization
Tuning Sequential Monte Carlo Samplers via Greedy Incremental Divergence Minimization arXiv:2503.15704v1 Announce Type: new Abstract: The performance of sequential Monte Carlo (SMC) samplers heavily depends on the tuning of the Markov kernels used in the path proposal. For SMC samplers with unadjusted Markov kernels, standard tuning objectives, such as the Metropolis-Hastings acceptance rate or the…
-
Sparse Nonparametric Contextual Bandits
Sparse Nonparametric Contextual Bandits arXiv:2503.16382v1 Announce Type: new Abstract: This paper studies the problem of simultaneously learning relevant features and minimising regret in contextual bandit problems. We introduce and analyse a new class of contextual bandit problems, called sparse nonparametric contextual bandits, in which the expected reward function lies in the linear span of a…
-
Data-Driven Approximation of Binary-State Network Reliability Function: Algorithm Selection and Reliability Thresholds for Large-Scale Systems
Data-Driven Approximation of Binary-State Network Reliability Function: Algorithm Selection and Reliability Thresholds for Large-Scale Systems arXiv:2503.15545v1 Announce Type: cross Abstract: Network reliability assessment is pivotal for ensuring the robustness of modern infrastructure systems, from power grids to communication networks. While exact reliability computation for binary-state networks is NP-hard, existing approximation methods face critical tradeoffs between…
-
R.E.D.: Scaling Text Classification with Expert Delegation
R.E.D.: Scaling Text Classification with Expert Delegation With the new age of problem-solving augmented by Large Language Models (LLMs), only a handful of problems remain that have subpar solutions. Most classification problems (at a PoC level) can be solved by leveraging LLMs at 70–90% Precision/F1 with just good prompt engineering techniques, as well as adaptive…
-
Algorithm Protection in the Context of Federated Learning
Algorithm Protection in the Context of Federated Learning While working at a biotech company, we aim to advance ML & AI Algorithms to enable, for example, brain lesion segmentation to be executed at the hospital/clinic location where patient data resides, so it is processed in a secure manner. This, in essence, is guaranteed by federated…
-
Mastering the Poisson Distribution: Intuition and Foundations
Mastering the Poisson Distribution: Intuition and Foundations You’ve probably used the normal distribution one or two times too many. We all have — It’s a true workhorse. But sometimes, we run into problems. For instance, when predicting or forecasting values, simulating data given a particular data-generating process, or when we try to visualise model output…
-
Six Organizational Models for Data Science
Six Organizational Models for Data Science Introduction Data science teams can operate in myriad ways within a company. These organizational models influence the type of work that the team does, but also the team’s culture, goals, impact, and overall value to the company. Adopting the wrong organizational model can limit impact, cause delays, and compromise…
-
Variational Autoencoded Multivariate Spatial Fay-Herriot Models
Variational Autoencoded Multivariate Spatial Fay-Herriot Models arXiv:2503.14710v1 Announce Type: new Abstract: Small area estimation models are essential for estimating population characteristics in regions with limited sample sizes, thereby supporting policy decisions, demographic studies, and resource allocation, among other use cases. The spatial Fay-Herriot model is one such approach that incorporates spatial dependence to improve estimation…
-
The Hardness of Validating Observational Studies with Experimental Data
The Hardness of Validating Observational Studies with Experimental Data arXiv:2503.14795v1 Announce Type: new Abstract: Observational data is often readily available in large quantities, but can lead to biased causal effect estimates due to the presence of unobserved confounding. Recent works attempt to remove this bias by supplementing observational data with experimental data, which, when available,…
-
Interpretability of Graph Neural Networks to Assert Effects of Global Change Drivers on Ecological Networks
Interpretability of Graph Neural Networks to Assert Effects of Global Change Drivers on Ecological Networks arXiv:2503.15107v1 Announce Type: new Abstract: Pollinators play a crucial role for plant reproduction, either in natural ecosystems or in human-modified landscapes. Global change drivers, including climate change or land use modifications, can alter the plant-pollinator interactions. To assert the potential influence…
-
Nonlinear Bayesian Update via Ensemble Kernel Regression with Clustering and Subsampling
Nonlinear Bayesian Update via Ensemble Kernel Regression with Clustering and Subsampling arXiv:2503.15160v1 Announce Type: new Abstract: Nonlinear Bayesian update for a prior ensemble is proposed to extend traditional ensemble Kalman filtering to settings characterized by non-Gaussian priors and nonlinear measurement operators. In this framework, the observed component is first denoised via a standard Kalman update,…
-
Online federated learning framework for classification
Online federated learning framework for classification arXiv:2503.15210v1 Announce Type: new Abstract: In this paper, we develop a novel online federated learning framework for classification, designed to handle streaming data from multiple clients while ensuring data privacy and computational efficiency. Our method leverages the generalized distance-weighted discriminant technique, making it robust to both homogeneous and heterogeneous…
-
Positivity sets of hinge functions
Positivity sets of hinge functions arXiv:2503.13512v1 Announce Type: new Abstract: In this paper we investigate which subsets of the real plane are realisable as the set of points on which a one-layer ReLU neural network takes a positive value. In the case of cones we give a full characterisation of such sets. Furthermore, we give…
-
Micro Text Classification Based on Balanced Positive-Unlabeled Learning
Micro Text Classification Based on Balanced Positive-Unlabeled Learning arXiv:2503.13562v1 Announce Type: new Abstract: In real-world text classification tasks, negative texts often contain a minimal proportion of negative content, which is especially problematic in areas like text quality control, legal risk screening, and sensitive information interception. This challenge manifests at two levels: at the macro level,…
-
Bayesian Kernel Regression for Functional Data
Bayesian Kernel Regression for Functional Data arXiv:2503.13676v1 Announce Type: new Abstract: In supervised learning, the output variable to be predicted is often represented as a function, such as a spectrum or probability distribution. Despite its importance, functional output regression remains relatively unexplored. In this study, we propose a novel functional output regression model based on…
-
ROCK: A variational formulation for occupation kernel methods in Reproducing Kernel Hilbert Spaces
ROCK: A variational formulation for occupation kernel methods in Reproducing Kernel Hilbert Spaces arXiv:2503.13791v1 Announce Type: new Abstract: We present a Representer Theorem result for a large class of weak formulation problems. We provide examples of applications of our formulation both in traditional machine learning and numerical methods as well as in new and emerging…
-
Ranking and Selection with Simultaneous Input Data Collection
Ranking and Selection with Simultaneous Input Data Collection arXiv:2503.11773v1 Announce Type: new Abstract: In this paper, we propose a general and novel formulation of ranking and selection with the existence of streaming input data. The collection of multiple streams of such data may consume different types of resources, and hence can be conducted simultaneously. To…
-
Bayes and Biased Estimators Without Hyper-parameter Estimation: Comparable Performance to the Empirical-Bayes-Based Regularized Estimator
Bayes and Biased Estimators Without Hyper-parameter Estimation: Comparable Performance to the Empirical-Bayes-Based Regularized Estimator arXiv:2503.11854v1 Announce Type: new Abstract: Regularized system identification has become a significant complement to more classical system identification. It has been numerically shown that kernel-based regularized estimators often perform better than the maximum likelihood estimator in terms of minimizing mean squared…