Category: aimlds

Radon–Wasserstein Gradient Flows for Interacting-Particle Sampling in High Dimensions

Radon–Wasserstein Gradient Flows for Interacting-Particle Sampling in High Dimensions arXiv:2602.05227v1 Announce Type: new Abstract: Gradient flows of the Kullback–Leibler (KL) divergence, such as the Fokker–Planck equation and Stein Variational Gradient Descent, evolve a distribution toward a target density known only up to a normalizing constant. We introduce new gradient flows of the KL divergence with…

February 6, 2026
Decision-Focused Sequential Experimental Design: A Directional Uncertainty-Guided Approach

Decision-Focused Sequential Experimental Design: A Directional Uncertainty-Guided Approach arXiv:2602.05340v1 Announce Type: new Abstract: We consider the sequential experimental design problem in the predict-then-optimize paradigm. In this paradigm, the outputs of the prediction model are used as coefficient vectors in a downstream linear optimization problem. Traditional sequential experimental design aims to control the input variables (features)…

February 6, 2026
Mechanistic Interpretability: Peeking Inside an LLM

Mechanistic Interpretability: Peeking Inside an LLM Are the human-like cognitive abilities of LLMs real or fake? How does information travel through the neural network? Is there hidden knowledge inside an LLM? The post Mechanistic Interpretability: Peeking Inside an LLM appeared first on Towards Data Science. Julian Mendel Go to original source

February 6, 2026
Why Is My Code So Slow? A Guide to Py-Spy Python Profiling

Why Is My Code So Slow? A Guide to Py-Spy Python Profiling Stop guessing and start diagnosing performance issues using Py-Spy The post Why Is My Code So Slow? A Guide to Py-Spy Python Profiling appeared first on Towards Data Science. Kenneth McCarthy Go to original source

February 6, 2026
The Rule Everyone Misses: How to Stop Confusing loc and iloc in Pandas

The Rule Everyone Misses: How to Stop Confusing loc and iloc in Pandas A simple mental model to remember when each one works (with examples that finally click). The post The Rule Everyone Misses: How to Stop Confusing loc and iloc in Pandas appeared first on Towards Data Science. Ibrahim Salami Go to original source

February 6, 2026
A Hitchhiker’s Guide to Poisson Gradient Estimation

A Hitchhiker’s Guide to Poisson Gradient Estimation arXiv:2602.03896v1 Announce Type: new Abstract: Poisson-distributed latent variable models are widely used in computational neuroscience, but differentiating through discrete stochastic samples remains challenging. Two approaches address this: Exponential Arrival Time (EAT) simulation and Gumbel-SoftMax (GSM) relaxation. We provide the first systematic comparison of these methods, along with practical…

February 5, 2026
Transcendental Regularization of Finite Mixtures: Theoretical Guarantees and Practical Limitations

Transcendental Regularization of Finite Mixtures: Theoretical Guarantees and Practical Limitations arXiv:2602.03889v1 Announce Type: new Abstract: Finite mixture models are widely used for unsupervised learning, but maximum likelihood estimation via EM suffers from degeneracy as components collapse. We introduce transcendental regularization, a penalized likelihood framework with analytic barrier functions that prevent degeneracy while maintaining asymptotic efficiency. The…

February 5, 2026
Byzantine Machine Learning: MultiKrum and an optimal notion of robustness

Byzantine Machine Learning: MultiKrum and an optimal notion of robustness arXiv:2602.03899v1 Announce Type: new Abstract: Aggregation rules are the cornerstone of distributed (or federated) learning in the presence of adversaries, under the so-called Byzantine threat model. They are also interesting mathematical objects from the point of view of robust mean estimation. The Krum aggregation rule…

February 5, 2026
Learning Multi-type heterogeneous interacting particle systems

Learning Multi-type heterogeneous interacting particle systems arXiv:2602.03954v1 Announce Type: new Abstract: We propose a framework for the joint inference of network topology, multi-type interaction kernels, and latent type assignments in heterogeneous interacting particle systems from multi-trajectory data. This learning task is a challenging non-convex mixed-integer optimization problem, which we address through a novel three-stage approach.…

February 5, 2026
Privacy utility trade offs for parameter estimation in degree heterogeneous higher order networks

Privacy utility trade offs for parameter estimation in degree heterogeneous higher order networks arXiv:2602.03948v1 Announce Type: new Abstract: In sensitive applications involving relational datasets, protecting information about individual links from adversarial queries is of paramount importance. In many such settings, the available data are summarized solely through the degrees of the nodes in the network.…

February 5, 2026
How to Work Effectively with Frontend and Backend Code

How to Work Effectively with Frontend and Backend Code Learn how to be an effective full-stack engineer with Claude Code The post How to Work Effectively with Frontend and Backend Code appeared first on Towards Data Science. Eivind Kjosbakken Go to original source

February 5, 2026
AWS vs. Azure: A Deep Dive into Model Training – Part 2

AWS vs. Azure: A Deep Dive into Model Training – Part 2 This article covers how Azure ML’s persistent, workspace-centric compute resources differ from AWS SageMaker’s on-demand, job-specific approach. Additionally, we explore environment customization options, from Azure’s curated and custom environments to SageMaker’s three levels of customization. The post AWS vs. Azure: A Deep…

February 5, 2026
How to Build Your Own Custom LLM Memory Layer from Scratch

How to Build Your Own Custom LLM Memory Layer from Scratch Step-by-step guide to building autonomous memory retrieval systems The post How to Build Your Own Custom LLM Memory Layer from Scratch appeared first on Towards Data Science. Avishek Biswas Go to original source

February 5, 2026
Plan–Code–Execute: Designing Agents That Create Their Own Tools

Plan–Code–Execute: Designing Agents That Create Their Own Tools The case against pre-built tools in Agentic Architectures The post Plan–Code–Execute: Designing Agents That Create Their Own Tools appeared first on Towards Data Science. Partha Sarkar Go to original source

February 5, 2026
Rethinking Test-Time Training: Tilting The Latent Distribution For Few-Shot Source-Free Adaptation

Rethinking Test-Time Training: Tilting The Latent Distribution For Few-Shot Source-Free Adaptation arXiv:2602.02633v1 Announce Type: new Abstract: Often, constraints arise in deployment settings where even lightweight parameter updates e.g. parameter-efficient fine-tuning could induce model shift or tuning instability. We study test-time adaptation of foundation models for few-shot classification under a completely frozen-model regime, where additionally, no…

February 4, 2026
Relaxed Triangle Inequality for Kullback-Leibler Divergence Between Multivariate Gaussian Distributions

Relaxed Triangle Inequality for Kullback-Leibler Divergence Between Multivariate Gaussian Distributions arXiv:2602.02577v1 Announce Type: new Abstract: The Kullback-Leibler (KL) divergence is not a proper distance metric and does not satisfy the triangle inequality, posing theoretical challenges in certain practical applications. Existing work has demonstrated that KL divergence between multivariate Gaussian distributions follows a relaxed triangle inequality.…

February 4, 2026
Near-Universal Multiplicative Updates for Nonnegative Einsum Factorization

Near-Universal Multiplicative Updates for Nonnegative Einsum Factorization arXiv:2602.02759v1 Announce Type: new Abstract: Despite the ubiquity of multiway data across scientific domains, there are few user-friendly tools that fit tailored nonnegative tensor factorizations. Researchers may use gradient-based automatic differentiation (which often struggles in nonnegative settings), choose between a limited set of methods with mature implementations, or…

February 4, 2026
Training-Free Self-Correction for Multimodal Masked Diffusion Models

Training-Free Self-Correction for Multimodal Masked Diffusion Models arXiv:2602.02927v1 Announce Type: new Abstract: Masked diffusion models have emerged as a powerful framework for text and multimodal generation. However, their sampling procedure updates multiple tokens simultaneously and treats generated tokens as immutable, which may lead to error accumulation when early mistakes cannot be revised. In this work,…

February 4, 2026
Plug-In Classification of Drift Functions in Diffusion Processes Using Neural Networks

Plug-In Classification of Drift Functions in Diffusion Processes Using Neural Networks arXiv:2602.02791v1 Announce Type: new Abstract: We study a supervised multiclass classification problem for diffusion processes, where each class is characterized by a distinct drift function and trajectories are observed at discrete times. Extending the one-dimensional multiclass framework of Denis et al. (2024) to multidimensional…

February 4, 2026
Routing in a Sparse Graph: a Distributed Q-Learning Approach

Routing in a Sparse Graph: a Distributed Q-Learning Approach Distributed agents need only decide one move ahead. The post Routing in a Sparse Graph: a Distributed Q-Learning Approach appeared first on Towards Data Science. Sébastien Gilbert Go to original source

February 4, 2026
YOLOv2 & YOLO9000 Paper Walkthrough: Better, Faster, Stronger

YOLOv2 & YOLO9000 Paper Walkthrough: Better, Faster, Stronger From YOLOv1 to YOLOv2: prior box, k-means, Darknet-19, passthrough layer, and more The post YOLOv2 & YOLO9000 Paper Walkthrough: Better, Faster, Stronger appeared first on Towards Data Science. Muhammad Ardi Go to original source

February 4, 2026
Creating a Data Pipeline to Monitor Local Crime Trends

Creating a Data Pipeline to Monitor Local Crime Trends A walkthrough of creating an ETL pipeline to extract local crime data and visualize it in Metabase. The post Creating a Data Pipeline to Monitor Local Crime Trends appeared first on Towards Data Science. Jimin Kang Go to original source

February 4, 2026
The Proximity of the Inception Score as an Evaluation Criterion

The Proximity of the Inception Score as an Evaluation Criterion The neighborhood of synthetic data The post The Proximity of the Inception Score as an Evaluation Criterion appeared first on Towards Data Science. Giuseppe Pio Cannata Go to original source

February 4, 2026
Neuron Block Dynamics for XOR Classification with Zero-Margin

Neuron Block Dynamics for XOR Classification with Zero-Margin arXiv:2602.00172v1 Announce Type: new Abstract: The ability of neural networks to learn useful features through stochastic gradient descent (SGD) is a cornerstone of their success. Most theoretical analyses focus on regression or on classification tasks with a positive margin, where worst-case gradient bounds suffice. In contrast, we…

February 3, 2026
Uncertainty-Aware Multimodal Learning via Conformal Shapley Intervals

Uncertainty-Aware Multimodal Learning via Conformal Shapley Intervals arXiv:2602.00171v1 Announce Type: new Abstract: Multimodal learning combines information from multiple data modalities to improve predictive performance. However, modalities often contribute unequally and in a data dependent way, making it unclear which data modalities are genuinely informative and to what extent their contributions can be trusted. Quantifying modality…

February 3, 2026
Singular Bayesian Neural Networks

Singular Bayesian Neural Networks arXiv:2602.00387v1 Announce Type: new Abstract: Bayesian neural networks promise calibrated uncertainty but require $O(mn)$ parameters for standard mean-field Gaussian posteriors. We argue this cost is often unnecessary, particularly when weight matrices exhibit fast singular value decay. By parameterizing weights as $W = AB^{\top}$ with $A \in \mathbb{R}^{m \times r}$, $B \in…

February 3, 2026
Alignment of Diffusion Model and Flow Matching for Text-to-Image Generation

Alignment of Diffusion Model and Flow Matching for Text-to-Image Generation arXiv:2602.00413v1 Announce Type: new Abstract: Diffusion models and flow matching have demonstrated remarkable success in text-to-image generation. While many existing alignment methods primarily focus on fine-tuning pre-trained generative models to maximize a given reward function, these approaches require extensive computational resources and may not generalize…

February 3, 2026
Reinforcement Learning for Control Systems with Time Delays: A Comprehensive Survey

Reinforcement Learning for Control Systems with Time Delays: A Comprehensive Survey arXiv:2602.00399v1 Announce Type: new Abstract: In the last decade, Reinforcement Learning (RL) has achieved remarkable success in the control and decision-making of complex dynamical systems. However, most RL algorithms rely on the Markov Decision Process assumption, which is violated in practical cyber-physical systems affected…

February 3, 2026
Building Systems That Survive Real Life

Building Systems That Survive Real Life Sara Nobrega on the transition from data science to AI engineering, using LLMs as a bridge to DevOps, and the one engineering skill junior data scientists need to stay competitive. The post Building Systems That Survive Real Life appeared first on Towards Data Science. TDS Editors Go to original…

February 3, 2026
Silicon Darwinism: Why Scarcity Is the Source of True Intelligence

Silicon Darwinism: Why Scarcity Is the Source of True Intelligence We are confusing “size” with “smart.” The next leap in artificial intelligence will not come from a larger data center, but from a more constrained environment. The post Silicon Darwinism: Why Scarcity Is the Source of True Intelligence appeared first on Towards Data Science. Aakash…

February 3, 2026
Amortized Simulation-Based Inference in Generalized Bayes via Neural Posterior Estimation

Amortized Simulation-Based Inference in Generalized Bayes via Neural Posterior Estimation arXiv:2601.22367v1 Announce Type: new Abstract: Generalized Bayesian Inference (GBI) tempers a loss with a temperature $\beta>0$ to mitigate overconfidence and improve robustness under model misspecification, but existing GBI methods typically rely on costly MCMC or SDE-based samplers and must be re-run for each new dataset…

February 2, 2026
Dependence-Aware Label Aggregation for LLM-as-a-Judge via Ising Models

Dependence-Aware Label Aggregation for LLM-as-a-Judge via Ising Models arXiv:2601.22336v1 Announce Type: new Abstract: Large-scale AI evaluation increasingly relies on aggregating binary judgments from $K$ annotators, including LLMs used as judges. Most classical methods, e.g., Dawid-Skene or (weighted) majority voting, assume annotators are conditionally independent given the true label $Y \in \{0,1\}$, an assumption often violated by LLM…

February 2, 2026
It’s all the (Exponential) Family: An Equivalence between Maximum Likelihood Estimation and Control Variates for Sketching Algorithms

It’s all the (Exponential) Family: An Equivalence between Maximum Likelihood Estimation and Control Variates for Sketching Algorithms arXiv:2601.22378v1 Announce Type: new Abstract: Maximum likelihood estimators (MLE) and control variate estimators (CVE) have been used in conjunction with known information across sketching algorithms and applications in machine learning. We prove that under certain conditions in an…

February 2, 2026
Simulation-based Bayesian inference with ameliorative learned summary statistics — Part I

Simulation-based Bayesian inference with ameliorative learned summary statistics — Part I arXiv:2601.22441v1 Announce Type: new Abstract: This paper, which is Part 1 of a two-part paper series, considers a simulation-based inference with learned summary statistics, in which such a learned summary statistic serves as an empirical-likelihood with ameliorative effects in the Bayesian setting, when the…

February 2, 2026
Corrected Samplers for Discrete Flow Models

Corrected Samplers for Discrete Flow Models arXiv:2601.22519v1 Announce Type: new Abstract: Discrete flow models (DFMs) have been proposed to learn the data distribution on a finite state space, offering a flexible framework as an alternative to discrete diffusion models. A line of recent work has studied samplers for discrete diffusion models, such as tau-leaping and…

February 2, 2026
Weekly Entering & Transitioning – Thread 02 Feb, 2026 – 09 Feb, 2026

Weekly Entering & Transitioning – Thread 02 Feb, 2026 – 09 Feb, 2026 Welcome to this week’s entering & transitioning thread! This thread is for any questions about getting started, studying, or transitioning into the data science field. Topics include: Learning resources (e.g. books, tutorials, videos) Traditional education (e.g. schools, degrees, electives) Alternative education (e.g.…

February 2, 2026
Am I drifting away from Data Science, or building useful foundations? (2 YOE working in a startup, no coding)

Am I drifting away from Data Science, or building useful foundations? (2 YOE working in a startup, no coding) I’m looking for some career perspective and would really appreciate advice from people working in or around data science. I’m currently not sure where exactly my career is heading and want to start a business eventually…

February 2, 2026
Brainstorming around the visualization of customer segment data

Brainstorming around the visualization of customer segment data submitted by /u/SingerEast1469 [link] [comments] /u/SingerEast1469 Go to original source

February 2, 2026
What separates data scientists who earn a good living (100k-200k) from those who earn 300k+ at FAANG?

What separates data scientists who earn a good living (100k-200k) from those who earn 300k+ at FAANG? Is it just stock options and vesting? Or is it just FAANG is a lot of work. Why do some data scientists deserve that much? I work at a Fortune 500 and the ceiling for IC data scientists…

February 2, 2026
Building “Auto-Analyst” — A data analytics AI agentic system

Building “Auto-Analyst” — A data analytics AI agentic system submitted by /u/phicreative1997 [link] [comments] /u/phicreative1997 Go to original source

February 2, 2026
Distributed Reinforcement Learning for Scalable High-Performance Policy Optimization

Distributed Reinforcement Learning for Scalable High-Performance Policy Optimization Leveraging massive parallelism, asynchronous updates, and multi-machine training to match and exceed human-level performance The post Distributed Reinforcement Learning for Scalable High-Performance Policy Optimization appeared first on Towards Data Science. Sam Black Go to original source

February 2, 2026
How to Apply Agentic Coding to Solve Problems

How to Apply Agentic Coding to Solve Problems Learn how to efficiently solve problems with coding agents The post How to Apply Agentic Coding to Solve Problems appeared first on Towards Data Science. Eivind Kjosbakken Go to original source

February 1, 2026
How to Run Claude Code for Free with Local and Cloud Models from Ollama

How to Run Claude Code for Free with Local and Cloud Models from Ollama Ollama now offers Anthropic API compatibility The post How to Run Claude Code for Free with Local and Cloud Models from Ollama appeared first on Towards Data Science. Thomas Reid Go to original source

February 1, 2026
Creating an Etch A Sketch App Using Python and Turtle

Creating an Etch A Sketch App Using Python and Turtle A beginner-friendly Python tutorial The post Creating an Etch A Sketch App Using Python and Turtle appeared first on Towards Data Science. Mahnoor Javed Go to original source

January 31, 2026
Why Your Multi-Agent System is Failing: Escaping the 17x Error Trap of the “Bag of Agents”

Why Your Multi-Agent System is Failing: Escaping the 17x Error Trap of the “Bag of Agents” Hard-won lessons on how to scale agentic systems without scaling the chaos, including a taxonomy of core agent types. The post Why Your Multi-Agent System is Failing: Escaping the 17x Error Trap of the “Bag of Agents” appeared first…

January 31, 2026
On the Possibility of Small Networks for Physics-Informed Learning

On the Possibility of Small Networks for Physics-Informed Learning A new kind of hyperparameter study The post On the Possibility of Small Networks for Physics-Informed Learning appeared first on Towards Data Science. Conor Rowan Go to original source

January 31, 2026
Multi-Attribute Decision Matrices, Done Right

Multi-Attribute Decision Matrices, Done Right How to structure decisions, identify efficient options, and avoid misleading value metrics The post Multi-Attribute Decision Matrices, Done Right appeared first on Towards Data Science. Josiah DeValois Go to original source

January 31, 2026
TDS Newsletter: January Must-Reads on Data Platforms, Infinite Context, and More

TDS Newsletter: January Must-Reads on Data Platforms, Infinite Context, and More Don’t miss our most-read and -shared stories of the past month The post TDS Newsletter: January Must-Reads on Data Platforms, Infinite Context, and More appeared first on Towards Data Science. TDS Editors Go to original source

January 31, 2026
Latent-IMH: Efficient Bayesian Inference for Inverse Problems with Approximate Operators

Latent-IMH: Efficient Bayesian Inference for Inverse Problems with Approximate Operators arXiv:2601.20888v1 Announce Type: new Abstract: We study sampling from posterior distributions in Bayesian linear inverse problems where $A$, the parameters to observables operator, is computationally expensive. In many applications, $A$ can be factored in a manner that facilitates the construction of a cost-effective approximation $\tilde{A}$.…

January 30, 2026
Efficient Causal Structure Learning via Modular Subgraph Integration

Efficient Causal Structure Learning via Modular Subgraph Integration arXiv:2601.21014v1 Announce Type: new Abstract: Learning causal structures from observational data remains a fundamental yet computationally intensive task, particularly in high-dimensional settings where existing methods face challenges such as the super-exponential growth of the search space and increasing computational demands. To address this, we introduce VISTA (Voting-based…

January 30, 2026
A Diffusive Classification Loss for Learning Energy-based Generative Models

A Diffusive Classification Loss for Learning Energy-based Generative Models arXiv:2601.21025v1 Announce Type: new Abstract: Score-based generative models have recently achieved remarkable success. While they are usually parameterized by the score, an alternative way is to use a series of time-dependent energy-based models (EBMs), where the score is obtained from the negative input-gradient of the energy.…

January 30, 2026
Diffusion-based Annealed Boltzmann Generators: benefits, pitfalls and hopes

Diffusion-based Annealed Boltzmann Generators: benefits, pitfalls and hopes arXiv:2601.21026v1 Announce Type: new Abstract: Sampling configurations at thermodynamic equilibrium is a central challenge in statistical physics. Boltzmann Generators (BGs) tackle it by combining a generative model with a Monte Carlo (MC) correction step to obtain asymptotically unbiased samples from an unnormalized target. Most current BGs…

January 30, 2026
An efficient, accurate, and interpretable machine learning method for computing probability of failure

An efficient, accurate, and interpretable machine learning method for computing probability of failure arXiv:2601.21089v1 Announce Type: new Abstract: We introduce a novel machine learning method called the Penalized Profile Support Vector Machine based on the Gabriel edited set for the computation of the probability of failure for a complex system as determined by a threshold…

January 30, 2026
Optimizing Vector Search: Why You Should Flatten Structured Data

Optimizing Vector Search: Why You Should Flatten Structured Data An analysis of how flattening structured data can boost precision and recall by up to 20% The post Optimizing Vector Search: Why You Should Flatten Structured Data appeared first on Towards Data Science. Oleg Tereshin Go to original source

January 30, 2026
RoPE, Clearly Explained

RoPE, Clearly Explained Going beyond the math to build intuition The post RoPE, Clearly Explained appeared first on Towards Data Science. Lorenzo Cesconetto Go to original source

January 30, 2026
The Unbearable Lightness of Coding

The Unbearable Lightness of Coding Confessions of a vibe coder The post The Unbearable Lightness of Coding appeared first on Towards Data Science. Elena Jolkver Go to original source

January 30, 2026
Randomization Works in Experiments, Even Without Balance

Randomization Works in Experiments, Even Without Balance Randomization usually balances confounders in experiments, but what happens when it doesn’t? The post Randomization Works in Experiments, Even Without Balance appeared first on Towards Data Science. Jarom Hulet Go to original source

January 30, 2026
Deep Neural Networks as Iterated Function Systems and a Generalization Bound

Deep Neural Networks as Iterated Function Systems and a Generalization Bound arXiv:2601.19958v1 Announce Type: new Abstract: Deep neural networks (DNNs) achieve remarkable performance on a wide range of tasks, yet their mathematical analysis remains fragmented: stability and generalization are typically studied in disparate frameworks and on a case-by-case basis. Architecturally, DNNs rely on the recursive…

January 29, 2026
Minimax Rates for Hyperbolic Hierarchical Learning

Minimax Rates for Hyperbolic Hierarchical Learning arXiv:2601.20047v1 Announce Type: new Abstract: We prove an exponential separation in sample complexity between Euclidean and hyperbolic representations for learning on hierarchical data under standard Lipschitz regularization. For depth-$R$ hierarchies with branching factor $m$, we first establish a geometric obstruction for Euclidean space: any bounded-radius embedding forces volumetric collapse,…

January 29, 2026
Efficient Evaluation of LLM Performance with Statistical Guarantees

Efficient Evaluation of LLM Performance with Statistical Guarantees arXiv:2601.20251v1 Announce Type: new Abstract: Exhaustively evaluating many large language models (LLMs) on a large suite of benchmarks is expensive. We cast benchmarking as finite-population inference and, under a fixed query budget, seek tight confidence intervals (CIs) for model accuracy with valid frequentist coverage. We propose Factorized…

January 29, 2026
Empirical Likelihood-Based Fairness Auditing: Distribution-Free Certification and Flagging

Empirical Likelihood-Based Fairness Auditing: Distribution-Free Certification and Flagging arXiv:2601.20269v1 Announce Type: new Abstract: Machine learning models in high-stakes applications, such as recidivism prediction and automated personnel selection, often exhibit systematic performance disparities across sensitive subpopulations, raising critical concerns regarding algorithmic bias. Fairness auditing addresses these risks through two primary functions: certification, which verifies adherence to…

January 29, 2026
Physics-informed Blind Reconstruction of Dense Fields from Sparse Measurements using Neural Networks with a Differentiable Simulator

Physics-informed Blind Reconstruction of Dense Fields from Sparse Measurements using Neural Networks with a Differentiable Simulator arXiv:2601.20496v1 Announce Type: new Abstract: Generating dense physical fields from sparse measurements is a fundamental question in sampling, signal processing, and many other applications. State-of-the-art methods either use spatial statistics or rely on examples of dense fields in the…

January 29, 2026
Federated Learning, Part 2: Implementation with the Flower Framework 🌼

Federated Learning, Part 2: Implementation with the Flower Framework 🌼 Implementing cross-silo federated learning step by step The post Federated Learning, Part 2: Implementation with the Flower Framework 🌼 appeared first on Towards Data Science. Parul Pandey Go to original source

January 29, 2026
Machine Learning in Production? What This Really Means

Machine Learning in Production? What This Really Means From notebooks to real-world systems The post Machine Learning in Production? What This Really Means appeared first on Towards Data Science. Sabrine Bendimerad Go to original source

January 29, 2026
I Ditched My Mouse: How I Control My Computer With Hand Gestures (In 60 Lines of Python)

I Ditched My Mouse: How I Control My Computer With Hand Gestures (In 60 Lines of Python) A step-by-step guide to building a “Minority Report”-style interface using OpenCV and MediaPipe The post I Ditched My Mouse: How I Control My Computer With Hand Gestures (In 60 Lines of Python) appeared first on Towards Data Science.…

January 29, 2026
Modeling Urban Walking Risk Using Spatial-Temporal Machine Learning

Modeling Urban Walking Risk Using Spatial-Temporal Machine Learning Estimating neighborhood-level pedestrian risk from real-world incident data The post Modeling Urban Walking Risk Using Spatial-Temporal Machine Learning appeared first on Towards Data Science. Aneesh Patil Go to original source

January 29, 2026
Statistical Inference for Explainable Boosting Machines

Statistical Inference for Explainable Boosting Machines arXiv:2601.18857v1 Announce Type: new Abstract: Explainable boosting machines (EBMs) are popular “glass-box” models that learn a set of univariate functions using boosting trees. These achieve explainability through visualizations of each feature’s effect. However, unlike linear model coefficients, uncertainty quantification for the learned univariate functions requires computationally intensive bootstrapping, making…

January 28, 2026
Implicit Q-Learning and SARSA: Liberating Policy Control from Step-Size Calibration

Implicit Q-Learning and SARSA: Liberating Policy Control from Step-Size Calibration arXiv:2601.18907v1 Announce Type: new Abstract: Q-learning and SARSA are foundational reinforcement learning algorithms whose practical success depends critically on step-size calibration. Step-sizes that are too large can cause numerical instability, while step-sizes that are too small can lead to slow progress. We propose implicit variants…

January 28, 2026
Collaborative Compressors in Distributed Mean Estimation with Limited Communication Budget

Collaborative Compressors in Distributed Mean Estimation with Limited Communication Budget arXiv:2601.18950v1 Announce Type: new Abstract: Distributed high dimensional mean estimation is a common aggregation routine used often in distributed optimization methods. Most of these applications call for a communication-constrained setting where vectors, whose mean is to be estimated, have to be compressed before sharing. One…

January 28, 2026
Convergence of Muon with Newton-Schulz

Convergence of Muon with Newton-Schulz arXiv:2601.19156v1 Announce Type: new Abstract: We analyze Muon as originally proposed and used in practice — using the momentum orthogonalization with a few Newton-Schulz steps. The prior theoretical results replace this key step in Muon with an exact SVD-based polar factor. We prove that Muon with Newton-Schulz converges to a…

January 28, 2026
Double Fairness Policy Learning: Integrating Action Fairness and Outcome Fairness in Decision-making

Double Fairness Policy Learning: Integrating Action Fairness and Outcome Fairness in Decision-making arXiv:2601.19186v1 Announce Type: new Abstract: Fairness is a central pillar of trustworthy machine learning, especially in domains where accuracy- or profit-driven optimization is insufficient. While most fairness research focuses on supervised learning, fairness in policy learning remains less explored. Because policy learning is…

January 28, 2026
Going Beyond the Context Window: Recursive Language Models in Action

Explore a practical approach to analysing massive datasets with LLMs. Originally published on Towards Data Science by Mariya Mansurova.

January 28, 2026
Data Science as Engineering: Foundations, Education, and Professional Identity

Recognize data science as an engineering practice and structure education accordingly. Originally published on Towards Data Science by Tom Narock.

January 28, 2026
From Connections to Meaning: Why Heterogeneous Graph Transformers (HGT) Change Demand Forecasting

How relationship-aware graphs turn connected forecasts into operational insight. Originally published on Towards Data Science by Partha Sarkar.

January 28, 2026
Layered Architecture for Building Readable, Robust, and Extensible Apps

If adding a feature feels like open-heart surgery on your codebase, the problem isn’t bugs, it’s structure. This article shows how better architecture reduces risk, speeds up change, and keeps teams moving. The post Layered Architecture for Building Readable, Robust, and Extensible Apps appeared first on…

January 28, 2026
Data-Driven Information-Theoretic Causal Bounds under Unmeasured Confounding

arXiv:2601.17160v1 Announce Type: new Abstract: We develop a data-driven information-theoretic framework for sharp partial identification of causal effects under unmeasured confounding. Existing approaches often rely on restrictive assumptions, such as bounded or discrete outcomes; require external inputs (for example, instrumental variables, proxies, or user-specified sensitivity parameters); necessitate full…

January 27, 2026
Error Analysis of Bayesian Inverse Problems with Generative Priors

arXiv:2601.17374v1 Announce Type: new Abstract: Data-driven methods for the solution of inverse problems have become widely popular in recent years thanks to the rise of machine learning techniques. A popular approach concerns the training of a generative model on additional data to learn a bespoke prior…

January 27, 2026
“Rebuilding” Statistics in the Age of AI: A Town Hall Discussion on Culture, Infrastructure, and Training

arXiv:2601.17510v1 Announce Type: new Abstract: This article presents the full, original record of the 2024 Joint Statistical Meetings (JSM) town hall, “Statistics in the Age of AI,” which convened leading statisticians to discuss how the field is evolving in…

January 27, 2026
Boosting methods for interval-censored data with regression and classification

arXiv:2601.17973v1 Announce Type: new Abstract: Boosting has garnered significant interest across both machine learning and statistical communities. Traditional boosting algorithms, designed for fully observed random samples, often struggle with real-world problems, particularly with interval-censored data. This type of data is common in survival analysis and time-to-event…

January 27, 2026
A Cherry-Picking Approach to Large Load Shaping for More Effective Carbon Reduction

arXiv:2601.17990v1 Announce Type: new Abstract: Shaping multi-megawatt loads, such as data centers, impacts generator dispatch on the electric grid, which in turn affects system CO2 emissions and energy cost. Substantiating the effectiveness of prevalent load shaping strategies, such as those based on grid-level…

January 27, 2026
How Cursor Actually Indexes Your Codebase

Exploring the RAG pipeline in Cursor that powers code indexing and retrieval for coding agents. Originally published on Towards Data Science by Kenneth Leung.

January 27, 2026
Ray: Distributed Computing For All, Part 2

Deploying and running Python code on cloud-based clusters. Originally published on Towards Data Science by Thomas Reid.

January 27, 2026
How Convolutional Neural Networks Learn Musical Similarity

Learning audio embeddings with contrastive learning and deploying them in a real music recommendation app. Originally published on Towards Data Science by Luke Stuckey.

January 27, 2026
Causal ML for the Aspiring Data Scientist

An accessible introduction to causal inference and ML. Originally published on Towards Data Science by Ross Lauterbach.

January 27, 2026
Distributional Computational Graphs: Error Bounds

arXiv:2601.16250v1 Announce Type: new Abstract: We study a general framework of distributional computational graphs: computational graphs whose inputs are probability distributions rather than point values. We analyze the discretization error that arises when these graphs are evaluated using finite approximations of continuous probability distributions. Such an approximation might be the…

January 26, 2026
Perfect Clustering for Sparse Directed Stochastic Block Models

arXiv:2601.16427v1 Announce Type: new Abstract: Exact recovery in stochastic block models (SBMs) is well understood in undirected settings, but remains considerably less developed for directed and sparse networks, particularly when the number of communities diverges. Spectral methods for directed SBMs often lack stability in asymmetric, low-degree regimes,…

January 26, 2026
Efficient Learning of Stationary Diffusions with Stein-type Discrepancies

arXiv:2601.16597v1 Announce Type: new Abstract: Learning a stationary diffusion amounts to estimating the parameters of a stochastic differential equation whose stationary distribution matches a target distribution. We build on the recently introduced kernel deviation from stationarity (KDS), which enforces stationarity by evaluating expectations of the diffusion’s generator…

January 26, 2026
Towards Latent Diffusion Suitable For Text

arXiv:2601.16220v1 Announce Type: cross Abstract: Language diffusion models aim to improve sampling speed and coherence over autoregressive LLMs. We introduce Neural Flow Diffusion Models for language generation, an extension of NFDM that enables the straightforward application of continuous diffusion models to discrete state spaces. NFDM learns a multivariate forward…

January 26, 2026
Long-Term Probabilistic Forecast of Vegetation Conditions Using Climate Attributes in the Four Corners Region

arXiv:2601.16347v1 Announce Type: cross Abstract: Weather conditions can drastically alter the state of crops and rangelands, and in turn, impact the incomes and food security of individuals worldwide. Satellite-based remote sensing offers an effective way to monitor vegetation and climate variables…

January 26, 2026
Weekly Entering & Transitioning – Thread 26 Jan, 2026 – 02 Feb, 2026

Welcome to this week’s entering & transitioning thread! This thread is for any questions about getting started, studying, or transitioning into the data science field. Topics include: learning resources (e.g. books, tutorials, videos); traditional education (e.g. schools, degrees, electives); alternative education (e.g.…

January 26, 2026
SAM 3 vs. Specialist Models — A Performance Benchmark

Why specialized models still hold a 30x speed advantage in production environments. Originally published on Towards Data Science by Pushpak Bhoge.

January 26, 2026
Azure ML vs. AWS SageMaker: A Deep Dive into Model Training — Part 1

Compare Azure ML and AWS SageMaker for scalable model training, focusing on project setup, permission management, and data storage patterns, to align platform choices with your existing cloud ecosystem and preferred MLOps workflows. The post Azure ML vs. AWS SageMaker: A Deep…

January 26, 2026
How to Build a Neural Machine Translation System for a Low-Resource Language

An introduction to neural machine translation. Originally published on Towards Data Science by Kaixuan Chen.

January 25, 2026
Air for Tomorrow: Mapping the Digital Air-Quality Landscape, from Repositories and Data Types to Starter Code

Understand air quality: access the available data, interpret the data types, and run the starter code. Originally published on Towards Data Science by Prithviraj…

January 25, 2026
Optimizing Data Transfer in Distributed AI/ML Training Workloads

A deep dive on data transfer bottlenecks, their identification, and their resolution with the help of NVIDIA Nsight™ Systems (part 3). Originally published on Towards Data Science by Chaim Rand.

January 24, 2026
Achieving 5x Agentic Coding Performance with Few-Shot Prompting

Learn to leverage few-shot prompting to increase your LLM's performance. Originally published on Towards Data Science by Eivind Kjosbakken.

January 24, 2026
Why the Sophistication of Your Prompt Correlates Almost Perfectly with the Sophistication of the Response, as Research by Anthropic Found

How prompt engineering has evolved, examined scientifically, and the implications for the future of conversational AI tools. The post Why the Sophistication of Your Prompt Correlates Almost Perfectly with the Sophistication of the Response, as Research…

January 24, 2026
From Transactions to Trends: Predict When a Customer Is About to Stop Buying

Customer churn is usually a gradual process, not a sudden event. In this post, we analyze monthly transaction trends and convert regression slopes into degrees to clearly identify declining purchase behavior. A small negative slope today can prevent a big revenue loss…

January 24, 2026
TDS Newsletter: Beyond Prompt Engineering: The New Frontiers of LLM Optimization

Let’s zoom in on recent approaches that push AI-powered workflows to the next level. Originally published on Towards Data Science by TDS Editors.

January 24, 2026
Robust X-Learner: Breaking the Curse of Imbalance and Heavy Tails via Robust Cross-Imputation

arXiv:2601.15360v1 Announce Type: new Abstract: Estimating Heterogeneous Treatment Effects (HTE) in industrial applications such as AdTech and healthcare presents a dual challenge: extreme class imbalance and heavy-tailed outcome distributions. While the X-Learner framework effectively addresses imbalance through cross-imputation, we demonstrate that it…

January 23, 2026