Category: machine-learning

What Makes Quantum Machine Learning “Quantum”?

And where is it today? Originally published on Towards Data Science by Sara A. Metwalli.

March 7, 2026
AI in Multiple GPUs: ZeRO & FSDP

Learn how the Zero Redundancy Optimizer works, how to implement it from scratch, and how to use it in PyTorch. Originally published on Towards Data Science by Lorenzo Cesconetto.

March 6, 2026
5 Ways to Implement Variable Discretization

An overview of powerful methods for transforming continuous variables into discrete ones. Originally published on Towards Data Science by Rukshan Pramoditha.

March 5, 2026
Stop Tuning Hyperparameters. Start Tuning Your Problem.

80% of ML projects fail from bad problem framing, not bad models. A 5-step protocol to define the right problem before you write training code. Originally published on Towards Data Science by Kaushik Rajan.

March 5, 2026
RAG with Hybrid Search: How Does Keyword Search Work?

Understanding keyword search, TF-IDF, and BM25. Originally published on Towards Data Science by Maria Mouschoutzi.

March 5, 2026
I Quit My $130,000 ML Engineer Job After Learning 4 Lessons

What they don’t tell you about “dream tech jobs”. Originally published on Towards Data Science by Egor Howell.

March 4, 2026
The Machine Learning Lessons I’ve Learned This Month

February 2026: exchange with others, documentation, and MLOps. Originally published on Towards Data Science by Pascal Janetzky.

March 3, 2026
Scaling ML Inference on Databricks: Liquid or Partitioned? Salted or Not?

A case study on techniques to maximize your clusters. Originally published on Towards Data Science by Hector Mejia.

March 1, 2026
The Gap Between Junior and Senior Data Scientists Isn’t Code

Why my obsession with complex algorithms was actually holding my career back. Originally published on Towards Data Science by Benjamin Nweke.

February 28, 2026
Designing Data and AI Systems That Hold Up in Production

A system-level perspective on architecture, agents, and responsible scale. Originally published on Towards Data Science by TDS Editors.

February 27, 2026
A Generalizable MARL-LP Approach for Scheduling in Logistics

Part 1. Hybrid Solution for Dynamic Vehicle Routing: Context and Architecture. Originally published on Towards Data Science by Alexander Levin.

February 27, 2026
Scaling Feature Engineering Pipelines with Feast and Ray

Utilizing feature stores like Feast and distributed compute frameworks like Ray in production machine learning systems. Originally published on Towards Data Science by Kenneth Leung.

February 26, 2026
Aliasing in Audio, Easily Explained: From Wagon Wheels to Waveforms

Understanding the foundational distortion of digital audio from first principles, with worked examples and visual intuition. Originally published on Towards Data Science by Aman Agrawal.

February 26, 2026
Is the AI and Data Job Market Dead?

What you should be doing in the current job market. Originally published on Towards Data Science by Egor Howell.

February 24, 2026
AI in Multiple GPUs: Gradient Accumulation & Data Parallelism

Learn and implement gradient accumulation and data parallelism from scratch in PyTorch. Originally published on Towards Data Science by Lorenzo Cesconetto.

February 24, 2026
Build Effective Internal Tooling with Claude Code

Use Claude Code to quickly build completely personalized applications. Originally published on Towards Data Science by Eivind Kjosbakken.

February 24, 2026
Understanding the Chi-Square Test Beyond the Formula

How categorical data becomes statistical evidence. Originally published on Towards Data Science by Nikhil Dasari.

February 20, 2026
AlpamayoR1: Large Causal Reasoning Models for Autonomous Driving

All you need to know about Chain of Causation reasoning and the current state of Autonomous Driving! Originally published on Towards Data Science by Ryan Pégoud.

February 20, 2026
AI in Multiple GPUs: How GPUs Communicate

A deep dive into the hardware infrastructure that enables multi-GPU communication for AI workloads. Originally published on Towards Data Science by Lorenzo Cesconetto.

February 20, 2026
Use OpenClaw to Make a Personal AI Assistant

Learn how to set up OpenClaw as a personalized AI agent. Originally published on Towards Data Science by Eivind Kjosbakken.

February 18, 2026
Building a LangGraph Agent from Scratch

Everything you need to know to get started. Originally published on Towards Data Science by Vyacheslav Efimov.

February 18, 2026
The Strangest Bottleneck in Modern LLMs

Why insanely fast GPUs still can’t make LLMs feel instant. Originally published on Towards Data Science by Moulik Gupta.

February 17, 2026
The Evolving Role of the ML Engineer

Stephanie Kirmer on the $200 billion investment bubble, how AI companies can rebuild trust, and how her day-to-day work changed with the rise of LLMs. Originally published on Towards Data Science by TDS Editors.

February 14, 2026
Not All RecSys Problems Are Created Equal

How baseline strength, churn, and subjectivity determine complexity. Originally published on Towards Data Science by Diogo Leitão.

February 12, 2026
The Machine Learning Lessons I’ve Learned Last Month

Delayed January: deadlines, downtimes, and flow times. Originally published on Towards Data Science by Pascal Janetzky.

February 10, 2026
AWS vs. Azure: A Deep Dive into Model Training – Part 2

This article covers how Azure ML’s persistent, workspace-centric compute resources differ from AWS SageMaker’s on-demand, job-specific approach. It also explores environment customization options, from Azure’s curated and custom environments to SageMaker’s three levels of customization.

February 5, 2026
Routing in a Sparse Graph: a Distributed Q-Learning Approach

Distributed agents need only decide one move ahead. Originally published on Towards Data Science by Sébastien Gilbert.

February 4, 2026
Building Systems That Survive Real Life

Sara Nobrega on the transition from data science to AI engineering, using LLMs as a bridge to DevOps, and the one engineering skill junior data scientists need to stay competitive. Originally published on Towards Data Science by TDS Editors.

February 3, 2026
Silicon Darwinism: Why Scarcity Is the Source of True Intelligence

We are confusing “size” with “smart.” The next leap in artificial intelligence will not come from a larger data center, but from a more constrained environment. Originally published on Towards Data Science by Aakash…

February 3, 2026
Distributed Reinforcement Learning for Scalable High-Performance Policy Optimization

Leveraging massive parallelism, asynchronous updates, and multi-machine training to match and exceed human-level performance. Originally published on Towards Data Science by Sam Black.

February 2, 2026
How to Apply Agentic Coding to Solve Problems

Learn how to efficiently solve problems with coding agents. Originally published on Towards Data Science by Eivind Kjosbakken.

February 1, 2026
Why Your Multi-Agent System is Failing: Escaping the 17x Error Trap of the “Bag of Agents”

Hard-won lessons on how to scale agentic systems without scaling the chaos, including a taxonomy of core agent types.

January 31, 2026
On the Possibility of Small Networks for Physics-Informed Learning

A new kind of hyperparameter study. Originally published on Towards Data Science by Conor Rowan.

January 31, 2026
Optimizing Vector Search: Why You Should Flatten Structured Data

An analysis of how flattening structured data can boost precision and recall by up to 20%. Originally published on Towards Data Science by Oleg Tereshin.

January 30, 2026
Federated Learning, Part 2: Implementation with the Flower Framework 🌼

Implementing cross-silo federated learning step by step. Originally published on Towards Data Science by Parul Pandey.

January 29, 2026
Machine Learning in Production? What This Really Means

From notebooks to real-world systems. Originally published on Towards Data Science by Sabrine Bendimerad.

January 29, 2026
I Ditched My Mouse: How I Control My Computer With Hand Gestures (In 60 Lines of Python)

A step-by-step guide to building a “Minority Report”-style interface using OpenCV and MediaPipe. Originally published on Towards Data Science.

January 29, 2026
Modeling Urban Walking Risk Using Spatial-Temporal Machine Learning

Estimating neighborhood-level pedestrian risk from real-world incident data. Originally published on Towards Data Science by Aneesh Patil.

January 29, 2026
From Connections to Meaning: Why Heterogeneous Graph Transformers (HGT) Change Demand Forecasting

How relationship-aware graphs turn connected forecasts into operational insight. Originally published on Towards Data Science by Partha Sarkar.

January 28, 2026
How Convolutional Neural Networks Learn Musical Similarity

Learning audio embeddings with contrastive learning and deploying them in a real music recommendation app. Originally published on Towards Data Science by Luke Stuckey.

January 27, 2026
Causal ML for the Aspiring Data Scientist

An accessible introduction to causal inference and ML. Originally published on Towards Data Science by Ross Lauterbach.

January 27, 2026
SAM 3 vs. Specialist Models — A Performance Benchmark

Why specialized models still hold the 30x speed advantage in production environments. Originally published on Towards Data Science by Pushpak Bhoge.

January 26, 2026
Azure ML vs. AWS SageMaker: A Deep Dive into Model Training — Part 1

Compare Azure ML and AWS SageMaker for scalable model training, focusing on project setup, permission management, and data storage patterns, to align platform choices with existing cloud ecosystem and preferred MLOps workflows.

January 26, 2026
How to Build a Neural Machine Translation System for a Low-Resource Language

An introduction to neural machine translation. Originally published on Towards Data Science by Kaixuan Chen.

January 25, 2026
Why the Sophistication of Your Prompt Correlates Almost Perfectly with the Sophistication of the Response, as Research by Anthropic Found

How prompt engineering has evolved, examined scientifically, and the implications for the future of conversational AI tools.

January 24, 2026
Google Trends is Misleading You: How to Do Machine Learning with Google Trends Data

Google Trends is one of the most widely used tools for analysing human behaviour at scale. Journalists use it. Data scientists use it. Entire papers are built on it. But there is a fundamental property of Google Trends data that makes…

January 22, 2026
Building a Self-Healing Data Pipeline That Fixes Its Own Python Errors

How I built a self-healing pipeline that automatically fixes bad CSVs, schema changes, and weird delimiters. Originally published on Towards Data Science by Benjamin Nweke.

January 22, 2026
A Case for the T-statistic

And how it compares to the run-of-the-mill z-score. Originally published on Towards Data Science by Aniruddha Karajgi.

January 22, 2026
Bridging the Gap Between Research and Readability with Marco Hening Tallarico

Diluting complex research, spotting silent data leaks, and why the best way to learn is often backwards. Originally published on Towards Data Science by TDS Editors.

January 20, 2026
Time Series Isn’t Enough: How Graph Neural Networks Change Demand Forecasting

Why modeling SKUs as a network reveals what traditional forecasts miss. Originally published on Towards Data Science by Partha Sarkar.

January 20, 2026
Data Poisoning in Machine Learning: Why and How People Manipulate Training Data

Do you know where your data has been? Originally published on Towards Data Science by Stephanie Kirmer.

January 18, 2026
A Geometric Method to Spot Hallucinations Without an LLM Judge

Imagine a flock of birds in flight. There’s no leader. No central command. Each bird aligns with its neighbors, matching direction, adjusting speed, and maintaining coherence through purely local coordination. The result is global order emerging from local consistency. Now imagine one bird flying with the same…

January 18, 2026
When Shapley Values Break: A Guide to Robust Model Explainability

Shapley Values are one of the most common methods for explainability, yet they can be misleading. Discover how to overcome these limitations to achieve better insights. Originally published on Towards Data Science by Alon…

January 16, 2026
Topic Modeling Techniques for 2026: Seeded Modeling, LLM Integration, and Data Summaries

Seeded topic modeling, integration with LLMs, and training on summarized data are the fresh parts of the NLP toolkit. Originally published on Towards Data Science by Petr Koráb.

January 15, 2026
Why Your ML Model Works in Training But Fails in Production

Hard lessons from building production ML systems where data leaks, defaults lie, populations shift, and time does not behave the way we expect. Originally published on Towards Data Science by Sudheer Singamsetty…

January 14, 2026
Why 90% Accuracy in Text-to-SQL is 100% Useless

The eternal promise of self-service analytics. Originally published on Towards Data Science by Gary Zavaleta.

January 13, 2026
Optimizing Data Transfer in Batched AI/ML Inference Workloads

A deep dive on data transfer bottlenecks, their identification, and their resolution with the help of NVIDIA Nsight™ Systems, part 2. Originally published on Towards Data Science by Chaim Rand.

January 13, 2026
Federated Learning, Part 1: The Basics of Training Models Where the Data Lives

Understanding the foundations of federated learning. Originally published on Towards Data Science by Parul Pandey.

January 11, 2026
How LLMs Handle Infinite Context With Finite Memory

Achieving infinite context with 114× less memory. Originally published on Towards Data Science by Moulik Gupta.

January 10, 2026
Mastering Non-Linear Data: A Guide to Scikit-Learn’s SplineTransformer

Forget stiff lines and wild polynomials. Discover why Splines are the “Goldilocks” of feature engineering, offering the perfect balance of flexibility and discipline for non-linear data using Scikit-Learn’s SplineTransformer. Originally published on Towards Data Science by Gustavo Santos…

January 10, 2026
Teaching a Neural Network the Mandelbrot Set

And why Fourier features change everything. Originally published on Towards Data Science by Carlos Redondo.

January 10, 2026
Retrieval for Time-Series: How Looking Back Improves Forecasts

Why retrieval helps in time-series forecasting. We all know how it goes: time-series data is tricky. Traditional forecasting models are unprepared for incidents like sudden market crashes, black swan events, or rare weather patterns. Even large fancy models like Chronos sometimes struggle because they haven’t dealt…

January 9, 2026
Feature Detection, Part 3: Harris Corner Detection

Finding the most informative points in images. Originally published on Towards Data Science by Vyacheslav Efimov.

January 6, 2026
Stop Blaming the Data: A Better Way to Handle Covariance Shift

Instead of using shift as an excuse for poor performance, use Inverse Probability Weighting to estimate how your model should perform in the new environment. Originally published on Towards Data Science.

January 6, 2026
YOLOv1 Loss Function Walkthrough: Regression for All

An explanation of how YOLOv1 measures the correctness of its object detection and classification predictions. Originally published on Towards Data Science by Muhammad Ardi.

January 6, 2026
Optimizing Data Transfer in AI/ML Workloads

A deep dive on data transfer bottlenecks, their identification, and their resolution with the help of NVIDIA Nsight™ Systems. Originally published on Towards Data Science by Chaim Rand.

January 4, 2026
Drift Detection in Robust Machine Learning Systems

A prerequisite for long-term success of machine learning systems. Originally published on Towards Data Science by Morris Stallmann.

January 3, 2026
EDA in Public (Part 3): RFM Analysis for Customer Segmentation in Pandas

How to build, score, and interpret RFM segments step by step. Originally published on Towards Data Science by Ibrahim Salami.

January 2, 2026
Deep Reinforcement Learning: The Actor-Critic Method

Robot friends collaborate to learn to fly a drone. Originally published on Towards Data Science by Vedant Jumle.

January 2, 2026
Chunk Size as an Experimental Variable in RAG Systems

Understanding retrieval in RAG systems by experimenting with different chunk sizes. Originally published on Towards Data Science by Sarah Schürch.

January 1, 2026
The Machine Learning “Advent Calendar” Bonus 2: Gradient Descent Variants in Excel

Gradient Descent, Momentum, RMSProp, and Adam all aim for the same minimum. They do not change the destination, only the path. Each method adds a mechanism that fixes a limitation of the previous one, making the movement faster, more stable, or more adaptive.…

January 1, 2026
The Machine Learning “Advent Calendar” Bonus 1: AUC in Excel

AUC measures how well a model ranks positives above negatives, independent of any chosen threshold. Originally published on Towards Data Science by angela shi.
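
The ranking view of AUC in the excerpt above can be illustrated with a minimal Python sketch. It is not taken from the linked article (which works in Excel); the function name `auc_by_ranking` and the toy scores are assumptions made for illustration. It computes the fraction of (positive, negative) pairs in which the positive example receives the higher score, counting ties as half.

```python
def auc_by_ranking(scores, labels):
    """AUC as the share of (positive, negative) pairs ranked correctly.

    Hypothetical helper for illustration: labels are 1 (positive) or 0 (negative).
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # a tie counts as half a win
    return wins / (len(pos) * len(neg))


# Toy data (assumed): two positives, two negatives.
scores = [0.9, 0.8, 0.4, 0.3]
labels = [1, 0, 1, 0]
auc = auc_by_ranking(scores, labels)

# Rescaling the scores keeps their order, so the AUC is unchanged.
auc_rescaled = auc_by_ranking([10 * s for s in scores], labels)
```

Because only the order of the scores matters, no classification threshold appears anywhere in the computation. The pairwise loop is quadratic in the number of examples, so it suits small illustrations rather than large datasets.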

December 31, 2025
Machine Learning vs AI Engineer: What Are the Differences?

One of the most confusing questions in tech right now is: What is the difference between an AI engineer and a machine learning engineer? Both are six-figure jobs, but if you choose the wrong one, you could waste months of your career learning the wrong skills…

December 30, 2025
How to Facilitate Effective AI Programming

How to ensure your coding agent has the same context as you. Originally published on Towards Data Science by Eivind Kjosbakken.

December 30, 2025
Implementing Vibe Proving with Reinforcement Learning

How to make LLMs reason with verifiable, step-by-step logic (Part 2). Originally published on Towards Data Science by Jacopo Tagliabue.

December 30, 2025
Breaking the Hardware Barrier: Software FP8 for Older GPUs

Deep learning workloads are increasingly memory-bound, with GPU cores sitting idle while waiting for data transfers. FP8 precision solves this on newer hardware, but what about the millions of RTX 30 and 20 series GPUs already deployed? Feather demonstrates that software-based FP8 emulation through bitwise packing…

December 29, 2025
Exploring TabPFN: A Foundation Model Built for Tabular Data

Understanding the architecture, training pipeline and implementing TabPFN in practice. Originally published on Towards Data Science by Parul Pandey.

December 28, 2025
Keeping Probabilities Honest: The Jacobian Adjustment

An intuitive explanation of transforming random variables correctly. Originally published on Towards Data Science by Aniruddha Karajgi.

December 26, 2025
The Machine Learning “Advent Calendar” Day 24: Transformers for Text in Excel

An intuitive, step-by-step look at how Transformers use self-attention to turn static word embeddings into contextual representations, illustrated with simple examples and an Excel-friendly walkthrough. Originally published on Towards…

December 25, 2025
Is Your Model Time-Blind? The Case for Cyclical Feature Encoding

How cyclical encoding improves machine learning prediction. Originally published on Towards Data Science by Gustavo Santos.

December 25, 2025
The Machine Learning “Advent Calendar” Day 23: CNN in Excel

A step-by-step 1D CNN for text, built in Excel, where every filter, weight, and decision is fully visible. Originally published on Towards Data Science by angela shi.

December 24, 2025
Stop Retraining Blindly: Use PSI to Build a Smarter Monitoring Pipeline

A data scientist’s guide to population stability index (PSI). Originally published on Towards Data Science by Gustavo Santos.

December 24, 2025
The Machine Learning “Advent Calendar” Day 22: Embeddings in Excel

Understanding text embeddings through simple models and Excel. Originally published on Towards Data Science by angela shi.

December 23, 2025
The Machine Learning “Advent Calendar” Day 21: Gradient Boosted Decision Tree Regressor in Excel

Gradient descent in function space with decision trees. Originally published on Towards Data Science by angela shi.

December 23, 2025
The Machine Learning “Advent Calendar” Day 20: Gradient Boosted Linear Regression in Excel

From Random Ensembles to Optimization: Gradient Boosting Explained. Originally published on Towards Data Science by angela shi.

December 23, 2025
Tools for Your LLM: a Deep Dive into MCP

MCP is a key enabler for turning your LLM into an agent by providing it with tools to retrieve real-time information or perform actions. In this deep dive we cover how MCP works, when to use it, and what to watch out for.

December 22, 2025
Understanding the Generative AI User

What do regular technology users think (and know) about AI? Stephanie Kirmer, Towards Data Science.

December 21, 2025
EDA in Public (Part 2): Product Deep Dive & Time-Series Analysis in Pandas

Learn how to analyze product performance, extract time-series features, and uncover key seasonal trends in your sales data. Ibrahim Salami, Towards Data Science.

December 21, 2025
The Machine Learning “Advent Calendar” Day 19: Bagging in Excel

Understanding ensemble learning from first principles in Excel. Angela Shi, Towards Data Science.

December 20, 2025
The Machine Learning “Advent Calendar” Day 18: Neural Network Classifier in Excel

Understanding forward propagation and backpropagation through explicit formulas. Angela Shi, Towards Data Science.

December 19, 2025
A Practical Toolkit for Time Series Anomaly Detection, Using Python

Here’s how to detect point anomalies within each series, and identify anomalous signals across the whole bank. Piero Paialunga, Towards Data Science.

December 18, 2025
The Machine Learning “Advent Calendar” Day 17: Neural Network Regressor in Excel

Neural networks often feel like black boxes. In this article, we build a neural network regressor from scratch using only Excel formulas. By making every step explicit, from forward propagation to backpropagation, we show how a neural network learns to approximate non-linear functions…

December 18, 2025
The Machine Learning “Advent Calendar” Day 16: Kernel Trick in Excel

Kernel SVM often feels abstract, with kernels, dual formulations, and support vectors. In this article, we take a different path. Starting from Kernel Density Estimation, we build Kernel SVM step by step as a sum of local bells, weighted and selected by hinge loss,…

December 17, 2025
Lessons Learned After 8 Years of Machine Learning

Deep work, over-identification, sports, and blogging. Pascal Janetzky, Towards Data Science.

December 17, 2025
The Machine Learning “Advent Calendar” Day 15: SVM in Excel

Instead of starting with margins and geometry, this article builds the Support Vector Machine step by step from familiar models. By changing the loss function and reusing regularization, SVM appears naturally as a linear classifier trained by optimization. This perspective unifies logistic regression, SVM, and…

December 16, 2025
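The SVM entry above describes the model as a linear classifier whose only change from logistic regression is the loss function. A minimal sketch of that idea follows, assuming labels in {-1, +1} and an L2 penalty on the weights; the function names and the `lam` parameter are illustrative and not taken from the article.

```python
import math

def linear_score(x, w, b):
    """Linear score w.x + b, shared by logistic regression and a linear SVM."""
    return sum(w_j * x_j for w_j, x_j in zip(w, x)) + b

def logistic_loss(y, score):
    """Logistic loss for a label y in {-1, +1}."""
    return math.log(1.0 + math.exp(-y * score))

def hinge_loss(y, score):
    """Hinge loss for a label y in {-1, +1}: zero once the margin y*score reaches 1."""
    return max(0.0, 1.0 - y * score)

def l2_penalty(w, lam):
    """Ridge-style penalty on the weights (assumed form: 0.5 * lam * ||w||^2)."""
    return 0.5 * lam * sum(w_j * w_j for w_j in w)

def objective(data, w, b, lam, loss):
    """Average loss over (x, y) pairs plus the L2 penalty.

    Passing hinge_loss gives a linear SVM objective;
    passing logistic_loss gives regularized logistic regression.
    """
    avg_loss = sum(loss(y, linear_score(x, w, b)) for x, y in data) / len(data)
    return avg_loss + l2_penalty(w, lam)

# Hypothetical toy data: two points per class.
data = [([2.0, 1.0], 1), ([1.5, 2.0], 1), ([-1.0, -1.5], -1), ([-2.0, -0.5], -1)]
w, b, lam = [0.5, 0.5], 0.0, 0.1

svm_objective = objective(data, w, b, lam, hinge_loss)
logreg_objective = objective(data, w, b, lam, logistic_loss)
```

Swapping the `loss` argument is the only difference between the two objectives; the score function and the penalty are reused unchanged.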
Lessons Learned from Upgrading to LangChain 1.0 in Production

What worked, what broke, and why I did it. Clara Chong, Towards Data Science.

December 16, 2025
The Machine Learning “Advent Calendar” Day 14: Softmax Regression in Excel

Softmax Regression is simply Logistic Regression extended to multiple classes. By computing one linear score per class and normalizing them with Softmax, we obtain multiclass probabilities without changing the core logic. The loss, the gradients, and the optimization remain the same. Only the number…

December 15, 2025
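The softmax regression entry above describes computing one linear score per class and normalizing the scores with softmax. A small sketch of that forward pass is given here, assuming a weight vector and bias per class; the toy weights and input are hypothetical and only serve to illustrate the computation.

```python
import math

def softmax(scores):
    """Turn a list of scores into probabilities that sum to 1."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def class_scores(x, weights, biases):
    """One linear score per class: weights[k].x + biases[k]."""
    return [
        sum(w_j * x_j for w_j, x_j in zip(w_k, x)) + b_k
        for w_k, b_k in zip(weights, biases)
    ]

def predict_proba(x, weights, biases):
    """Multiclass probabilities for a single input x."""
    return softmax(class_scores(x, weights, biases))

def cross_entropy(probs, true_class):
    """Negative log-probability assigned to the true class."""
    return -math.log(probs[true_class])

# Hypothetical 3-class model on 2 features.
weights = [[1.0, 0.0], [0.0, 1.0], [-1.0, -1.0]]
biases = [0.0, 0.0, 0.0]

probs = predict_proba([2.0, 1.0], weights, biases)
loss = cross_entropy(probs, true_class=0)
```

With two classes and the second score fixed at zero, `softmax` reduces to the logistic sigmoid of the first score, which is the sense in which softmax regression extends logistic regression.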
The Skills That Bridge Technical Work and Business Impact

In the Author Spotlight series, TDS Editors chat with members of our community about their career path in data science and AI, their writing, and their sources of inspiration. Today, we’re thrilled to share our conversation with Maria Mouschoutzi. Maria is a Data Analyst and Project…

December 15, 2025
The Machine Learning “Advent Calendar” Day 13: LASSO and Ridge Regression in Excel

Ridge and Lasso regression are often perceived as more complex versions of linear regression. In reality, the prediction model remains exactly the same. What changes is the training objective. By adding a penalty on the coefficients, regularization forces the model to choose…

December 14, 2025
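The Ridge and Lasso entry above notes that the prediction model stays the same and only the training objective changes. As a rough illustration, the sketch below fits a one-feature model with no intercept, where each objective has a closed-form minimizer. The objective scaling (0.5 times the squared error, `lam` as penalty strength) is an assumption made here for convenience, not the article's notation.

```python
def predict(x, w):
    """The prediction model is identical for OLS, Ridge, and Lasso."""
    return w * x

def fit_ols(xs, ys):
    """Minimize 0.5 * sum((y - w*x)^2)."""
    sxx = sum(x * x for x in xs)
    sxy = sum(x * y for x, y in zip(xs, ys))
    return sxy / sxx

def fit_ridge(xs, ys, lam):
    """Minimize 0.5 * sum((y - w*x)^2) + 0.5 * lam * w^2."""
    sxx = sum(x * x for x in xs)
    sxy = sum(x * y for x, y in zip(xs, ys))
    return sxy / (sxx + lam)

def soft_threshold(value, lam):
    """Shrink value toward zero by lam, clipping at zero."""
    if value > lam:
        return value - lam
    if value < -lam:
        return value + lam
    return 0.0

def fit_lasso(xs, ys, lam):
    """Minimize 0.5 * sum((y - w*x)^2) + lam * |w|."""
    sxx = sum(x * x for x in xs)
    sxy = sum(x * y for x, y in zip(xs, ys))
    return soft_threshold(sxy, lam) / sxx

# Hypothetical data generated by y = 2x.
xs = [1.0, 2.0, 3.0]
ys = [2.0, 4.0, 6.0]

w_ols = fit_ols(xs, ys)                 # unpenalized slope
w_ridge = fit_ridge(xs, ys, lam=14.0)   # shrunk toward zero, never exactly zero
w_lasso = fit_lasso(xs, ys, lam=14.0)   # shrunk toward zero
w_lasso_big = fit_lasso(xs, ys, lam=30.0)  # a large penalty sets the slope to exactly zero
```

In this one-feature setting, the Ridge penalty only rescales the coefficient, while the Lasso penalty can set it exactly to zero once `lam` exceeds the magnitude of the feature-target correlation term.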
NeurIPS 2025 Best Paper Review: Qwen’s Systematic Exploration of Attention Gating

This one little trick can bring about enhanced training stability, the use of larger learning rates, and improved scaling properties. Sean Moran, Towards Data Science.

December 14, 2025