Category: data-science

The Data Team’s Survival Guide for the Next Era of Data

6 pillars to declutter your stack, escape the service trap, and build the missing foundations for the new primary data consumer: the AI agent. The post The Data Team’s Survival Guide for the Next Era of Data appeared first on Towards Data Science. Mahdi…

March 7, 2026
How Human Work Will Remain Valuable in an AI World

The Road to Reality, Episode 1. By Favio Vázquez

March 6, 2026
5 Ways to Implement Variable Discretization

An overview of powerful methods for transforming continuous variables into discrete ones. By Rukshan Pramoditha

March 5, 2026
Stop Tuning Hyperparameters. Start Tuning Your Problem.

80% of ML projects fail from bad problem framing, not bad models. A 5-step protocol to define the right problem before you write training code. By Kaushik Rajan

March 5, 2026
RAG with Hybrid Search: How Does Keyword Search Work?

Understanding keyword search, TF-IDF, and BM25. By Maria Mouschoutzi

March 5, 2026
Graph Coloring You Can See

Visual intuition with Python. By Rhyd Lewis

March 4, 2026
Why You Should Stop Writing Loops in Pandas

How to think in columns, write faster code, and finally use Pandas like a professional. By Ibrahim Salami

March 4, 2026
Scaling ML Inference on Databricks: Liquid or Partitioned? Salted or Not?

A case study on techniques to maximize your clusters. By Hector Mejia

March 1, 2026
The Gap Between Junior and Senior Data Scientists Isn’t Code

Why my obsession with complex algorithms was actually holding my career back. By Benjamin Nweke

February 28, 2026
Designing Data and AI Systems That Hold Up in Production

A system-level perspective on architecture, agents, and responsible scale. By TDS Editors

February 27, 2026
Scaling Feature Engineering Pipelines with Feast and Ray

Utilizing feature stores like Feast and distributed compute frameworks like Ray in production machine learning systems. By Kenneth Leung

February 26, 2026
How to Define the Modeling Scope of an Internal Credit Risk Model

Dataset construction for Internal Ratings-Based (IRB) Probability of Default (PD) models. By JUNIOR JUMBONG

February 26, 2026
Decisioning at the Edge: Policy Matching at Scale

Policy-to-Agency Optimization with PuLP. By Erika Gomes-Gonçalves

February 25, 2026
Is the AI and Data Job Market Dead?

What you should be doing in the current job market. By Egor Howell

February 24, 2026
Architecting GPUaaS for Enterprise AI On-Prem

Multi-tenancy, scheduling, and cost modeling on Kubernetes. By Joe Sasson

February 22, 2026
From Monolith to Contract-Driven Data Mesh

A pragmatic journey using website analytics as a real-world example. By Corné POTGIETER

February 21, 2026
The Missing Curriculum: Essential Concepts For Data Scientists in the Age of AI Coding Agents

AI can write the code, but you have to steer the ship. Master the knowledge to keep you relevant in the age of AI.

February 20, 2026
Understanding the Chi-Square Test Beyond the Formula

How categorical data becomes statistical evidence. By Nikhil Dasari

February 20, 2026
Why Every Analytics Engineer Needs to Understand Data Architecture

Get the data architecture right, and everything else becomes easier. I know it sounds simple, but in reality, little nuances in designing your data architecture may have costly implications. This article provides a crash course on the architectures that shape your daily decisions – from relational…

February 19, 2026
Your First 90 Days as a Data Scientist

A practical onboarding checklist for building trust, business fluency, and data intuition. By Yu Dong

February 15, 2026
How to Leverage Explainable AI for Better Business Decisions

Moving beyond the black box to turn complex model outputs into actionable organizational strategies. By Rodrigo Almeida

February 13, 2026
Building an AI Agent to Detect and Handle Anomalies in Time-Series Data

Combining statistical detection with agentic decision-making. By MADHURA RAUT

February 12, 2026
How to Model The Expected Value of Marketing Campaigns

The approach that takes companies to the next level of data maturity. By Rodrigo Almeida

February 11, 2026
What I Am Doing to Stay Relevant as a Senior Analytics Consultant in 2026

Learn how to work with AI, while strengthening your unique human skills that technology cannot replace. By Rashi Desai

February 8, 2026
Pydantic Performance: 4 Tips on How to Validate Large Amounts of Data Efficiently

The real value lies in writing clearer code and using your tools right. By Mike Huls

February 7, 2026
Why Is My Code So Slow? A Guide to Py-Spy Python Profiling

Stop guessing and start diagnosing performance issues using Py-Spy. By Kenneth McCarthy

February 6, 2026
The Rule Everyone Misses: How to Stop Confusing loc and iloc in Pandas

A simple mental model to remember when each one works (with examples that finally click). By Ibrahim Salami

February 6, 2026
AWS vs. Azure: A Deep Dive into Model Training – Part 2

This article covers how Azure ML’s persistent, workspace-centric compute resources differ from AWS SageMaker’s on-demand, job-specific approach. It also explores environment customization options, from Azure’s curated and custom environments to SageMaker’s three levels of customization.

February 5, 2026
Creating a Data Pipeline to Monitor Local Crime Trends

A walkthrough of creating an ETL pipeline to extract local crime data and visualize it in Metabase. By Jimin Kang

February 4, 2026
The Proximity of the Inception Score as an Evaluation Criterion

The neighborhood of synthetic data. By Giuseppe Pio Cannata

February 4, 2026
Building Systems That Survive Real Life

Sara Nobrega on the transition from data science to AI engineering, using LLMs as a bridge to DevOps, and the one engineering skill junior data scientists need to stay competitive. By TDS Editors

February 3, 2026
Multi-Attribute Decision Matrices, Done Right

How to structure decisions, identify efficient options, and avoid misleading value metrics. By Josiah DeValois

January 31, 2026
Randomization Works in Experiments, Even Without Balance

Randomization usually balances confounders in experiments, but what happens when it doesn’t? By Jarom Hulet

January 30, 2026
Federated Learning, Part 2: Implementation with the Flower Framework 🌼

Implementing cross-silo federated learning step by step. By Parul Pandey

January 29, 2026
I Ditched My Mouse: How I Control My Computer With Hand Gestures (In 60 Lines of Python)

A step-by-step guide to building a “Minority Report”-style interface using OpenCV and MediaPipe.

January 29, 2026
Modeling Urban Walking Risk Using Spatial-Temporal Machine Learning

Estimating neighborhood-level pedestrian risk from real-world incident data. By Aneesh Patil

January 29, 2026
Data Science as Engineering: Foundations, Education, and Professional Identity

Recognize data science as an engineering practice and structure education accordingly. By Tom Narock

January 28, 2026
From Connections to Meaning: Why Heterogeneous Graph Transformers (HGT) Change Demand Forecasting

How relationship-aware graphs turn connected forecasts into operational insight. By Partha Sarkar

January 28, 2026
Causal ML for the Aspiring Data Scientist

An accessible introduction to causal inference and ML. By Ross Lauterbach

January 27, 2026
Azure ML vs. AWS SageMaker: A Deep Dive into Model Training — Part 1

Compare Azure ML and AWS SageMaker for scalable model training, focusing on project setup, permission management, and data storage patterns, to align platform choices with your existing cloud ecosystem and preferred MLOps workflows.

January 26, 2026
How to Build a Neural Machine Translation System for a Low-Resource Language

An introduction to neural machine translation. By Kaixuan Chen

January 25, 2026
Air for Tomorrow: Mapping the Digital Air-Quality Landscape, from Repositories and Data Types to Starter Code

Understand air quality: access the available data, interpret data types, and execute starter code. By Prithviraj…

January 25, 2026
From Transactions to Trends: Predict When a Customer Is About to Stop Buying

Customer churn is usually a gradual process, not a sudden event. In this post, we analyze monthly transaction trends and convert regression slopes into degrees to clearly identify declining purchase behavior. A small negative slope today can prevent a big revenue loss…

January 24, 2026
Stop Writing Messy Boolean Masks: 10 Elegant Ways to Filter Pandas DataFrames

Master the art of readable, high-performance data selection using .query(), .isin(), and advanced vectorized logic. By Ibrahim Salami

January 23, 2026
What Other Industries Can Learn from Healthcare’s Knowledge Graphs

How shared meaning, evidence, and standards create durable semantic infrastructure. By Steve Hedden

January 23, 2026
Google Trends is Misleading You: How to Do Machine Learning with Google Trends Data

Google Trends is one of the most widely used tools for analysing human behaviour at scale. Journalists use it. Data scientists use it. Entire papers are built on it. But there is a fundamental property of Google Trends data that makes…

January 22, 2026
If You Want to Become a Data Scientist in 2026, Do This

Learn from my mistakes and fast track your data science career. By Egor Howell

January 22, 2026
Building a Self-Healing Data Pipeline That Fixes Its Own Python Errors

How I built a self-healing pipeline that automatically fixes bad CSVs, schema changes, and weird delimiters. By Benjamin Nweke

January 22, 2026
A Case for the T-statistic

And how it compares to the run-of-the-mill z-score. By Aniruddha Karajgi

January 22, 2026
Does Calendar-Based Time-Intelligence Change Custom Logic?

Let’s look at calculating the moving average over time. By Salvatore Cagliari

January 21, 2026
Bridging the Gap Between Research and Readability with Marco Hening Tallarico

Diluting complex research, spotting silent data leaks, and why the best way to learn is often backwards. By TDS Editors

January 20, 2026
Time Series Isn’t Enough: How Graph Neural Networks Change Demand Forecasting

Why modeling SKUs as a network reveals what traditional forecasts miss. By Partha Sarkar

January 20, 2026
Why Healthcare Leads in Knowledge Graphs

How science, regulation, collaboration, and public funding shaped the world’s most mature semantic infrastructure. By Steve Hedden

January 19, 2026
The Great Data Closure: Why Databricks and Snowflake Are Hitting Their Ceiling

Acquisitions, venture, and an increasingly competitive landscape all point to a market ceiling. By Hugo Lu

January 17, 2026
TDS Newsletter: Is It Time to Revisit RAG?

Let’s make sense of the current state of retrieval-augmented generation. By TDS Editors

January 17, 2026
The 2026 Goal Tracker: How I Built a Data-Driven Vision Board Using Python, Streamlit, and Neon

Designing a centralized system to track daily habits and long-term goals. By Sabrine Bendimerad

January 16, 2026
Why Human-Centered Data Analytics Matters More Than Ever

From optimizing metrics to designing meaning: putting people back into data-driven decisions. By Rashi Desai

January 15, 2026
What Is a Knowledge Graph — and Why It Matters

How structured knowledge became healthcare’s quiet advantage. By Steve Hedden

January 15, 2026
An introduction to AWS Bedrock

The how, why, what and where of Amazon’s LLM access layer. By Thomas Reid

January 14, 2026
Under the Uzès Sun: When Historical Data Reveals the Climate Change

Longer summers, milder winters: analysis of temperature trends in Uzès, France, year after year. By Marc Polizzi

January 14, 2026
Why Your ML Model Works in Training But Fails in Production

Hard lessons from building production ML systems where data leaks, defaults lie, populations shift, and time does not behave the way we expect. By Sudheer Singamsetty…

January 14, 2026
How AI Can Become Your Personal Language Tutor

How I used n8n to build AI study partners for learning Mandarin: vocabulary, listening, and pronunciation correction. By Samir Saci

January 13, 2026
Why 90% Accuracy in Text-to-SQL is 100% Useless

The eternal promise of self-service analytics. By Gary Zavaleta

January 13, 2026
Federated Learning, Part 1: The Basics of Training Models Where the Data Lives

Understanding the foundations of federated learning. By Parul Pandey

January 11, 2026
Beyond the Flat Table: Building an Enterprise-Grade Financial Model in Power BI

A step-by-step journey through data transformation, star schema modeling, and DAX variance analysis with lessons learned along the way. By Ibrahim Salami

January 11, 2026
Data Science Spotlight: Selected Problems from Advent of Code 2025

Hands-on walkthroughs of problems and solution approaches that power real-world data science use cases. By Chinmay Kakatkar

January 10, 2026
Mastering Non-Linear Data: A Guide to Scikit-Learn’s SplineTransformer

Forget stiff lines and wild polynomials. Discover why Splines are the “Goldilocks” of feature engineering, offering the perfect balance of flexibility and discipline for non-linear data using Scikit-Learn’s SplineTransformer. By Gustavo Santos

January 10, 2026
TDS Newsletter: December Must-Reads on GraphRAG, Data Contracts, and More

Don’t miss our most popular articles of the previous month. By TDS Editors

January 9, 2026
Retrieval for Time-Series: How Looking Back Improves Forecasts

Why Retrieval Helps in Time Series Forecasting. We all know how it goes: Time-series data is tricky. Traditional forecasting models are unprepared for incidents like sudden market crashes, black swan events, or rare weather patterns. Even large fancy models like Chronos sometimes struggle because they haven’t dealt…

January 9, 2026
Faster Is Not Always Better: Choosing the Right PostgreSQL Insert Strategy in Python (+Benchmarks)

PostgreSQL is fast. Whether your Python code can or should keep up depends on context. This article compares and benchmarks various insert strategies, focusing not on micro-benchmarks but on trade-offs between safety, abstraction, and throughput, and choosing the right tool…

January 9, 2026
I Evaluated Half a Million Credit Records with Federated Learning. Here’s What I Found

Why privacy breaks fairness at small scale, and how collaboration fixes both without sharing a single record. By Arjun Kaarat

January 8, 2026
Why Supply Chain is the Best Domain for Data Scientists in 2026 (And How to Learn It)

My take after 10 years in Supply Chain on why this can be an excellent playground for data scientists who want to see their skills valued.

January 8, 2026
Measuring What Matters with NeMo Agent Toolkit

A practical guide to observability, evaluations, and model comparisons. By Mariya Mansurova

January 7, 2026
The Best Data Scientists Are Always Learning

Part 2: Avoiding burnout, learning strategies and the superpower of solitude. By Jarom Hulet

January 7, 2026
Stop Blaming the Data: A Better Way to Handle Covariance Shift

Instead of using shift as an excuse for poor performance, use Inverse Probability Weighting to estimate how your model should perform in the new environment.

January 6, 2026
How to Filter for Dates, Including or Excluding Future Dates, in Semantic Models

It is common to have either planning data or the previous year’s data displayed beyond today’s date. But future data can be confusing. How can I add a Slicer to show or hide future data? Let’s see how to do it. The…

January 5, 2026
Off-Beat Careers That Are the Future Of Data

The unconventional career paths you need to explore. By Rashi Desai

January 3, 2026
The Real Challenge in Data Storytelling: Getting Buy-In for Simplicity

What happens when your clear dashboard meets stakeholders who want everything on one screen. By Benjamin Nweke

January 3, 2026
EDA in Public (Part 3): RFM Analysis for Customer Segmentation in Pandas

How to build, score, and interpret RFM segments step by step. By Ibrahim Salami

January 2, 2026
What Advent of Code Has Taught Me About Data Science

Five key learnings that I discovered during a programming challenge and how they apply to data science. By Jasper Schroeder

January 1, 2026
The Machine Learning “Advent Calendar” Bonus 2: Gradient Descent Variants in Excel

Gradient Descent, Momentum, RMSProp, and Adam all aim for the same minimum. They do not change the destination, only the path. Each method adds a mechanism that fixes a limitation of the previous one, making the movement faster, more stable, or more adaptive.

January 1, 2026
The Machine Learning “Advent Calendar” Bonus 1: AUC in Excel

AUC measures how well a model ranks positives above negatives, independent of any chosen threshold. By angela shi

December 31, 2025
Agents Under the Curve (AUC)

Towards understanding if your agentic solution is actually better. By Lambert Leong

December 31, 2025
How IntelliNode Automates Complex Workflows with Vibe Agents

Many AI systems focus on isolated tasks or simple prompt engineering. This approach allowed us to build interesting applications from a single prompt, but we are starting to hit a limit. Simple prompting falls short when we tackle complex AI tasks that require multiple stages or enterprise…

December 28, 2025
How to Build an AI-Powered Weather ETL Pipeline with Databricks and GPT-4o: From API To Dashboard

A step-by-step guide from weather API ETL to dashboard on Databricks. By Gustavo Santos

December 27, 2025
Keeping Probabilities Honest: The Jacobian Adjustment

An intuitive explanation of transforming random variables correctly. The post Keeping Probabilities Honest: The Jacobian Adjustment appeared first on Towards Data Science. Aniruddha Karajgi

December 26, 2025
Why MAP and MRR Fail for Search Ranking (and What to Use Instead)

MAP and MRR look intuitive, but they quietly break ranking evaluation. Here’s why these metrics mislead, and how better alternatives fix it. The post Why MAP and MRR Fail for Search Ranking (and What to Use Instead) appeared first on Towards Data Science.…

December 26, 2025
The Machine Learning “Advent Calendar” Day 24: Transformers for Text in Excel

An intuitive, step-by-step look at how Transformers use self-attention to turn static word embeddings into contextual representations, illustrated with simple examples and an Excel-friendly walkthrough. The post The Machine Learning “Advent Calendar” Day 24: Transformers for Text in Excel appeared first on Towards…

December 25, 2025
Is Your Model Time-Blind? The Case for Cyclical Feature Encoding

How cyclical encoding improves machine learning prediction. The post Is Your Model Time-Blind? The Case for Cyclical Feature Encoding appeared first on Towards Data Science. Gustavo Santos

December 25, 2025
4 Techniques to Optimize AI Coding Efficiency

Learn how to code more effectively using AI. The post 4 Techniques to Optimize AI Coding Efficiency appeared first on Towards Data Science. Eivind Kjosbakken

December 25, 2025
Bonferroni vs. Benjamini-Hochberg: Choosing Your P-Value Correction

Multiple hypothesis testing, P-values, and Monte Carlo. The post Bonferroni vs. Benjamini-Hochberg: Choosing Your P-Value Correction appeared first on Towards Data Science. Marco Hening Tallarico

December 25, 2025
The Machine Learning “Advent Calendar” Day 23: CNN in Excel

A step-by-step 1D CNN for text, built in Excel, where every filter, weight, and decision is fully visible. The post The Machine Learning “Advent Calendar” Day 23: CNN in Excel appeared first on Towards Data Science. angela shi

December 24, 2025
Stop Retraining Blindly: Use PSI to Build a Smarter Monitoring Pipeline

A data scientist’s guide to population stability index (PSI). The post Stop Retraining Blindly: Use PSI to Build a Smarter Monitoring Pipeline appeared first on Towards Data Science. Gustavo Santos

December 24, 2025
Synergy in Clicks: Harsanyi Dividends for E-Commerce

A brief overview of the math behind the Harsanyi Dividend and a real-world application in Streamlit. The post Synergy in Clicks: Harsanyi Dividends for E-Commerce appeared first on Towards Data Science. Jacob Ingle

December 24, 2025
The Machine Learning “Advent Calendar” Day 22: Embeddings in Excel

Understanding text embeddings through simple models and Excel. The post The Machine Learning “Advent Calendar” Day 22: Embeddings in Excel appeared first on Towards Data Science. angela shi

December 23, 2025
The Machine Learning “Advent Calendar” Day 21: Gradient Boosted Decision Tree Regressor in Excel

Gradient descent in function space with decision trees. The post The Machine Learning “Advent Calendar” Day 21: Gradient Boosted Decision Tree Regressor in Excel appeared first on Towards Data Science. angela shi

December 23, 2025
The Machine Learning “Advent Calendar” Day 20: Gradient Boosted Linear Regression in Excel

From Random Ensembles to Optimization: Gradient Boosting Explained. The post The Machine Learning “Advent Calendar” Day 20: Gradient Boosted Linear Regression in Excel appeared first on Towards Data Science. angela shi

December 23, 2025
EDA in Public (Part 2): Product Deep Dive & Time-Series Analysis in Pandas

Learn how to analyze product performance, extract time-series features, and uncover key seasonal trends in your sales data. The post EDA in Public (Part 2): Product Deep Dive & Time-Series Analysis in Pandas appeared first on Towards Data Science. Ibrahim Salami

December 21, 2025
The Machine Learning “Advent Calendar” Day 19: Bagging in Excel

Understanding ensemble learning from first principles in Excel. The post The Machine Learning “Advent Calendar” Day 19: Bagging in Excel appeared first on Towards Data Science. angela shi

December 20, 2025
Agentic AI Swarm Optimization using Artificial Bee Colonization (ABC)

Using Agentic AI prompts with the Artificial Bee Colony algorithm to enhance unsupervised clustering and optimization workflows. The post Agentic AI Swarm Optimization using Artificial Bee Colonization (ABC) appeared first on Towards Data Science. Gal Arav

December 20, 2025