Category: machine-learning

Actual Intelligence in the Age of AI

Jarom Hulet on mastering fundamentals, hiring well, and deciding what to write about next. The post Actual Intelligence in the Age of AI appeared first on Towards Data Science. TDS Editors

October 1, 2025
The Machine Learning Lessons I’ve Learned This Month

September 2025: library or self-made, Ditto and Launchbar, reading widely and deeply. Pascal Janetzky

October 1, 2025
Preparing Video Data for Deep Learning: Introducing Vid Prepper

A guide to fast video data preprocessing for machine learning. Jamie Petherbridge-Conroy

September 30, 2025
Learning Triton One Kernel At a Time: Vector Addition

The basics of GPU programming, optimisation, and your first Triton kernel. Ryan Pégoud

September 28, 2025
Building Fact-Checking Systems: Catching Repeating False Claims Before They Spread

How retrieval and ensemble methods make fact-checking faster, scalable, and more reliable in a digital world. Iva Pezo

September 27, 2025
Why MissForest Fails in Prediction Tasks: A Key Limitation You Need to Keep in Mind

Why the original MissForest algorithm cannot be directly applied for predictive modeling, and how MissForestPredict solves this problem.

September 27, 2025
Decoding Nonlinear Signals In Large Observational Datasets

Rain, snow, or something in between? Fraser King

September 25, 2025
PyTorch Explained: From Automatic Differentiation to Training Custom Neural Networks

Deep learning is shaping our world as we speak. In fact, it has been slowly revolutionizing software since the early 2010s. In 2025, PyTorch is at the forefront of this revolution, emerging as one of the most important libraries to train neural networks. Whether you…

September 25, 2025
Why Are Marketers Turning To Quasi Geo-Lift Experiments? (And How to Plan Them)

Are “quasi” geo-lift experiments the missing piece for your marketing science function? Tomas Jancovic

September 24, 2025
The Kolmogorov–Smirnov Statistic, Explained: Measuring Model Power in Credit Risk Modeling

Understanding how banks use the KS statistic in loan approvals. Nikhil Dasari

September 23, 2025
The Theory of Universal Computation: Bayesian Optimality, Solomonoff Induction & AIXI

Is it possible to build a perfect induction machine? Angjelin Hila

September 23, 2025
The SyncNet Research Paper, Clearly Explained

A Deep Dive into “Out of Time: Automated Lip Sync in the Wild”. Aman Agrawal

September 21, 2025
Analysis of Sales Shift in Retail with Causal Impact: A Case Study at Carrefour

Applying causal inference to measure the effect of product unavailability on retail sales at Carrefour. Thanh Liêm NGUYEN

September 18, 2025
ROC AUC Explained: A Beginner’s Guide to Evaluating Classification Models

Understand how ROC curves and AUC help you go beyond accuracy with visuals and examples. Nikhil Dasari

September 18, 2025
Building a Unified Intent Recognition Engine

How modular design can simplify and scale intent classification in enterprise AI systems. Shruti Tiwari

September 17, 2025
A Visual Guide to Tuning Gradient Boosted Trees

My previous posts looked at the bog-standard decision tree and the wonder of a random forest. Now, to complete the triplet, I’ll visually explore gradient boosted trees! There are a bunch of gradient boosted tree libraries, including XGBoost, CatBoost, and LightGBM. However, for this I’m going…

September 16, 2025
Learn How to Use Transformers with HuggingFace and SpaCy

Mastering NLP with spaCy: Part 4. Marcello Politi

September 16, 2025
How to Become a Machine Learning Engineer (Step-by-Step)

Your one-stop guide to becoming a machine learning engineer. Egor Howell

September 16, 2025
No Peeking Ahead: Time-Aware Graph Fraud Detection

How to implement leak-free graph fraud detection. Erika G. Gonçalves

September 15, 2025
Docling: The Document Alchemist

Why do we still wrestle with documents in 2025? Spend some time in any data-driven organisation, and you’ll encounter a host of PDFs, Word files, PowerPoints, half-scanned images, handwritten notes, and the occasional surprise CSV lurking in a SharePoint folder. Business and data analysts waste hours converting, splitting, and cajoling those formats…

September 13, 2025
Generalists Can Also Dig Deep

Ida Silfverskiöld on AI agents, RAG, evals, and what design choice ended up mattering more than expected. TDS Editors

September 13, 2025
Is Your Training Data Representative? A Guide to Checking with PSI in Python

Comparing Variable Distributions Between Two Datasets Using Population Stability Index (PSI) and Cramér’s V. JUNIOR JUMBONG

September 11, 2025
Fighting Back Against Attacks in Federated Learning

Lessons from a multi-node simulator. Salman Toor

September 11, 2025
How to Build Effective AI Agents to Process Millions of Requests

Learn how to build production-ready systems using AI agents. Eivind Kjosbakken

September 10, 2025
The Hungarian Algorithm and Its Applications in Computer Vision

Multi-object tracking (MOT) is a task in which an algorithm must detect and track multiple objects in a video. Most known algorithms are based on using simple detectors (e.g. YOLO) designed for processing individual images. The overall method involves separately using a detector on consecutive video…

September 10, 2025
How to Context Engineer to Optimize Question Answering Pipelines

Learn how to apply context engineering to enhance your question answering systems. Eivind Kjosbakken

September 6, 2025
Zero-Inflated Data: A Comparison of Regression Models

How to detect it and which model to choose. Arnaud Capitaine

September 6, 2025
Should We Use LLMs As If They Were Swiss Knives?

A logic game performance comparison between popular LLMs and a custom-made algorithm. Nicolas Garcia Aramouni

September 5, 2025
A Visual Guide to Tuning Random Forest Hyperparameters

How hyperparameter tuning visually changes random forests. James Gibbins

September 5, 2025
MobileNetV1 Paper Walkthrough: The Tiny Giant

Understanding and implementing MobileNetV1 from scratch with PyTorch. Muhammad Ardi

September 5, 2025
Using LangGraph and MCP Servers to Create My Own Voice Assistant

Built over 14 days, all locally run, no API keys, cloud services, or subscription fees. Benjamin Lee

September 5, 2025
How to Scale Your AI Search to Handle 10M Queries with 5 Powerful Techniques

Optimize your AI search with RAG, contextual retrieval and evaluations. Eivind Kjosbakken

September 3, 2025
What is Universality in LLMs? How to Find Universal Neurons

How independently trained transformers form the same neurons. Shuyang

September 3, 2025
3 Greedy Algorithms for Decision Trees, Explained with Examples

Learn the inner workings of decision trees. Kuriko Iwai

September 3, 2025
The Generalist: The New All-Around Type of Data Professional?

Is over-specialization ending and are data generalists on the rise? Loizos Loizou

September 2, 2025
The Machine Learning Lessons I’ve Learned This Month

August 2025: logging, lab notebooks, overnight runs. Pascal Janetzky

September 1, 2025
Marginal Effect of Hyperparameter Tuning with XGBoost

Demystifying Bayesian hyperparameter optimization and comparing hyperparameter tuning paradigms. Noah Swan

August 30, 2025
Toward Digital Well-Being: Using Generative AI to Detect and Mitigate Bias in Social Networks

This research answered the question: How can machine learning and artificial intelligence help us to unlearn bias? Celia Banks…

August 30, 2025
Stepwise Selection Made Simple: Improve Your Regression Models in Python

Dimensionality reduction in linear regression: classical stepwise methods and a Python application on real-world data. JUNIOR JUMBONG

August 29, 2025
A Visual Guide to Tuning Decision-Tree Hyperparameters

How hyperparameter tuning visually changes decision trees. James Gibbins

August 29, 2025
Air for Tomorrow: Why Openness in Air Quality Research and Implementation Matters for Global Equity

Understand how open source can help you unravel air quality. Prithviraj Pramanik

August 29, 2025
Everything I Studied to Become a Machine Learning Engineer (No CS Background)

The books, courses, and resources I used in my journey. Egor Howell

August 28, 2025
Time Series Forecasting Made Simple (Part 4.1): Understanding Stationarity in a Time Series

An intuitive guide to stationarity in a time series. Nikhil Dasari

August 28, 2025
How to Develop Powerful Internal LLM Benchmarks

Learn how to compare LLMs using your own internal benchmark. Eivind Kjosbakken

August 27, 2025
Using Google’s LangExtract and Gemma for Structured Data Extraction

Extracting structured information effectively and accurately from long unstructured text with LangExtract and LLMs. Kenneth Leung

August 27, 2025
Positional Embeddings in Transformers: A Math Guide to RoPE & ALiBi

Learn APE, RoPE, and ALiBi positional embeddings for GPT: intuitions, math, PyTorch code, and experiments on TinyStories. Sathya Krishnan Suresh

August 27, 2025
How to Benchmark Classical Machine Learning Workloads on Google Cloud

Harnessing CPUs for Practical, Cost-Effective Machine Learning. Ehssan Khan

August 26, 2025
Three Essential Hyperparameter Tuning Techniques for Better Machine Learning Models

Learn how to optimize your ML models for better results. Rukshan Pramoditha

August 23, 2025
Cracking the Density Code: Why MAF Flows Where KDE Stalls

Learn why autoregressive flows are the superior density estimation tool for high-dimensional data. Zackary Nay

August 23, 2025
How to Perform Comprehensive Large Scale LLM Validation

Learn how to validate large-scale LLM applications. Eivind Kjosbakken

August 22, 2025
What If I Had AI in 2020: Rent The Runway Dynamic Pricing Model

Ever wondered how different things might have been if ChatGPT had existed at the start of Covid? Especially for data scientists who had to update their forecast models?

August 22, 2025
Designing Trustworthy ML Models: Alan & Aida Discover Monotonicity in Machine Learning

Accuracy alone doesn’t guarantee trustworthiness. Monotonicity ensures predictions align with common sense and business rules. Mehdi Mohammadi

August 22, 2025
Smarter Model Tuning: An AI Agent with LangGraph + Streamlit That Boosts ML Performance

Automating model tuning in Python with Gemini, LangGraph, and Streamlit for regression and classification improvements. Gustavo Santos

August 21, 2025
Help Your Model Learn the True Signal

An algorithm-agnostic approach inspired by Cook’s distance. Mena Wang

August 20, 2025
Advanced Prompt Engineering for Data Science Projects

Part 2: Prompt Engineering for Features, Modeling, and Evaluation. Sara Nobrega

August 20, 2025
Can LangExtract Turn Messy Clinical Notes into Structured Data?

Turning raw clinical notes into structured entities with LLMs. Parul Pandey

August 19, 2025
Modular Arithmetic in Data Science

Modular arithmetic is a mathematical system where numbers cycle back to the beginning after reaching a value called the modulus. The system is often referred to as “clock arithmetic” due to its similarity to how analog 12-hour clocks represent time. This article provides a conceptual overview of modular arithmetic and…
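
The “clock arithmetic” idea in this excerpt can be illustrated with a minimal Python sketch. This is not code from the article; the function name `clock_add` and the convention of writing 12 o'clock as 0 are assumptions made purely for illustration.

```python
def clock_add(hour: int, delta: int, modulus: int = 12) -> int:
    """Advance `hour` by `delta` steps on a dial with `modulus` positions.

    Positions are numbered 0..modulus-1, so 12 o'clock is written as 0
    (an illustrative convention, not taken from the article).
    """
    # The % operator wraps the sum back to the start once it reaches the modulus.
    return (hour + delta) % modulus


print(clock_add(10, 5))   # 3: 10 o'clock plus 5 hours wraps past 12
print(clock_add(2, -3))   # 11: with a positive modulus, Python's % is never negative
```

Going backwards works without extra handling because Python's `%` takes the sign of the divisor; languages where `%` can return a negative remainder would need an extra adjustment.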

August 19, 2025
Maximizing AI/ML Model Performance with PyTorch Compilation

Since its inception in PyTorch 2.0 in March 2023, the evolution of torch.compile has been one of the most exciting things to follow. Given that PyTorch’s popularity was due to its “Pythonic” nature, its ease of use, and its line-by-line (a.k.a., eager) execution, the success of a just-in-time (JIT) graph…

August 19, 2025
How to Create Powerful LLM Applications with Context Engineering

Improve your LLM by optimizing its context. Eivind Kjosbakken

August 19, 2025
“My biggest lesson was realizing that domain expertise matters more than algorithmic complexity.”

Claudia Ng reflects on real-world ML lessons, mentoring newcomers, and her journey from corporate ML to freelance AI. TDS Editors

August 15, 2025
How to Use LLMs for Powerful Automatic Evaluations

A beginner-friendly introduction to LLM-as-a-Judge. Eivind Kjosbakken

August 14, 2025
Coconut: A Framework for Latent Reasoning in LLMs

Explaining Coconut (Training Large Language Models to Reason in a Continuous Latent Space) in simple terms. Youssef Farag

August 13, 2025
A Refined Training Recipe for Fine-Grained Visual Classification

How FGVC aims to recognize images belonging to multiple subordinate categories of a super-category. Ahmed Belgacem

August 13, 2025
Fine-Tune Your Topic Modeling Workflow with BERTopic

Learn how to fine-tune BERTopic settings for more focused, reproducible, and interpretable results. Tiffany Chen

August 13, 2025
Estimating from No Data: Deriving a Continuous Score from Categories

A walk-through of, and the maths behind, using low-capacity networks to obtain fine-grained scores when only categorical labels are available for training. We use it to predict the severity of an infection on a scale, based only on rough outcomes in previous cases.…

August 12, 2025
Introducing Google’s LangExtract tool

Do RAG without doing RAG with this powerful new NLP and data extraction library. Thomas Reid

August 12, 2025
How to Design Machine Learning Experiments — the Right Way

The key to successful ML projects isn’t always more resources. TDS Editors

August 9, 2025
How to Write Insightful Technical Articles

Learn how to write informative technical articles. Eivind Kjosbakken

August 9, 2025
Demystifying Cosine Similarity

Mathematical intuition and practical considerations for NLP scenarios. Chinmay Kakatkar

August 9, 2025
Time Series Forecasting Made Simple (Part 3.2): A Deep Dive into LOESS-Based Smoothing

Explore how STL uses LOESS smoothing to extract trend and seasonal components. Nikhil Dasari

August 8, 2025
Finding Golden Examples: A Smarter Approach to In-Context Learning

From random example selection to systematic AuPair generation: how to make your LLM prompts actually work. Sudheer Singh

August 8, 2025
The Channel-Wise Attention | Squeeze and Excitation

Applying the Squeeze and Excitation module on ResNeXt using PyTorch. Muhammad Ardi

August 8, 2025
How I Won the “Mostly AI” Synthetic Data Challenge

A deep dive into how post-processing can supercharge synthetic data generation. Daniel Gärber

August 7, 2025
Context Engineering — A Comprehensive Hands-On Tutorial with DSPy

Let’s dissect the art and science of context engineering, one module at a time! Avishek Biswas

August 6, 2025
Things I Wish I Had Known Before Starting ML

Part 2: Guardrails, research code, reading. Pascal Janetzky

August 6, 2025
Stellar Flare Detection and Prediction Using Clustering and Machine Learning

Combining unsupervised clustering with supervised learning to detect and predict stellar flares. Diksha Sen Chaudhury

August 6, 2025
Exploratory Data Analysis: Gamma Spectroscopy in Python (Part 3)

Let’s observe the matter on the atomic level. Dmitrii Eliuseev

August 6, 2025
Mastering NLP with spaCy – Part 2

POS tagging, dependency parser and named entity recognition. Marcello Politi

August 2, 2025
How Computers “See” Molecules

Generative Molecular Design (Part 1): common molecular representations in data science. Tianyuan Zheng

August 2, 2025
“I think of analysts as data wizards who help their product teams solve problems”

Mariya Mansurova explains how hands-on learning, agentic AI, and engineering habits shape her writing and work. TDS Editors

August 2, 2025
When Models Stop Listening: How Feature Collapse Quietly Erodes Machine Learning Systems

Models don’t just fail with noise; they fail in silence, by narrowing their attention to the point of fragility. Mahe Jabeen Abdul

August 2, 2025
FastSAM for Image Segmentation Tasks — Explained Simply

Image segmentation is a popular task in computer vision, with the goal of partitioning an input image into multiple regions, where each region represents a separate object. Several classic approaches from the past involved taking a model backbone (e.g., U-Net) and fine-tuning it on specialized datasets. While…

August 1, 2025
How to Benchmark LLMs – ARC AGI 3

Learn how LLMs are benchmarked, and try out the newly released ARC AGI 3. Eivind Kjosbakken

August 1, 2025
LLMs and Mental Health

Are LLMs good or bad for our mental health? It’s more complicated than that. Stephanie Kirmer

August 1, 2025
The ONLY Data Science Roadmap You Need to Get a Job

Are you looking to become a data scientist and don’t know where to start? In this article, I want to provide you with a straightforward, no-nonsense learning roadmap that you can follow to break into the industry. By the end, you’ll finally have a clear…

August 1, 2025
The Misconception of Retraining: Why Model Refresh Isn’t Always the Fix

Retraining is easy; knowing when not to is the real challenge. In machine learning, performance drops are rarely about stale weights; they’re about misunderstood signals.

July 31, 2025
Confusion Matrix Made Simple: Accuracy, Precision, Recall & F1-Score

Confusion Matrix Made Simple: Accuracy, Precision, Recall & F1-Score How to evaluate classification models and understand which metric matters the most. The post Confusion Matrix Made Simple: Accuracy, Precision, Recall & F1-Score appeared first on Towards Data Science. Nikhil Dasari Go to original source

July 31, 2025
The Stanford Framework That Turns AI into Your PM Superpower

The Stanford Framework That Turns AI into Your PM Superpower A human-centric guide to AI automation for product managers. The post The Stanford Framework That Turns AI into Your PM Superpower appeared first on Towards Data Science. Rahul Vir Go to original source

July 29, 2025
When 50/50 Isn’t Optimal: Debunking Even Rebalancing

When 50/50 Isn’t Optimal: Debunking Even Rebalancing A new theory of class imbalance demonstrates that the optimal training imbalance in a binary problem is not 50% The post When 50/50 Isn’t Optimal: Debunking Even Rebalancing appeared first on Towards Data Science. Marco Baity-Jesi Go to original source

July 25, 2025
How Do Grayscale Images Affect Visual Anomaly Detection?

How Do Grayscale Images Affect Visual Anomaly Detection? A practical exploration focusing on performance and speed The post How Do Grayscale Images Affect Visual Anomaly Detection? appeared first on Towards Data Science. Aimira Baitieva Go to original source

July 25, 2025
Things I Wish I Had Known Before Starting ML

Things I Wish I Had Known Before Starting ML Part 1: Data, Sales Pitches, Bugs, and Breakthroughs The post Things I Wish I Had Known Before Starting ML appeared first on Towards Data Science. Pascal Janetzky Go to original source

July 23, 2025
From Rules to Relationships: How Machines Are Learning to Understand Each Other

From Rules to Relationships: How Machines Are Learning to Understand Each Other Using knowledge graphs to handle the unexpected in semantic communication The post From Rules to Relationships: How Machines Are Learning to Understand Each Other appeared first on Towards Data Science. Shireesh Kumar Singh Go to original source

July 23, 2025
How To Significantly Enhance LLMs by Leveraging Context Engineering

How To Significantly Enhance LLMs by Leveraging Context Engineering The benefits and practical aspects of context engineering for LLMs The post How To Significantly Enhance LLMs by Leveraging Context Engineering appeared first on Towards Data Science. Eivind Kjosbakken Go to original source

July 22, 2025
MCP Client Development with Streamlit: Build Your AI-Powered Web App

MCP Client Development with Streamlit: Build Your AI-Powered Web App MCP client development with Streamlit to enhance the tool-calling capabilities of remote MCP servers, covering setting up your development environment, securing API keys, handling user input, connecting to remote MCP servers, and displaying AI-generated responses. The post MCP Client Development with Streamlit: Build…

July 22, 2025
Advanced Topic Modeling with LLMs

Advanced Topic Modeling with LLMs A deep dive into topic modeling by leveraging representation models and generative AI with BERTopic The post Advanced Topic Modeling with LLMs appeared first on Towards Data Science. Alex Davis Go to original source

July 22, 2025
Exploratory Data Analysis: Gamma Spectroscopy in Python (Part 2)

Exploratory Data Analysis: Gamma Spectroscopy in Python (Part 2) Let’s observe the matter on the atomic level The post Exploratory Data Analysis: Gamma Spectroscopy in Python (Part 2) appeared first on Towards Data Science. Dmitrii Eliuseev Go to original source

July 19, 2025
Gain a Better Understanding of Computer Vision: Dynamic SOLO (SOLOv2) with TensorFlow

Gain a Better Understanding of Computer Vision: Dynamic SOLO (SOLOv2) with TensorFlow A practical approach to instance segmentation using SOLOv2 and TensorFlow The post Gain a Better Understanding of Computer Vision: Dynamic SOLO (SOLOv2) with TensorFlow appeared first on Towards Data Science. Pavel Timonin Go to original source

July 19, 2025
From Reactive to Predictive: Forecasting Network Congestion with Machine Learning and INT

From Reactive to Predictive: Forecasting Network Congestion with Machine Learning and INT Learn how machine learning can predict network congestion before it happens The post From Reactive to Predictive: Forecasting Network Congestion with Machine Learning and INT appeared first on Towards Data Science. Shireesh Kumar Singh Go to original source

July 19, 2025
Don’t Waste Your Labeled Anomalies: 3 Practical Strategies to Boost Anomaly Detection Performance

Don’t Waste Your Labeled Anomalies: 3 Practical Strategies to Boost Anomaly Detection Performance A few labels go a long way in anomaly detection The post Don’t Waste Your Labeled Anomalies: 3 Practical Strategies to Boost Anomaly Detection Performance appeared first on Towards Data Science. Shuai Guo Go to original source

July 18, 2025
The Power of Building from Scratch

The Power of Building from Scratch Mauro Di Pietro discusses building AI agents with open-source tools, bridging theory and practice, and why he’s still nostalgic for scikit-learn. The post The Power of Building from Scratch appeared first on Towards Data Science. TDS Editors Go to original source

July 17, 2025