Category: Neural Network

YOLOv3 Paper Walkthrough: Even Better, But Not That Much

A PyTorch implementation of the YOLOv3 architecture from scratch. By Muhammad Ardi, first published on Towards Data Science.

March 3, 2026
Mechanistic Interpretability: Peeking Inside an LLM

Are the human-like cognitive abilities of LLMs real or fake? How does information travel through the neural network? Is there hidden knowledge inside an LLM? By Julian Mendel, first published on Towards Data Science.

February 6, 2026
Teaching a Neural Network the Mandelbrot Set

And why Fourier features change everything. By Carlos Redondo, first published on Towards Data Science.

January 10, 2026
YOLOv1 Loss Function Walkthrough: Regression for All

An explanation of how YOLOv1 measures the correctness of its object detection and classification predictions. By Muhammad Ardi, first published on Towards Data Science.

January 6, 2026
The Machine Learning “Advent Calendar” Day 18: Neural Network Classifier in Excel

Understanding forward propagation and backpropagation through explicit formulas. By angela shi, first published on Towards Data Science.

December 19, 2025
The Machine Learning “Advent Calendar” Day 17: Neural Network Regressor in Excel

Neural networks often feel like black boxes. In this article, we build a neural network regressor from scratch using only Excel formulas. By making every step explicit, from forward propagation to backpropagation, we show how a neural network learns to approximate non-linear functions…

December 18, 2025
Neural Networks Are Blurry, Symbolic Systems Are Fragmented. Sparse Autoencoders Help Us Combine Them.

Neural and symbolic models compress the world in fundamentally different ways, and Sparse Autoencoders (SAEs) offer a bridge to connect them.

November 28, 2025
Learning Triton One Kernel at a Time: Softmax

All you need to know about a fast, readable and PyTorch-ready softmax kernel. By Ryan Pégoud, first published on Towards Data Science.

November 24, 2025
I Measured Neural Network Training Every 5 Steps for 10,000 Iterations

By Javier Marin, first published on Towards Data Science.

November 16, 2025
MobileNetV3 Paper Walkthrough: The Tiny Giant Getting Even Smarter

MobileNetV3 with PyTorch, now featuring SE blocks and hard activation functions. By Muhammad Ardi, first published on Towards Data Science.

November 3, 2025
MobileNetV2 Paper Walkthrough: The Smarter Tiny Giant

Understanding and implementing MobileNetV2 with PyTorch, the next generation of MobileNetV1. By Muhammad Ardi, first published on Towards Data Science.

October 4, 2025
PyTorch Explained: From Automatic Differentiation to Training Custom Neural Networks

Deep learning is shaping our world as we speak. In fact, it has been slowly revolutionizing software since the early 2010s. In 2025, PyTorch is at the forefront of this revolution, emerging as one of the most important libraries for training neural networks. Whether you…

September 25, 2025
Estimating from No Data: Deriving a Continuous Score from Categories

A walk-through of, and the maths behind, using low-capacity networks to obtain fine-grained scores when only categorical labels are available for training. We use it to predict the severity of an infection on a scale, based only on rough outcomes in previous cases.…

August 12, 2025
From Genes to Neural Networks: Understanding and Building NEAT (Neuro-Evolution of Augmenting Topologies) from Scratch

Practical Neuroevolution: Reproducing NEAT’s Innovations and Code Walkthrough. By Carlos Redondo, first published on Towards Data Science.

August 12, 2025
The Channel-Wise Attention | Squeeze and Excitation

Applying the Squeeze and Excitation module to ResNeXt using PyTorch. By Muhammad Ardi, first published on Towards Data Science.

August 8, 2025
Taking ResNet to the Next Level

Understanding how ResNeXt improves upon ResNet, with a comprehensive PyTorch implementation guide. By Muhammad Ardi, first published on Towards Data Science.

July 3, 2025
Vision Transformer on a Budget

The vanilla ViT is problematic. If you take a look at the original ViT paper [1], you’ll notice that although this deep learning model proved to work extremely well, it requires hundreds of millions of labeled training images to achieve this. Well, that’s a lot. This requirement of an enormous…

June 3, 2025
The CNN That Challenges ViT

The invention of ViT (Vision Transformer) led many of us to think that CNNs are obsolete. But is this really true? It is widely believed that the impressive performance of ViT comes primarily from its transformer-based architecture. However, researchers from Meta argued that this is not entirely true. If we take a closer…

May 6, 2025
Why Are Convolutional Neural Networks Great For Images?

The Universal Approximation Theorem states that a neural network with a single hidden layer and a nonlinear activation function can approximate any continuous function. Practical issues aside, such as the number of neurons in this hidden layer growing enormously large, we do not need other network architectures. A simple…

May 1, 2025
Circuit Tracing: A Step Closer to Understanding Large Language Models

Over the years, Transformer-based large language models (LLMs) have made substantial progress across a wide range of tasks, evolving from simple information retrieval systems to sophisticated agents capable of coding, writing, conducting research, and much more. But despite their capabilities, these models are still largely…

April 9, 2025
Attractors in Neural Network Circuits: Beauty and Chaos

The state space of the first two neuron activations over time follows an attractor. What do memories, oscillating chemical reactions and double pendulums have in common? All these systems have a basin of attraction for possible states, like a magnet that draws the system towards certain…

March 26, 2025
How LLMs Work: Pre-Training to Post-Training, Neural Networks, Hallucinations, and Inference

With the recent explosion of interest in large language models (LLMs), they often seem almost magical. But let’s demystify them. I wanted to step back and unpack the fundamentals, breaking down how LLMs are built, trained, and fine-tuned to become the AI systems we interact…

February 19, 2025