Category: Neural Network
-
YOLOv3 Paper Walkthrough: Even Better, But Not That Much
A PyTorch implementation of the YOLOv3 architecture from scratch. Posted on Towards Data Science by Muhammad Ardi.
-
Mechanistic Interpretability: Peeking Inside an LLM
Are the human-like cognitive abilities of LLMs real or fake? How does information travel through the neural network? Is there hidden knowledge inside an LLM? Posted on Towards Data Science by Julian Mendel.
-
Teaching a Neural Network the Mandelbrot Set
And why Fourier features change everything. Posted on Towards Data Science by Carlos Redondo.
-
YOLOv1 Loss Function Walkthrough: Regression for All
An explanation of how YOLOv1 measures the correctness of its object detection and classification predictions. Posted on Towards Data Science by Muhammad Ardi.
-
The Machine Learning “Advent Calendar” Day 18: Neural Network Classifier in Excel
Understanding forward propagation and backpropagation through explicit formulas. Posted on Towards Data Science by Angela Shi.
-
The Machine Learning “Advent Calendar” Day 17: Neural Network Regressor in Excel
Neural networks often feel like black boxes. In this article, we build a neural network regressor from scratch using only Excel formulas. By making every step explicit, from forward propagation to backpropagation, we show how a neural network learns to approximate non-linear functions…
-
Neural Networks Are Blurry, Symbolic Systems Are Fragmented. Sparse Autoencoders Help Us Combine Them.
Neural and symbolic models compress the world in fundamentally different ways, and Sparse Autoencoders (SAEs) offer a bridge to connect them.…
-
Learning Triton One Kernel at a Time: Softmax
All you need to know about a fast, readable and PyTorch-ready softmax kernel. Posted on Towards Data Science by Ryan Pégoud.
-
I Measured Neural Network Training Every 5 Steps for 10,000 Iterations
Posted on Towards Data Science by Javier Marin.
-
MobileNetV3 Paper Walkthrough: The Tiny Giant Getting Even Smarter
MobileNetV3 with PyTorch, now featuring SE blocks and hard activation functions. Posted on Towards Data Science by Muhammad Ardi.
-
MobileNetV2 Paper Walkthrough: The Smarter Tiny Giant
Understanding and implementing MobileNetV2, the next generation of MobileNetV1, with PyTorch. Posted on Towards Data Science by Muhammad Ardi.
-
PyTorch Explained: From Automatic Differentiation to Training Custom Neural Networks
Deep learning is shaping our world as we speak. In fact, it has been slowly revolutionizing software since the early 2010s. In 2025, PyTorch is at the forefront of this revolution, emerging as one of the most important libraries to train neural networks. Whether you…
-
Estimating from No Data: Deriving a Continuous Score from Categories
A walk-through of, and the maths behind, using low-capacity networks to obtain fine-grained scores when only categorical labels are available for training. We use it to predict the severity of an infection on a scale, based only on rough outcomes from previous cases.…
-
From Genes to Neural Networks: Understanding and Building NEAT (Neuro-Evolution of Augmenting Topologies) from Scratch
Practical Neuroevolution: Reproducing NEAT’s Innovations and Code Walkthrough. Posted on Towards Data Science by Carlos Redondo.
-
The Channel-Wise Attention | Squeeze and Excitation
Applying the Squeeze and Excitation module on ResNeXt using PyTorch. Posted on Towards Data Science by Muhammad Ardi.
-
Taking ResNet to the Next Level
Understanding how ResNeXt improves upon ResNet, with a comprehensive PyTorch implementation guide. Posted on Towards Data Science by Muhammad Ardi.
-
Vision Transformer on a Budget
The vanilla ViT is problematic. If you take a look at the original ViT paper [1], you’ll notice that although this deep learning model proved to work extremely well, it requires hundreds of millions of labeled training images to do so. That’s a lot. This requirement of an enormous…
-
The CNN That Challenges ViT
The invention of ViT (Vision Transformer) has led many to think that CNNs are obsolete. But is this really true? It is widely believed that the impressive performance of ViT comes primarily from its transformer-based architecture. However, researchers from Meta argued that this is not entirely true. If we take a closer…
-
Why Are Convolutional Neural Networks Great For Images?
The Universal Approximation Theorem states that a neural network with a single hidden layer and a nonlinear activation function can approximate any continuous function on a compact domain to arbitrary accuracy. Practical issues aside, such as the number of neurons in this hidden layer growing enormously large, we would not need other network architectures. A simple…
-
Circuit Tracing: A Step Closer to Understanding Large Language Models
Over the years, Transformer-based large language models (LLMs) have made substantial progress across a wide range of tasks, evolving from simple information retrieval systems to sophisticated agents capable of coding, writing, conducting research, and much more. But despite their capabilities, these models are still largely…
-
Attractors in Neural Network Circuits: Beauty and Chaos
The state space of the first two neuron activations over time follows an attractor. What do memories, oscillating chemical reactions and double pendulums have in common? All these systems have a basin of attraction for possible states, like a magnet that draws the system towards certain…
-
How LLMs Work: Pre-Training to Post-Training, Neural Networks, Hallucinations, and Inference
Amid the recent explosion of interest in large language models (LLMs), these systems often seem almost magical. But let’s demystify them. I wanted to step back and unpack the fundamentals, breaking down how LLMs are built, trained, and fine-tuned to become the AI systems we interact…