Category: reinforcement-learning

A Generalizable MARL-LP Approach for Scheduling in Logistics

Part 1. Hybrid Solution for Dynamic Vehicle Routing: Context and Architecture. By Alexander Levin, Towards Data Science.

February 27, 2026
Routing in a Sparse Graph: a Distributed Q-Learning Approach

Distributed agents need only decide one move ahead. By Sébastien Gilbert, Towards Data Science.

February 4, 2026
Distributed Reinforcement Learning for Scalable High-Performance Policy Optimization

Leveraging massive parallelism, asynchronous updates, and multi-machine training to match and exceed human-level performance. By Sam Black, Towards Data Science.

February 2, 2026
Deep Reinforcement Learning: The Actor-Critic Method

Robot friends collaborate to learn to fly a drone. By Vedant Jumle, Towards Data Science.

January 2, 2026
Implementing Vibe Proving with Reinforcement Learning

How to make LLMs reason with verifiable, step-by-step logic (Part 2). By Jacopo Tagliabue, Towards Data Science.

December 30, 2025
The Reinforcement Learning Handbook: A Guide to Foundational Questions

Simplifying all the concepts required to master reinforcement learning. By Avishek Biswas, Towards Data Science.

November 7, 2025
Train a Humanoid Robot with AI and Python

3D simulations and Reinforcement Learning with MuJoCo and Gym. By Mauro Di Pietro, Towards Data Science.

November 5, 2025
Deep Reinforcement Learning: 0 to 100

Using RL to teach robots to fly a drone. By Vedant Jumle, Towards Data Science.

October 29, 2025
Temporal-Difference Learning and the Importance of Exploration: An Illustrated Guide

Comparing model-free and model-based RL methods on a dynamic grid world. By Ryan Pégoud, Towards Data Science.

October 2, 2025
Exploring Prompt Learning: Using English Feedback to Optimize LLM Systems

Prompt learning presents a compelling approach for continuous improvement of AI applications. By Aparna Dhinakaran, Towards Data Science.

July 17, 2025
Dynamic Inventory Optimization with Censored Demand

A sequential decision framework with Bayesian learning. By Mert Ersoz, Towards Data Science.

July 15, 2025
How to Fine-Tune Small Language Models to Think with Reinforcement Learning

A visual tour and from-scratch guide to train GRPO reasoning models in PyTorch. By Avishek Biswas, Towards Data Science.

July 9, 2025
Revisiting Benchmarking of Tabular Reinforcement Learning Methods

Introducing a modular framework and improving model performance. By Oliver S, Towards Data Science.

July 2, 2025
Reinforcement Learning Made Simple: Build a Q-Learning Agent in Python

Inspired by AlphaGo’s Move 37: learn how agents explore, exploit, and win. By Sarah Schürch, Towards Data Science.

May 28, 2025
Beyond Glorified Curve Fitting: Exploring the Probabilistic Foundations of Machine Learning

You see a math formula you don’t immediately understand. Your instinct? Stop reading. Don’t. That’s exactly what I told myself when I started reading Probabilistic Machine Learning – An Introduction by Kevin P. Murphy. And it was absolutely worth it. It changed how I…

May 1, 2025
A Step-By-Step Guide To Powering Your Application With LLMs

You might be wondering whether GenAI is just hype or external noise. I also thought this was hype, and I could sit this one out until the dust cleared. Oh, boy, was I wrong. GenAI has real-world applications. It also generates revenue for companies, so we expect…

April 26, 2025
How Recurrent Neural Networks (RNNs) Are Revolutionizing Decision-Making Research

A deep dive into the world of computational modeling and its applications. By Kaushik Rajan, Towards Data Science.

January 8, 2025
Introducing n-Step Temporal-Difference Methods

Dissecting “Reinforcement Learning” by Richard S. Sutton with custom Python implementations, Episode V. By Oliver S, Towards Data Science.

December 30, 2024
Understanding the Mathematics of PPO in Reinforcement Learning

Deep dive into RL with PPO for beginners. Reinforcement Learning (RL) is a branch of Artificial Intelligence that enables agents to learn how to interact with their environment. These agents, which range from robots to software features or autonomous systems, learn through…

December 27, 2024
Navigating Soft Actor-Critic Reinforcement Learning

Understanding the theory and implementation of SAC RL in the context of Bioengineering. The research domain of Reinforcement Learning (RL) has evolved greatly over the past years. The use of deep reinforcement learning methods such as Proximal Policy Optimisation (PPO) (Schulman, 2017)…

December 19, 2024
Reinforcement Learning: Self-Driving Cars to Self-Driving Labs

Understanding AI applications in bio for machine learning engineers. Anyone who has tried teaching a dog new tricks knows the basics of reinforcement learning. We can modify the dog’s behavior by repeatedly offering rewards for obedience and punishments for misbehavior. In reinforcement learning…

December 7, 2024