Reinforcement Learning | Towards Data Science

Dynamic Inventory Optimization with Censored Demand

Data Science

A sequential decision framework with Bayesian learning

Mert Ersoz

July 14, 2025

20 min read

How to Fine-Tune Small Language Models to Think with Reinforcement Learning

Large Language Models

A visual tour and from-scratch guide to train GRPO reasoning models in PyTorch

Avishek Biswas

July 8, 2025

23 min read

Revisiting Benchmarking of Tabular Reinforcement Learning Methods

Machine Learning

Introducing a modular framework and improving model performance.

Oliver S

July 1, 2025

9 min read

Reinforcement Learning Made Simple: Build a Q-Learning Agent in Python

Artificial Intelligence

Inspired by AlphaGo’s Move 37 — learn how agents explore, exploit, and win

Sarah Schürch

May 27, 2025

11 min read

Demystifying Policy Optimization in RL: An Introduction to PPO and GRPO

Deep Learning

A beginner-friendly guide to PPO and GRPO: simplifying policy optimization in reinforcement learning

Joshua Nishanth A

May 26, 2025

16 min read

Benchmarking Tabular Reinforcement Learning Algorithms

Machine Learning

Comparing all methods from Part I of Sutton’s book on gridworld environments

Oliver S

May 6, 2025

27 min read

Beyond Glorified Curve Fitting: Exploring the Probabilistic Foundations of Machine Learning

Machine Learning

An introduction to probabilistic thinking — and why it’s the foundation for robust and explainable…

Sarah Schürch

April 30, 2025

12 min read

Reinforcement Learning from One Example?

Machine Learning

Why 1-shot RLVR might be the breakthrough we’ve been waiting for

Derrick Mwiti

April 30, 2025

4 min read

A Step-By-Step Guide To Powering Your Application With LLMs

Large Language Models

Explore a hands-on guide to integrating large language models into real-world apps, not just read…

Prasann Pradeep Patil

April 25, 2025

8 min read

Why Normalization Is Crucial for Policy Evaluation in Reinforcement Learning

Machine Learning

Enhancing Accuracy in Reinforcement Learning Policy Evaluation through Normalization

Lukasz Gatarek

January 14, 2025

6 min read