Dynamic Programming and Reinforcement Learning (A.Y. 2024/25)

Contents

Dynamic Programming [RLOC - Ch. 1, 4], [S]

  • Introduction to Markov Decision Processes: Dynamic Programming and Reinforcement Learning

  • Review of probability calculus

  • Tools for the simulation of stochastic systems

  • Review of Markov chains

  • Formulation of Markov Decision Problems (MDPs)

  • Deterministic dynamic programming

  • Stochastic dynamic programming

  • Optimal stopping problems

  • Dynamic programming over infinite time horizon

  • Minimum time problems

Reinforcement Learning [RLI - Ch. 4, 5, 6, 13]

  • Approximate dynamic programming

  • Monte Carlo methods

  • On-policy and off-policy methods

  • Temporal difference methods

  • Q-Learning

  • SARSA

  • Expected SARSA

  • n-step bootstrapping

  • Policy gradient

  • Overview of further topics: approximate methods, eligibility traces

Throughout the course, exercises will be worked out using small Python scripts.
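To give a rough idea of the scale of such a script, here is a minimal sketch of value iteration, one of the infinite-horizon dynamic programming techniques listed above. It is not taken from the course material: the two-state MDP, the discount factor, and all names (`P`, `GAMMA`, `q_value`, `value_iteration`) are hypothetical and chosen only for illustration.

```python
# Illustrative sketch only: a hypothetical two-state, two-action MDP.
# P[s][a] is a list of (probability, next_state, reward) triples.
P = {
    0: {0: [(1.0, 0, 0.0)],                  # action 0: stay in state 0, no reward
        1: [(0.8, 1, 1.0), (0.2, 0, 0.0)]},  # action 1: reach state 1 w.p. 0.8
    1: {0: [(1.0, 1, 2.0)],                  # action 0: stay in state 1, reward 2
        1: [(1.0, 0, 0.0)]},                 # action 1: go back to state 0
}
GAMMA = 0.9  # discount factor (assumed value)


def q_value(V, s, a, gamma=GAMMA):
    """Expected one-step reward plus discounted value of the next state."""
    return sum(p * (r + gamma * V[s2]) for p, s2, r in P[s][a])


def value_iteration(gamma=GAMMA, tol=1e-10, max_iter=10_000):
    """Apply the Bellman optimality update until the values stop changing."""
    V = {s: 0.0 for s in P}
    for _ in range(max_iter):
        V_new = {s: max(q_value(V, s, a, gamma) for a in P[s]) for s in P}
        delta = max(abs(V_new[s] - V[s]) for s in P)
        V = V_new
        if delta < tol:
            break
    # Greedy policy with respect to the converged value function.
    policy = {s: max(P[s], key=lambda a: q_value(V, s, a, gamma)) for s in P}
    return V, policy


if __name__ == "__main__":
    V, policy = value_iteration()
    print("values:", V)
    print("greedy policy:", policy)
```

The transition model is stored as a plain dictionary so the script needs no external libraries; a NumPy-based formulation would be an equally reasonable choice for larger state spaces.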

Prerequisites

  • Basic notions of probability and Markov chains

  • Prior knowledge of Python is useful, although not required