Reinforcement Learning

Graduate course, Université de Lille, Computer science, 2021

The main reference for this course may be found on Philippe Preux’s website.

Practical session 1

Randomness with computers

Exercises

Solution

Practical session 2

Bellman equations & planning

Exercise

Reminder

Check your environment

Small gym tour

Hint for question 3

Evaluation

The checklist below lets students review the concepts covered in class.

Control theory

  • Definition of Markov Decision Process
  • Markov property
  • Discount factor / Discounted reward
  • Discounted value, finite-horizon value
  • Bellman operator, Bellman optimality operator
  • Dynamic Programming principle
  • Policy Evaluation: Direct computation, Iteration, Monte-Carlo
  • Value Iteration
  • Policy Iteration
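
As a revision aid, several checklist items (Bellman optimality operator, value iteration, direct policy evaluation) can be sketched on a toy problem. The two-state, two-action MDP below is purely hypothetical: its transition matrix `P`, reward table `R`, and discount factor `gamma` are illustrative values, not taken from the course material, and the function names are our own.

```python
import numpy as np

# Hypothetical MDP (illustrative numbers, not from the course).
# P[a, s, t] = probability of moving from state s to state t under action a.
P = np.array([
    [[0.9, 0.1],
     [0.2, 0.8]],
    [[0.5, 0.5],
     [0.0, 1.0]],
])
# R[s, a] = expected immediate reward for taking action a in state s.
R = np.array([
    [0.0, 1.0],
    [2.0, 0.0],
])
gamma = 0.9  # discount factor (assumed value)


def bellman_optimality_operator(V):
    """Apply (T* V)(s) = max_a [ R(s, a) + gamma * sum_t P(t | s, a) V(t) ]."""
    Q = R + gamma * np.einsum("ast,t->sa", P, V)
    return Q.max(axis=1), Q.argmax(axis=1)


def value_iteration(tol=1e-10, max_iter=10_000):
    """Iterate T* from V = 0 until the sup-norm change drops below tol."""
    V = np.zeros(P.shape[1])
    policy = np.zeros(P.shape[1], dtype=int)
    for _ in range(max_iter):
        V_new, policy = bellman_optimality_operator(V)
        if np.max(np.abs(V_new - V)) < tol:
            return V_new, policy
        V = V_new
    return V, policy


def evaluate_policy_direct(policy):
    """Direct policy evaluation: solve (I - gamma * P_pi) V = R_pi."""
    n = P.shape[1]
    states = np.arange(n)
    P_pi = P[policy, states]   # row s is P[policy[s], s, :]
    R_pi = R[states, policy]   # reward collected in s under the policy
    return np.linalg.solve(np.eye(n) - gamma * P_pi, R_pi)


V_star, pi_star = value_iteration()
V_pi = evaluate_policy_direct(pi_star)
```

Evaluating the greedy policy returned by value iteration with the direct linear solve should reproduce `V_star` up to the stopping tolerance, which gives a quick consistency check between the two methods.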