This class will be given during the second semester on Wednesday afternoons, starting at 1:45 pm. The first class takes place on the 3rd of February. Because of the COVID-19 crisis, the class will be given online using Webex.

The teaching assistants for the class are Samy Aittahar and Bardhyl Miftari.


1. Lectures

Lesson 1. 03/02/2021. Speaker: Damien Ernst. Introduction to Reinforcement Learning (RL). Understand how to build an RL agent for non-adversarial environments with discrete state/action spaces. Podcast Lesson 1

Lesson 2. 10/02/2021. Speaker: Damien Ernst. The Q-learning algorithm (see Slides Lesson 1). Proof related to the upper bound on the suboptimality of \mu_T^*. Podcast Lesson 2

Lesson 3. 17/02/2021. Speaker: Damien Ernst. Reinforcement learning for continuous state-action spaces (see Slides Lesson 1). Discussion of Research paper 1. Podcast Lesson 3

Lesson 4. 24/02/2021. Speaker: Damien Ernst.  Advanced algorithms for learning Q-functions.

Lesson 5. 03/03/2021. Speaker: Damien Ernst. Discussion of the assignments and the project, and presentation of Research paper 2.

Lesson 6. 10/03/2021. Speaker:  Adrien Bolland. Gradient-based techniques for reinforcement learning in continuous domains.

Lesson 7. 17/03/2021. Speaker:  Adrien Bolland. Gradient-based techniques for reinforcement learning in continuous domains.

Lesson 8. 24/03/2021. Speaker: Thibaut Théate. Distributed reinforcement learning.

Lesson 9. 31/03/2021. Speaker: Raphael Fonteneau. Advanced batch mode reinforcement learning. Discussion of Research paper 3.

Lesson 10. 21/04/2021. Speaker: Pascal Leroy. Multi-agent reinforcement learning.

Lesson 11. 28/04/2021. Speaker: Damien Ernst. Exploration/exploitation in Reinforcement Learning: the multi-armed bandit problem. Class based on a discussion of Research paper 9 (first 25 pages). (Class not yet confirmed.)

2. Assignments

Assignment 1 – 03/02/2021. Sections 1 to 4 must be submitted by 09/02/2021 at midnight, and Sections 5 and 6 by 16/02/2021 at midnight.

Assignment 2 – 17/02/2021. Sections 1 to 4 must be submitted by 23/02/2021 at midnight. Deadline for the final submission: 02/03/2021 at midnight.

Assignment 3 – 17/03/2021 (final assignment). Deadline for the final submission: 27/04/2021.

3. Final project

The final project will be about designing your own intelligent agent to control a double inverted pendulum – a well-known but challenging, chaotic physical system – using any reinforcement learning algorithm which is compatible with continuous state and action spaces.

The deadline for the final report is 14/05/2021 at midnight.

4. Evaluations

Due to the COVID-19 crisis, the 5 evaluations that usually take place during the semester have been cancelled. There will, however, be an oral exam based on all the material seen during the class!

5. Final exam

