This class will be given during the second semester on Wednesday afternoons in Building B37, Room S42, from 1:45pm to 4:45pm. The first class takes place on the 9th of February.

The teaching assistants for the class are Samy Aittahar and Bardhyl Miftari. You should contact them using the following email address: info8003@uliege.be.


1. Lectures

Lesson 1. 09/02/2022. Speaker: Damien Ernst. Introduction to Reinforcement Learning (RL). Understand how to build an RL agent for non-adversarial environments with discrete state/action spaces.

Lesson 2. 16/02/2022. Speaker: Damien Ernst.  The Q-learning algorithm (see Slides Lesson 1). Proof related to the upper bound on the suboptimality of \mu_T^*

Lesson 3. 23/02/2022. Speaker: Damien Ernst. Reinforcement learning for continuous state-action spaces (see Slides Lesson 1). Discussion of Research paper 1.

Lesson 4. 02/03/2022. Speaker: Damien Ernst.  Advanced algorithms for learning Q-functions.

Lesson 5. 09/03/2022. Speaker: Damien Ernst. Discussion of the assignments and the project, and presentation of Research paper 2.

Lesson 6. 16/03/2022. Speaker:  Adrien Bolland. Gradient-based techniques for reinforcement learning in continuous domains. Slides of the class

Paper to read for next time:

Lesson 7. 23/03/2022. Speaker:  Adrien Bolland. Gradient-based techniques for reinforcement learning in continuous domains. Slides of the class

Lesson 8. 30/03/2022. Speaker: Thibaut Théate. Distributional reinforcement learning.

Lesson 9. 20/04/2022. Speaker: Raphael Fonteneau. Advanced batch mode reinforcement learning. Discussion of Research paper 3.

Lesson 10. 27/04/2022. Speaker: Pascal Leroy. Multi-agent reinforcement learning.

Lesson 11. 05/05/2022. Speaker: Damien Ernst. Exploration/exploitation in Reinforcement Learning: multi-armed bandit problems. Class based on a discussion of Research paper 4 (first 25 pages).

2. Practical sessions

Session 1. 09/02/2022. Problem formalisation in discrete domains

Session 2. 16/02/2022. MDP reconstruction

3. Assignments (in groups of two)

Assignment 1 – 09/02/2022. Sections 1 to 4 must be submitted by 22/02/2022 at midnight, and Section 5 by 29/02/2022 at midnight. Link to the submission platform:

Results Assignment 1

Assignment 2 – 02/03/2022. Sections 1 to 4 must be submitted by 09/03/2022 at midnight. Deadline for the final submission: 16/03/2022 at midnight.

Results Assignment 2

Assignment 3 – 23/03/2022 (final assignment). Deadline for the final submission: 19/04/2022 at midnight. The assignment is not mandatory, but you get a bonus on the final grade if you score well: +1 for more than 12/20, +2 for more than 14/20, +3 for more than 16/20 and +4 for more than 18/20.
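As an illustration only, the bonus rule for Assignment 3 can be sketched as a small Python function. The function name assignment3_bonus is hypothetical, and the sketch assumes that "more than" is a strict comparison (a score of exactly 12/20 gives no bonus). How the bonus interacts with any cap on the final grade is not specified here and is not modelled.

```python
def assignment3_bonus(score: float) -> int:
    """Return the bonus points for an Assignment 3 score out of 20.

    Assumption: thresholds are strict ("more than"), so a score equal
    to a threshold only earns the bonus of the tier below it.
    """
    # (threshold, bonus) pairs, checked from the highest tier down.
    tiers = [(18, 4), (16, 3), (14, 2), (12, 1)]
    for threshold, bonus in tiers:
        if score > threshold:
            return bonus
    return 0  # 12/20 or less: no bonus
```

For example, under these assumptions a score of 15/20 falls in the "more than 14/20" tier and yields a bonus of +2.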

Granted Bonus Assignment 3

4. Final project

The final project is about designing your own intelligent agent to control either a double inverted pendulum (a well-known but challenging, chaotic physical system) or a complex energy network (with a large number of states and actions), using any reinforcement learning algorithm that is compatible with continuous state and action spaces.

The deadline for the final report is 17/05/2022 at midnight.

Network management: ANM6-Easy project

Robot equilibrium: Double Inverted Pendulum project

Results Project

5. Evaluations

There will be a few evaluations during the semester, followed by an oral exam based on all the material seen during the class.

Evaluations that took place during the previous years:

Evaluation 1; Evaluation 2; Evaluation 3; Evaluation 4; Evaluation 5

6. Access your results 

Semester grades (without exam)

7. Final exam

Schedule for the final exam

