This class will be given during the second semester on Wednesday afternoons in Room S36 of Building B37. It starts at 2:00pm. The first class is on the 10th of February. The teaching assistant for the class is Mr. Samy Aittahar.

Students should come to the class with their laptop.

1. Lectures

Lesson 1. 05/02/2020. Speaker: Damien Ernst. Introduction to Reinforcement Learning (RL). Understand how to build an RL agent for non-adversarial environments with discrete state/action spaces.

Lesson 2. 12/02/2020. Speaker: Damien Ernst. The Q-learning algorithm (see Slides Lesson 1).

Lesson 3. 19/02/2020. Speaker: Damien Ernst. Reinforcement learning for continuous state-action spaces (see Slides Lesson 1).

Lesson 4. 26/02/2020. Speaker: Damien Ernst. Discussion of Research paper 1.

Lesson 5. 04/03/2020. Speaker: Raphael Fonteneau. Advanced batch mode reinforcement learning. Discussion of Research paper 2.

Lesson 6. 11/03/2020. Speaker: Damien Ernst. Advanced algorithms for learning Q-functions. Discussion of Research paper 3.

=== The program hereafter will be modified due to the coronavirus crisis ===

Lesson 7. 18/03/2020. Speaker: Samy Aittahar. Introduction to Gradient-based Policy Search. Discussion of Research paper 4, Research paper 5 and Research paper 6.

Lesson 8. 25/03/2020. Speakers: Nicolas Vecoven, Pascal Leroy, Amina Benzerga. A glimpse at the research done in RL at the Montefiore Institute. Discussion of Research paper 7.

Lesson 9. 01/04/2020. Speaker: Bert Claessens (Restore). Reinforcement learning in the energy industry: real applications.

Lesson 10. 22/04/2020. Speaker: Damien Ernst. Exploration/exploitation in Reinforcement Learning: the multi-armed bandit problem. Class based on a discussion of Research paper 9 (first 25 pages).



2. Assignments

Assignment 1 – 05/02/2020. Sections 1 to 4 must be submitted by midnight on 11/02/2020. Sections 5 and 6 must be submitted by 18/02/2020. Link to the submission platform:

Assignment 2 – 26/02/2020. Sections 1 to 4 must be submitted by midnight on 06/03/2020. Deadline for the final submission: midnight on 26/03/2020.

Assignment 3 – 17/03/2020 (final assignment). Deadline for the final submission: 15/05/2020. Assignment 3 is not mandatory, but you will get a bonus based on the quality of your work.

3. Final project

The final project is to design your own intelligent agent to control a double inverted pendulum, a well-known but challenging chaotic physical system, using any reinforcement learning algorithm that is compatible with continuous state and action spaces.

The deadline for the final report is 15/05/2020. Presentation of your results: to be discussed.

4. Evaluations

There will be a total of 5 evaluations.

Evaluation 1 

Evaluation 2

Evaluation 3

Evaluation 4

Evaluation 5

