Reinforcement Learning

Course topics

Module 1.

  • Bandit algorithms.
  • Markov Decision Problems and Dynamic Programming.
  • Practice: implement some bandit algorithms and apply them to stock picking.

Module 2.

  • Tabular methods (Monte Carlo and Temporal Difference).
  • Practice: implement some of these methods in OpenAI Gym.

Module 3.

  • On-policy prediction and control with function approximation. Deep Reinforcement Learning.
  • Practice: OpenAI Gym (FrozenLake/MountainCar).

Module 4.

  • Policy Optimization / Policy gradients.
  • Practice: OpenAI Gym (Pong).

Module 5.

  • Two-player games. Evolutionary games.
  • Practice: Counterfactual Regret Minimization. Evolutionary game theory.

Module 6.

  • Meta-learning.
  • Learning through self-play.