Reinforcement Learning

Course topics

Module 1.
  • Bandit algorithms.
  • Markov Decision Problems and Dynamic Programming.
  • Practice: implementing several bandit algorithms, including bandit algorithms for stock picking.
Module 2.
  • Tabular methods (Monte Carlo and Temporal Difference).
  • Practice: implement some of these methods in OpenAI Gym.
Module 3.
  • On-policy prediction and control with function approximation. Deep Reinforcement Learning.
  • Practice: OpenAI Gym (FrozenLake/MountainCar).
Module 4.
  • Policy Optimization / Policy gradients.
  • Practice: OpenAI Gym (Pong).
Module 5.
  • Two-player games. Evolutionary games.
  • Practice: Counterfactual Regret Minimization. Evolutionary game theory.
Module 6.
  • Meta-learning.
  • Learning through self-play.

