Reinforcement Learning, Winter, 2018
In this course, we present the main ideas you need to get started in reinforcement learning. By the end of the course, you should be able to apply it in practice and follow the latest developments in this resurgent field.
We’ll roughly follow the book by Sutton & Barto, “Reinforcement Learning: An Introduction” (2017), with add-ons from other sources (e.g. research papers).
– Bandit algorithms.
– Markov Decision Problems and Dynamic Programming.
– Practice: programming of some bandit algorithms. Bandit algorithms for stock-picking.
– Tabular methods (Monte Carlo and Temporal Difference).
– Practice: implement some of these methods in OpenAI Gym.
– On-policy prediction and control with function approximation. Deep Reinforcement Learning.
– Practice: OpenAI Gym (CartPole).
– Policy Optimization / Policy gradients.
– Practice: OpenAI Gym (Pong).
– Two-player games. Evolutionary games.
– Practice: Counterfactual regret minimization. Evolutionary game theory.
– Learning through self-play.
Python, Statistics, Machine Learning basics, Neural Networks (good to know).
Juan Pablo Maldonado Lopez, PhD
Pablo Maldonado is an applied mathematician, data science consultant, and lecturer at the Czech Technical University in Prague. He has collaborated with global organizations and small companies alike, and is a regular invited speaker at academic conferences on both sides of the Atlantic. His professional interests center on strategic decision making under uncertainty, both in theory and in applications.