Q-Learning: Basics and Modern Extensions (Proseminar)
Organizers and co-organizers: Jasper Hoffmann, Yuan Zhang, Branka Mirchevska, and Moritz Schneider
Description
In this proseminar, we will introduce the basics of Dynamic Programming, Reinforcement Learning (RL), and Q-Learning in particular. We will cover the classic Q-Learning algorithm, one of the most successful algorithms in the history of RL, and study several modern extensions. In addition, we will introduce general aspects of scientific presentations and academic honesty.
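To make the classic algorithm concrete before the first session, here is a minimal tabular Q-learning sketch on a toy 5-state chain environment. The environment, hyperparameters, and variable names are assumptions chosen for illustration only; they are not the seminar's reference code.

```python
import numpy as np

n_states, n_actions = 5, 2          # actions: 0 = left, 1 = right
alpha, gamma, epsilon = 0.1, 0.99, 0.1
rng = np.random.default_rng(0)
Q = np.zeros((n_states, n_actions))

def step(state, action):
    """Toy chain dynamics (an assumption for this sketch): moving right from
    the last state gives reward 1 and ends the episode; all other steps give 0."""
    done = (action == 1 and state == n_states - 1)
    next_state = min(state + 1, n_states - 1) if action == 1 else max(state - 1, 0)
    reward = 1.0 if done else 0.0
    return next_state, reward, done

for episode in range(500):
    state, done = 0, False
    while not done:
        # epsilon-greedy action selection with random tie-breaking
        if rng.random() < epsilon:
            action = int(rng.integers(n_actions))
        else:
            best = np.flatnonzero(Q[state] == Q[state].max())
            action = int(rng.choice(best))
        next_state, reward, done = step(state, action)
        # classic Q-learning update:
        # Q(s, a) <- Q(s, a) + alpha * (r + gamma * max_a' Q(s', a') - Q(s, a))
        bootstrap = 0.0 if done else gamma * np.max(Q[next_state])
        Q[state, action] += alpha * (reward + bootstrap - Q[state, action])
        state = next_state

print(Q)  # learned action values; the greedy policy should move right in every state
```

The update rule in the inner loop is the part the seminar papers build on; the deep-learning extensions listed below mainly replace the table Q with a neural network and change how the bootstrap target is computed.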
Course information
Details:
Course number: 11LE13S-510-27
Places: 10
Zoom session details (can be used for all meetings if you cannot attend in person): https://uni-freiburg.zoom.us/j/63820890181?pwd=UEdnR2xnUFkzSzZXOEY1K1ZxbWlLQT09 (Meeting ID: 638 2089 0181)
The lectures on 03/06/2022 and 17/06/2022 will be held in hybrid format. Attention: room change! Those wanting to join in person should come to meeting room R42 on the second floor (1st floor) of the new IMBIT building (Georges-Köhler-Allee 201), right next to the kitchen area.
Course schedule:
Introduction: 29/04/2022, 13:00h
Further sessions:
Other important dates:
Requirements: No prior knowledge of reinforcement learning is necessary. We will introduce the basics to get you up to speed so that you will be able to understand the assigned papers (with the help of your supervisor).
Remarks: Due to the ongoing pandemic, we will offer the seminar in a hybrid format (see above for the Zoom link).
Topics
- Neural Fitted Q Iteration - First Experiences with a Data Efficient Neural Reinforcement Learning Method (NFQ)
- Human-level Control Through Deep Reinforcement Learning (DQN)
- Prioritized Experience Replay (PER)
- Dueling Network Architectures for Deep Reinforcement Learning (Dueling DQN)
- Deep Reinforcement Learning with Double Q-learning (Double DQN; a short sketch contrasting its target with the standard DQN target follows this list)
- Integrated Modeling and Control based on Reinforcement Learning and Dynamic Programming (Dyna-Q)
- Continuous Control With Deep Reinforcement Learning (DDPG)
- Noisy Networks for Exploration (Noisy Nets)
- Addressing Function Approximation Error in Actor-Critic Methods (TD3)
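As a rough illustration of the kind of change these extensions make, the sketch below contrasts the DQN bootstrap target with the Double DQN target for a batch of transitions. The arrays `q_online` and `q_target` stand in for the outputs of the online and target networks on the next states; the names, shapes, and random data are assumptions for this sketch, not the papers' reference implementations.

```python
import numpy as np

rng = np.random.default_rng(0)
batch, n_actions, gamma = 4, 3, 0.99
rewards = rng.normal(size=batch)
dones = np.array([0.0, 0.0, 1.0, 0.0])          # 1.0 marks terminal transitions
q_online = rng.normal(size=(batch, n_actions))   # stand-in for Q_theta(s', .)
q_target = rng.normal(size=(batch, n_actions))   # stand-in for Q_theta_minus(s', .)

# DQN: the target network both selects and evaluates the next action,
# which tends to overestimate action values.
dqn_target = rewards + gamma * (1.0 - dones) * q_target.max(axis=1)

# Double DQN: the online network selects the action, the target network
# evaluates it, decoupling action selection from evaluation.
best_actions = q_online.argmax(axis=1)
ddqn_target = rewards + gamma * (1.0 - dones) * q_target[np.arange(batch), best_actions]

print(dqn_target)
print(ddqn_target)
```

The other listed papers modify different parts of the same pipeline, for example how transitions are sampled (PER), how the network is structured (Dueling DQN), or how exploration noise is injected (Noisy Nets).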