Temporal Difference Learning (including Q-Learning) | Reinforcement Learning Part 4

Опубликовано: 27 Октябрь 2022
на канале: Mutual Information

36,881

999

The machine learning consultancy: https://truetheta.io
Join my email list to get educational and useful articles (and nothing else!): https://mailchi.mp/truetheta/true-the...
Want to work together? See here: https://truetheta.io/about/#want-to-w...

Part four of a six part series on Reinforcement Learning. As the title says, it covers Temporal Difference Learning, Sarsa and Q-Learning, along with some examples.

SOCIAL MEDIA

LinkedIn :   / dj-rich-90b91753
Twitter :   / duanejrich
Github: https://github.com/Duane321

Enjoy learning this way? Want me to make more videos? Consider supporting me on Patreon:   / mutualinformation

SOURCES

[1] R. Sutton and A. Barto. Reinforcement learning: An Introduction (2nd Ed). MIT Press, 2018.

[2] H. Hasselt, et al. RL Lecture Series, Deepmind and UCL, 2021,    • DeepMind x UCL | Deep Learning Lectur...

SOURCE NOTES

The video covers topics from chapters 6 and 7 from [1]. The whole series teaches from [1]. [2] has been a useful secondary resource.

TIMESTAMP
0:00 What We'll Learn
0:52 No Review
1:18 TD as an Adjusted Version of MC
2:49 TD Visualized with a Markov Reward Process
6:34 N-Step Temporal Difference Learning
8:08 MC vs TD on an Evaluation Example
11:50 TD's Trade-Off between N and Alpha
12:47 Why does TD Perform Better than MC?
15:29 N-Step Sarsa
17:15 Why have N above 1?
19:02 Q-Learning
20:50 Expected Sarsa
21:48 Cliff Walking
25:04 Windy GridWorld
28:12 Watch the Next Video!

NOTES

Code to compare TD vs MC on the evaluation task: https://github.com/Duane321/mutual_in...

Смотрите видео Temporal Difference Learning (including Q-Learning) | Reinforcement Learning Part 4 онлайн без регистрации, длительностью часов минут секунд в хорошем качестве. Это видео добавил пользователь Mutual Information 27 Октябрь 2022, не забудьте поделиться им ссылкой с друзьями и знакомыми, на нашем сайте его посмотрели 36,881 раз и оно понравилось 999 людям.

11,050