Temporal Difference Learning (including Q-Learning) | Reinforcement Learning Part 4

Published: 27 October 2022
on channel: Mutual Information

Views: 36,881
Likes: 999

The machine learning consultancy: https://truetheta.io
Join my email list to get educational and useful articles (and nothing else!): https://mailchi.mp/truetheta/true-the...
Want to work together? See here: https://truetheta.io/about/#want-to-w...

Part four of a six-part series on Reinforcement Learning. As the title says, it covers Temporal Difference Learning, Sarsa, and Q-Learning, along with some examples.

SOCIAL MEDIA

LinkedIn :   / dj-rich-90b91753
Twitter :   / duanejrich
Github: https://github.com/Duane321

Enjoy learning this way? Want me to make more videos? Consider supporting me on Patreon:   / mutualinformation

SOURCES

[1] R. Sutton and A. Barto. Reinforcement learning: An Introduction (2nd Ed). MIT Press, 2018.

[2] H. van Hasselt, et al. RL Lecture Series, DeepMind and UCL, 2021,    • DeepMind x UCL | Deep Learning Lectur...

SOURCE NOTES

The video covers topics from chapters 6 and 7 of [1], the text the whole series teaches from. [2] has been a useful secondary resource.

TIMESTAMPS
0:00 What We'll Learn
0:52 No Review
1:18 TD as an Adjusted Version of MC
2:49 TD Visualized with a Markov Reward Process
6:34 N-Step Temporal Difference Learning
8:08 MC vs TD on an Evaluation Example
11:50 TD's Trade-Off between N and Alpha
12:47 Why does TD Perform Better than MC?
15:29 N-Step Sarsa
17:15 Why have N above 1?
19:02 Q-Learning
20:50 Expected Sarsa
21:48 Cliff Walking
25:04 Windy GridWorld
28:12 Watch the Next Video!

NOTES

Code to compare TD vs MC on the evaluation task: https://github.com/Duane321/mutual_in...
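
For readers who want a feel for that comparison without opening the repository, here is a minimal sketch. It is not the author's code. It assumes the five-state random walk from Example 6.2 of [1] as the evaluation task, which may differ from the task used in the video, and every function name below is hypothetical. It compares one-step TD with constant-alpha, every-visit Monte Carlo.

```python
import random

# True state values for the 5-state random walk (states 1..5), from [1].
TRUE_VALUES = [i / 6 for i in range(1, 6)]


def generate_episode(rng):
    """Walk left/right from the middle state; states 0 and 6 are terminal.

    Reward is 1 only on stepping into state 6, otherwise 0.
    Returns a list of (state, reward, next_state) transitions.
    """
    state = 3
    transitions = []
    while state not in (0, 6):
        next_state = state + rng.choice((-1, 1))
        reward = 1.0 if next_state == 6 else 0.0
        transitions.append((state, reward, next_state))
        state = next_state
    return transitions


def td0_update(values, episode, alpha, gamma=1.0):
    """One-step TD: move each state toward reward + gamma * V(next state)."""
    for state, reward, next_state in episode:
        target = reward + gamma * values[next_state]
        values[state] += alpha * (target - values[state])


def mc_update(values, episode, alpha, gamma=1.0):
    """Constant-alpha, every-visit MC: move each state toward the full return."""
    g = 0.0
    for state, reward, _ in reversed(episode):
        g = reward + gamma * g
        values[state] += alpha * (g - values[state])


def rms_error(values):
    """Root-mean-square error over the five non-terminal states."""
    squared = [(values[s] - TRUE_VALUES[s - 1]) ** 2 for s in range(1, 6)]
    return (sum(squared) / len(squared)) ** 0.5


def run(update, alpha, episodes=100, seed=0):
    """Train from a 0.5 initial guess; terminal values stay at 0."""
    rng = random.Random(seed)
    values = [0.0] + [0.5] * 5 + [0.0]
    for _ in range(episodes):
        update(values, generate_episode(rng), alpha)
    return values


if __name__ == "__main__":
    for name, update in (("TD(0)", td0_update), ("MC", mc_update)):
        print(name, round(rms_error(run(update, alpha=0.05)), 4))
```

The only difference between the two update functions is the target: TD bootstraps from the current estimate of the next state, while MC waits for the full return. Sweeping alpha and averaging over many seeds would be needed before drawing any conclusion about which performs better.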
