Function Approximation | Reinforcement Learning Part 5

Published: 16 January 2023
on channel: Mutual Information

The machine learning consultancy: https://truetheta.io
Join my email list to get educational and useful articles (and nothing else!): https://mailchi.mp/truetheta/true-the...
Want to work together? See here: https://truetheta.io/about/#want-to-w...

Here, we learn about Function Approximation, a broad class of methods for learning in state spaces that are far too large for our previous methods to handle. This is part five of a six-part series on Reinforcement Learning.
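
To make the idea concrete, here is a minimal sketch (not the code from the video) of one simple form of function approximation: semi-gradient TD(0) with state aggregation on a scaled-down random walk. The function name, the 100-state size, the jump range, the step size, and the episode count are all assumptions chosen for illustration. Instead of storing one value per state, it learns one weight per group of neighbouring states, so every update generalizes across the whole group.

```python
import random


def td0_state_aggregation(n_states=100, n_groups=10, max_jump=10,
                          episodes=2000, alpha=0.05, gamma=1.0, seed=0):
    """Estimate state values of a random walk with one weight per state group.

    Hypothetical setup: states 1..n_states, start in the middle, jump 1..max_jump
    states left or right. Leaving on the left gives reward -1, on the right +1.
    """
    rng = random.Random(seed)
    group_size = n_states // n_groups
    w = [0.0] * n_groups  # the weight vector: far fewer entries than states

    def group(s):
        return (s - 1) // group_size

    for _ in range(episodes):
        s = n_states // 2
        while True:
            s_next = s + rng.randint(1, max_jump) * rng.choice((-1, 1))
            if s_next < 1:            # fell off the left edge
                target, done = -1.0, True
            elif s_next > n_states:   # fell off the right edge
                target, done = 1.0, True
            else:                     # reward 0, bootstrap from the estimate
                target, done = gamma * w[group(s_next)], False
            # Semi-gradient TD(0): the gradient of the aggregated value is 1
            # for the active group and 0 elsewhere, so only one weight moves.
            g = group(s)
            w[g] += alpha * (target - w[g])
            if done:
                break
            s = s_next
    return w


if __name__ == "__main__":
    print([round(x, 2) for x in td0_state_aggregation()])
```

Ten weights stand in for a hundred states here. The same pattern scales to much larger state spaces by changing the feature map rather than the update rule.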

SOCIAL MEDIA

LinkedIn: / dj-rich-90b91753
Twitter: / duanejrich
Github: https://github.com/Duane321

Enjoy learning this way? Want me to make more videos? Consider supporting me on Patreon: / mutualinformation

SOURCES

[1] R. Sutton and A. Barto. Reinforcement Learning: An Introduction (2nd ed.). MIT Press, 2018.

[2] H. van Hasselt, et al. RL Lecture Series, DeepMind and UCL, 2021, DeepMind x UCL | Deep Learning Lectur...

SOURCE NOTES

This video covers topics from chapters 9, 10, and 11 of [1], with only light coverage of chapter 11. [2] includes a lecture on Function Approximation, which was a helpful secondary source.

TIMESTAMPS
0:00 Intro
0:25 Large State Spaces and Generalization
1:55 On-Policy Evaluation
4:31 How do we select w?
6:46 How do we choose our target U?
9:27 A Linear Value Function
10:34 1000-State Random Walk
12:51 On-Policy Control with FA
14:26 The Mountain Car Task
19:30 Off-Policy Methods with FA

LINKS
1000-State Random Walk Problem: https://github.com/Duane321/mutual_in...
Mountain Car Task: https://github.com/Duane321/mutual_in...

NOTES

[1] In the Mountain Car Task, I left out a hyperparameter to tune: lambda. This controls how far the evenly spaced prototypical points are considered to be from any given evaluation point. If lambda is very high, the prototypical points are treated as very close together, and they won't do a good job of discriminating between different values over the state space. But if lambda is too low, the prototypical points won't share any information beyond a tiny region surrounding each point.
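
The effect of lambda can be illustrated with a small sketch. It assumes a Gaussian similarity between an evaluation point and each prototypical point, with lambda as the length scale; the exact kernel used in the video may differ. The function name `rbf_features` and the one-dimensional state are hypothetical simplifications (Mountain Car states have two dimensions: position and velocity).

```python
import math


def rbf_features(x, centers, lam):
    """Similarity of x to each prototypical point (assumed Gaussian kernel).

    lam is the length scale: larger values make every prototype look close to x.
    """
    return [math.exp(-((x - c) ** 2) / (2.0 * lam ** 2)) for c in centers]


# Five evenly spaced prototypical points on [0, 1] (illustrative values).
centers = [i / 4 for i in range(5)]

if __name__ == "__main__":
    for lam in (0.01, 0.1, 10.0):
        feats = rbf_features(0.3, centers, lam)
        print(f"lambda={lam}: {[round(f, 3) for f in feats]}")
```

With a very large lambda, every feature is close to 1, so different states look nearly identical. With a very small lambda, every feature is close to 0 unless the state sits almost exactly on a prototypical point, so nothing is shared between neighbours. A moderate lambda gives a feature vector that peaks at the nearest prototypical point and falls off smoothly.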
