Function Approximation | Reinforcement Learning Part 5

Published: 16 January 2023
on channel: Mutual Information
Views: 26,363
Likes: 788

The machine learning consultancy: https://truetheta.io
Join my email list to get educational and useful articles (and nothing else!): https://mailchi.mp/truetheta/true-the...
Want to work together? See here: https://truetheta.io/about/#want-to-w...

Here, we learn about Function Approximation, a broad class of methods for learning in state spaces that are far too large for our previous methods to work. This is part five of a six-part series on Reinforcement Learning.
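To make the idea concrete, here is a minimal sketch of a value function that covers 1000 states with only 10 weights. It is loosely modeled on the 1000-state random walk with state aggregation from chapter 9 of [1], trained with semi-gradient TD(0). The function names, step size, and episode count are illustrative assumptions, and this is not the code from the linked repository.

```python
import numpy as np

N_STATES = 1000   # non-terminal states are numbered 1..1000
N_GROUPS = 10     # one weight per block of 100 neighboring states
MAX_JUMP = 100    # each step jumps 1..100 states left or right


def features(state):
    """One-hot indicator of the group that `state` falls in."""
    x = np.zeros(N_GROUPS)
    x[(state - 1) * N_GROUPS // N_STATES] = 1.0
    return x


def td0_state_aggregation(n_episodes=500, alpha=0.05, seed=0):
    """Semi-gradient TD(0) for the linear value estimate v(s) = w @ features(s)."""
    rng = np.random.default_rng(seed)
    w = np.zeros(N_GROUPS)
    for _ in range(n_episodes):
        state = N_STATES // 2
        while True:
            jump = int(rng.integers(1, MAX_JUMP + 1))
            direction = 1 if rng.random() < 0.5 else -1
            next_state = state + direction * jump
            if next_state < 1:            # fell off the left edge
                reward, next_value, done = -1.0, 0.0, True
            elif next_state > N_STATES:   # fell off the right edge
                reward, next_value, done = 1.0, 0.0, True
            else:
                reward, next_value, done = 0.0, w @ features(next_state), False
            x = features(state)
            # For a linear value function, the gradient with respect to w is x.
            # The task is undiscounted, so the target is reward + next_value.
            w += alpha * (reward + next_value - w @ x) * x
            if done:
                break
            state = next_state
    return w


if __name__ == "__main__":
    print(np.round(td0_state_aggregation(), 2))
```

Every state in a block shares one weight, so an update made from one state generalizes to its 99 neighbors. That sharing is what lets the method scale to state spaces that a lookup table could not handle.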

SOCIAL MEDIA

LinkedIn: / dj-rich-90b91753
Twitter: / duanejrich
Github: https://github.com/Duane321

Enjoy learning this way? Want me to make more videos? Consider supporting me on Patreon: / mutualinformation

SOURCES

[1] R. Sutton and A. Barto. Reinforcement Learning: An Introduction (2nd ed.). MIT Press, 2018.

[2] H. van Hasselt, et al. RL Lecture Series, DeepMind and UCL, 2021, • DeepMind x UCL | Deep Learning Lectur...

SOURCE NOTES

This video covers topics from chapters 9, 10 and 11 of [1], with only light coverage of chapter 11. [2] includes a lecture on Function Approximation, which was a helpful secondary source.

TIMESTAMPS
0:00 Intro
0:25 Large State Spaces and Generalization
1:55 On-Policy Evaluation
4:31 How do we select w?
6:46 How do we choose our target U?
9:27 A Linear Value Function
10:34 1000-State Random Walk
12:51 On-Policy Control with FA
14:26 The Mountain Car Task
19:30 Off-Policy Methods with FA

LINKS
1000-State Random Walk Problem: https://github.com/Duane321/mutual_in...
Mountain Car Task: https://github.com/Duane321/mutual_in...

NOTES

[1] In the Mountain Car Task, I left out a hyperparameter that needs tuning: lambda. It controls how far the evenly spaced proto-points are considered to be from any given evaluation point. If lambda is too high, the proto-points are treated as very close together, and they won't do a good job of discriminating between values across the state space. If lambda is too low, the proto-points won't share any information beyond a tiny region surrounding each point.
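
The effect of lambda can be illustrated with a small sketch. It assumes the features are normalized Gaussian similarities to a grid of proto-points, with lambda acting as a length scale. That functional form, the function name proto_point_features, and the unit-square state space are assumptions made for illustration; the video's implementation may differ.

```python
import numpy as np


def proto_point_features(state, proto_points, lam):
    """Normalized Gaussian similarity of `state` to each proto-point.

    `lam` is a length scale (assumed form): distances are measured
    relative to it, so a large `lam` makes every proto-point look close.
    """
    sq_dist = np.sum((proto_points - state) ** 2, axis=1)
    sims = np.exp(-sq_dist / (2.0 * lam ** 2))
    return sims / sims.sum()


# A 5 x 5 grid of evenly spaced proto-points over a unit square
# (a hypothetical stand-in for a rescaled position/velocity space).
axis = np.linspace(0.0, 1.0, 5)
proto_points = np.array([(p, v) for p in axis for v in axis])
state = np.array([0.5, 0.5])  # sits exactly on the central proto-point

for lam in (0.01, 0.2, 10.0):
    x = proto_point_features(state, proto_points, lam)
    print(f"lambda={lam:>5}: largest feature weight = {x.max():.3f}")
```

With a very small lambda, nearly all of the weight lands on the single nearest proto-point, so nothing is shared between neighbors. With a very large lambda, all 25 features are almost equal, so different states become hard to tell apart.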

