Haipeng Luo (USC)
https://simons.berkeley.edu/talks/adv...
Data-Driven Decision Processes Boot Camp
The adversarial (a.k.a. non-stochastic) multi-armed bandit problem is an influential marriage between the online learning literature, which concerns sequential decision making without distributional assumptions, and the bandit literature, which concerns learning from partial-information feedback. This tutorial gives an overview of the theory and algorithms for this problem, starting from classical algorithms and their analysis, and then moving on to recent advances in data-dependent regret guarantees, structured bandits, bandits with switching costs, combining bandit algorithms, and more. Special focus is given to highlighting the similarities and differences between online learning with full-information feedback and online learning with bandit feedback.
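The description does not name the classical algorithms covered, but the standard starting point for adversarial bandits is Exp3: exponential weights over arms combined with importance-weighted loss estimates built from the single observed loss. As a rough illustration only (not taken from the talk, with the learning rate `eta` and the loss function as assumptions), a minimal sketch:

```python
import math
import random

def exp3(K, T, eta, loss_fn, rng=random):
    """Minimal Exp3 sketch for the adversarial K-armed bandit.

    Maintains exponential weights over arms; after each round, only the
    chosen arm's loss is observed, so an unbiased importance-weighted
    estimate loss / prob is used to update that arm's weight.
    """
    weights = [1.0] * K
    total_loss = 0.0
    for t in range(T):
        total = sum(weights)
        probs = [w / total for w in weights]
        arm = rng.choices(range(K), weights=probs)[0]  # sample from current distribution
        loss = loss_fn(t, arm)  # bandit feedback: only this arm's loss is revealed
        total_loss += loss
        est = loss / probs[arm]  # unbiased estimate of the full loss vector's entry
        weights[arm] *= math.exp(-eta * est)
    return total_loss
```

With losses in [0, 1] and a suitably tuned `eta`, this scheme attains expected regret of order sqrt(T K log K) against the best fixed arm, illustrating the sqrt(K) price of bandit feedback compared to the full-information exponential-weights bound of order sqrt(T log K).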