Self-Attention Equations - Math + Illustrations

Published: September 23, 2022
on the channel: ChrisMcCormickAI

I created this video as supplemental material for my new video course on Decoder-based Transformer models such as GPT-3. Check out the course here!
https://www.chrismccormick.ai/the-inn...

==== Overview ====
The mathematical equations for Multi-Headed Attention can be a little daunting, given the number of steps and variables involved! They're certainly a difficult place to start when trying to understand the algorithm for the first time.

In this tutorial, I’ll walk through an illustrated explanation of Multi-Headed Attention, but also show how each step maps to the original equations.
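
For reference, the equations in question are the standard ones from the original Transformer paper ("Attention Is All You Need", Vaswani et al., 2017); the notation below follows that paper, so it may differ slightly from what appears on screen in the video:

\mathrm{Attention}(Q, K, V) = \mathrm{softmax}\!\left(\frac{Q K^{\top}}{\sqrt{d_k}}\right) V

\mathrm{MultiHead}(Q, K, V) = \mathrm{Concat}(\mathrm{head}_1, \ldots, \mathrm{head}_h)\, W^{O}
\quad\text{where}\quad
\mathrm{head}_i = \mathrm{Attention}\!\left(Q W_i^{Q},\; K W_i^{K},\; V W_i^{V}\right)

Here Q, K, and V are the query, key, and value matrices, d_k is the key dimension used to scale the dot products, and the W matrices are the learned projections for each attention head.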

==== Pre-Reqs ====
This video assumes some familiarity with Self-Attention in Transformer models. If you're brand new to those concepts, you probably want to start with something like my GPT course linked above, or my "BERT Research" series on YouTube, to get all of the context you'll need.

==== Added Insights ====
I'll also share some new perspectives on what Attention is doing, based on a couple of "Bertology" papers I've studied, plus my own interpretation of the math. I believe these insights are particularly helpful for understanding how Attention can be applied beyond just the Self-Attention mechanism in NLP language models like BERT and GPT.

==== Student Discount ====
As mentioned in the video, students / low income learners can apply for financial aid here:
https://www.chrismccormick.ai/student...

