Coding a Paper - Ep. 4: Adding in Position Embeddings

Published: February 15, 2024
on the channel: ChrisMcCormickAI

In the last episode we built self-attention, but left out a key ingredient: position embeddings. On its own, self-attention gives the model no information about where words sit in a sentence relative to one another. “John loves Mary,” “Mary loves John,” and “loves John Mary” all look the same to self-attention! That’s why we add explicit information about word order with position embeddings.
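
If you want to convince yourself of this, here's a quick standalone PyTorch sketch (my own, not the episode's notebook): it runs a bare single-head attention over three made-up token vectors, permutes the tokens, and shows that the outputs are just the same vectors in the new order.

```python
# Minimal sketch: plain self-attention is permutation-equivariant, so
# shuffling the input tokens only shuffles the outputs -- word order
# carries no signal by itself.
import torch
import torch.nn.functional as F

torch.manual_seed(0)

seq_len, d_model = 3, 8            # e.g. the tokens "John", "loves", "Mary"
x = torch.randn(seq_len, d_model)  # pretend token embeddings (no position info)

def self_attention(x):
    # Single head, no learned projections, to keep the point visible.
    scores = x @ x.T / d_model ** 0.5
    return F.softmax(scores, dim=-1) @ x

perm = torch.tensor([2, 0, 1])     # a "loves John Mary" style reordering

out = self_attention(x)
out_permuted = self_attention(x[perm])

# Each token gets exactly the same output vector, just in the new order.
print(torch.allclose(out[perm], out_permuted, atol=1e-6))  # True
```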

Memorizing Transformers uses a positional encoding scheme from the T5 paper called relative position bias. In this video we’ll refer back to the T5 paper and build this position embedding scheme line by line.
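
As a preview, here's a rough sketch of the T5-style bucketing logic, closely following the public Hugging Face T5 implementation; the version we build in the video may differ in naming and details.

```python
# Rough sketch of T5-style relative position bucketing (based on the
# Hugging Face T5 implementation, not necessarily the notebook's code).
import math
import torch

def relative_position_bucket(relative_position, bidirectional=True,
                             num_buckets=32, max_distance=128):
    """Map relative positions (key_pos - query_pos) to bucket indices.

    Nearby offsets get their own exact bucket; distant offsets share
    logarithmically sized buckets out to max_distance.
    """
    relative_buckets = torch.zeros_like(relative_position)
    if bidirectional:
        # Reserve half the buckets for "key is to the right of the query".
        num_buckets //= 2
        relative_buckets += (relative_position > 0).long() * num_buckets
        relative_position = torch.abs(relative_position)
    else:
        # Causal case: only look at the past, clamp future offsets to 0.
        relative_position = -torch.min(relative_position,
                                       torch.zeros_like(relative_position))

    # First half of the buckets are the exact offsets 0, 1, ..., max_exact - 1.
    max_exact = num_buckets // 2
    is_small = relative_position < max_exact

    # Remaining buckets cover offsets up to max_distance on a log scale.
    relative_position_if_large = max_exact + (
        torch.log(relative_position.float() / max_exact)
        / math.log(max_distance / max_exact)
        * (num_buckets - max_exact)
    ).long()
    relative_position_if_large = torch.min(
        relative_position_if_large,
        torch.full_like(relative_position_if_large, num_buckets - 1))

    relative_buckets += torch.where(is_small, relative_position,
                                    relative_position_if_large)
    return relative_buckets

# Example: bucket ids for a 6-token sequence (queries x keys).
positions = torch.arange(6)
rel = positions[None, :] - positions[:, None]   # key_pos - query_pos
print(relative_position_bucket(rel))
```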

Links:
Link to Colab Notebook: https://colab.research.google.com/dri...
You can follow me on Twitter: @nickcdryan
Check out the membership site for a full course version of the series (coming soon) and lots of other NLP content and code! https://www.chrismccormick.ai/membership


Chapters:
00:00 introduction
00:33 what is relative position bias?
01:57 T5 relative position bias
04:25 building a “vanilla” relative position matrix
09:23 first mask for exact indices
10:59 second mask for log scaled indices
13:30 creating a T5 relative position matrix
14:46 initialize the positional embedding weights
17:27 reshape embeddings for multihead self-attention
19:25 result: relative position embedding class (sketched below)
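
To give a sense of where the final chapter ends up, here's one possible shape for that class (my own sketch, reusing the relative_position_bucket function from the snippet above; the notebook's version may differ): it owns a small (num_buckets x num_heads) embedding table and returns a bias of shape (1, num_heads, query_len, key_len) that is added to the multi-head attention scores.

```python
# Sketch of a relative position bias module, assuming the
# relative_position_bucket function defined in the earlier snippet.
import torch
import torch.nn as nn

class RelativePositionBias(nn.Module):
    def __init__(self, num_heads, num_buckets=32, max_distance=128,
                 bidirectional=False):   # causal/decoder-style by default here
        super().__init__()
        self.num_heads = num_heads
        self.num_buckets = num_buckets
        self.max_distance = max_distance
        self.bidirectional = bidirectional
        # One learned scalar bias per (bucket, head) pair.
        self.embedding = nn.Embedding(num_buckets, num_heads)

    def forward(self, query_len, key_len):
        query_pos = torch.arange(query_len)[:, None]
        key_pos = torch.arange(key_len)[None, :]
        rel = key_pos - query_pos                        # (query_len, key_len)
        buckets = relative_position_bucket(
            rel, bidirectional=self.bidirectional,
            num_buckets=self.num_buckets, max_distance=self.max_distance)
        bias = self.embedding(buckets)                   # (q, k, num_heads)
        # Reshape to (1, num_heads, q, k) so it broadcasts over the batch
        # and adds directly onto the multi-head attention scores.
        return bias.permute(2, 0, 1).unsqueeze(0)

# Usage: scores = q @ k.transpose(-2, -1) / d_head**0.5 + rel_bias(q_len, k_len)
rel_bias = RelativePositionBias(num_heads=8)
print(rel_bias(6, 6).shape)   # torch.Size([1, 8, 6, 6])
```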

