BI-DIRECTIONAL ATTENTION | Explained at a high level

Published: 23 October 2022
Channel: Data Science Garage
1,359 views
24 likes

This video tutorial explains how Bi-Directional Attention works in NLP. This attention mechanism is very similar to the Self-attention method introduced in the previous video. The main difference is that Bi-Directional attention does not have the masking operation that Self-attention has.

Also, while the (masked) self-attention covered previously looks only at the previous words (or tokens), Bi-Directional attention looks at tokens on both sides. That is why this mechanism is called Bi-Directional.
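To make that contrast concrete, here is a minimal PyTorch sketch (not code from the video): the same scaled dot-product attention becomes bi-directional simply by leaving out the causal mask.

```python
# Minimal sketch: masked Self-attention vs. Bi-Directional attention.
# The only difference is whether a causal mask is applied before softmax.
import torch
import torch.nn.functional as F

def attention(Q, K, V, causal_mask=False):
    # Scaled dot-product attention scores: one row per query token.
    scores = Q @ K.transpose(-2, -1) / K.size(-1) ** 0.5
    if causal_mask:
        # Masked (self-attention) case: each token may only attend
        # to itself and to the tokens before it.
        mask = torch.triu(torch.ones_like(scores, dtype=torch.bool), diagonal=1)
        scores = scores.masked_fill(mask, float("-inf"))
    # Without the mask, every token attends to both sides: bi-directional.
    return F.softmax(scores, dim=-1) @ V

# Toy example: 4 tokens with 8-dimensional embeddings.
x = torch.randn(4, 8)
bidirectional_out = attention(x, x, x, causal_mask=False)
masked_out = attention(x, x, x, causal_mask=True)
```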

This video does not cover the math behind the method; it explains, at a high level, the logic of how Bi-Directional attention works. To implement the method in practice, there are ready-made Python packages (see the example below).

Bi-Directional attention is widely used in BERT, which stands for Bidirectional Encoder Representations from Transformers.
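As a small illustration of the kind of Python packages mentioned above (this is a generic Hugging Face Transformers snippet, not necessarily the package used in the video), loading a pre-trained BERT encoder gives you bi-directional contextual token embeddings out of the box:

```python
# Hedged example: pre-trained BERT via Hugging Face Transformers.
# Because BERT's encoder uses bi-directional attention, each token's
# vector already reflects context from both its left and its right.
from transformers import AutoTokenizer, AutoModel

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased")

inputs = tokenizer("Bi-directional attention looks both ways.", return_tensors="pt")
outputs = model(**inputs)

# One contextual embedding per token: shape (1, num_tokens, 768).
print(outputs.last_hidden_state.shape)
```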

This is the 3rd video in the mini course about Attention in NLP. Check out the previous ones:
1. Encoder-Decoder attention and Dot-Product:    • ENCODER-DECODER Attention in NLP | Ho...  
2. Self Attention:    • SELF-ATTENTION in NLP | How does it w...  
3. Bi-Directional Attention (this one).
4. Multi-Head attention (Upcoming).

You can read more about Bi-Directional Attention in the following sources:
Stanford University: Bidirectional Attention Flow with Self-Attention: https://web.stanford.edu/class/archiv...
Medium.com article (BiDAF): https://towardsdatascience.com/the-de...

See you! - ‪@DataScienceGarage‬

#attention #nlp #bidirectional #tokenizer #bert #selfattention #BiDAF #multihead #python #dotproduct #naturallanguageprocessing
