In this video, we are going to implement the GPT-2 model from scratch. We focus only on inference, not on the training logic. We will cover concepts like masked self-attention, decoder blocks, and generating new tokens.
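A minimal sketch of the masked self-attention idea covered in the video. This is a simplified single-head version (the actual minGPT/video code is multi-head with dropout); the class name and shapes here are illustrative assumptions, not the video's exact code.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class MaskedSelfAttention(nn.Module):
    """Single-head causal self-attention (simplified, illustrative)."""

    def __init__(self, embed_dim, max_len):
        super().__init__()
        self.qkv = nn.Linear(embed_dim, 3 * embed_dim)
        self.scale = embed_dim ** -0.5
        # Causal mask: position i may only attend to positions <= i.
        mask = torch.tril(torch.ones(max_len, max_len)).bool()
        self.register_buffer("mask", mask)

    def forward(self, x):
        # x: (batch, seq_len, embed_dim)
        b, t, d = x.shape
        q, k, v = self.qkv(x).chunk(3, dim=-1)
        att = (q @ k.transpose(-2, -1)) * self.scale          # (b, t, t)
        att = att.masked_fill(~self.mask[:t, :t], float("-inf"))
        att = F.softmax(att, dim=-1)
        return att @ v                                        # (b, t, d)
```

Because of the lower-triangular mask, changing a later token never affects the output at earlier positions, which is what makes autoregressive generation possible.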
Paper: https://openai.com/blog/better-langua...
Code minGPT: https://github.com/karpathy/minGPT
Code transformers: https://github.com/huggingface/transf...
Code from the video: https://github.com/jankrepl/mildlyove...
00:00 Intro
01:32 Overview: Main goal [slides]
02:06 Overview: Forward pass [slides]
03:39 Overview: GPT module (part 1) [slides]
04:28 Overview: GPT module (part 2) [slides]
05:25 Overview: Decoder block [slides]
06:10 Overview: Masked self attention [slides]
07:52 Decoder module [code]
13:40 GPT module [code]
18:19 Copying a tensor [code]
19:26 Copying a Decoder module [code]
21:04 Copying a GPT module [code]
22:13 Checking if copying works [code]
26:01 Generating token strategies [demo]
29:10 Generating a token function [code]
32:34 Script (copying + generating) [code]
35:59 Results: Running the script [demo]
40:50 Outro
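The token-generating strategies demoed at 26:01 (greedy decoding as top-k with k=1, sampling with temperature, top-k sampling) can be sketched roughly like this. The `generate` helper and its signature are assumptions for illustration, not the video's exact function; `model` is assumed to return logits of shape (batch, seq_len, vocab_size).

```python
import torch

@torch.no_grad()
def generate(model, ids, n_new, top_k=None, temperature=1.0):
    """Autoregressively append n_new tokens to ids (illustrative helper)."""
    for _ in range(n_new):
        logits = model(ids)[:, -1, :] / temperature   # logits of the last position
        if top_k is not None:
            # Keep only the top_k highest logits; mask the rest out.
            kth = torch.topk(logits, top_k).values[:, -1, None]
            logits = logits.masked_fill(logits < kth, float("-inf"))
        probs = torch.softmax(logits, dim=-1)
        next_id = torch.multinomial(probs, num_samples=1)  # sample one token
        ids = torch.cat([ids, next_id], dim=1)
    return ids
```

With `top_k=1` this reduces to greedy decoding; a lower `temperature` sharpens the distribution, a higher one flattens it.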
If you have any video suggestions or you just want to chat, feel free to join the Discord server: / discord
Twitter: / moverfitted
Credits logo animation
Title: Conjungation · Author: Uncle Milk · Source: / unclemilk · License: https://creativecommons.org/licenses/... · Download (9MB): https://auboutdufil.com/?id=600
Video by mildlyoverfitted, published 31 January 2022.