Let's build GPT with memory: learn to code a custom LLM (Coding a Paper - Ep. 1)

Published: January 18, 2024
Channel: ChrisMcCormickAI
Views: 11,441
Likes: 524

You've used an LLM before, and you might've even fine-tuned one, but... have you ever built one yourself? How do you start from scratch and turn a new research idea into a working model? This is the skill used by industry and academic researchers to turn cutting-edge research ideas into production-quality code. That's what we'll do in this series: we're going to implement a Google research paper!

By the end of this course you'll have a deep understanding of how a production-grade transformer model like GPT works end to end. You'll also have the ability to implement new research papers, implement your own research ideas, and comfortably modify and experiment with existing implementations.

Along the way we'll cover lots of technical and non-technical topics to develop this skillset:
A lot of PyTorch
How to select and critically read a paper / research idea
How to get up to speed in an unfamiliar area of research with tools and tricks
A recipe for breaking down big research papers into achievable, bite-size chunks
Tips and tricks I wish I had known
Understanding and implementing recurrence
Building GPT from scratch
A survey of position embedding research
Building a KNN vector database for memory
How to put it all together into a working model

Links:
Link to Colab Notebook: https://colab.research.google.com/dri...
You can follow me on Twitter: / nickcdryan
Check out the membership site for a full course version of the series (coming soon) and lots of other NLP content and code! https://www.chrismccormick.ai/membership

Chapters:
00:00 introduction and goals of the course
02:08 why this course?
03:14 prerequisites and what you should know beforehand
03:40 how to get the most value out of these videos
05:44 our high level plan for this course
06:29 tips for picking a paper
08:00 reference code implementations
08:56 tip for reading research papers
10:03 guide your reading with checklist questions
11:43 what’s the main idea of memorizing transformers?
12:17 what’s the motivation for memorizing transformers?
14:36 what’s the proposed solution?
17:03 what’s the main contribution of the paper?
19:58 how do they measure success of the model?
21:24 wait: how good is this model?
24:55 more follow up questions about the paper
26:45 let’s read the paper!
37:28 summary of what we’ll build
38:37 tips for building effectively
39:43 what’s coming next!
