Let's build GPT with memory: learn to code a custom LLM (Coding a Paper - Ep. 1)

Published: 18 January 2024
on channel: ChrisMcCormickAI

You've used an LLM before, and you might've even fine-tuned one, but have you ever built one yourself? How do you start from scratch and turn a new research idea into a working model? This is the skill industry and academic researchers use to turn cutting-edge research ideas into production-quality code. That's what we'll do in this series: we're going to implement a Google research paper!

By the end of this course you'll have a deep understanding of how a production-grade transformer model like GPT works end to end. You'll also be able to implement new research papers, implement your own research ideas, and comfortably modify and experiment with existing implementations.

Along the way we'll cover lots of technical and non-technical topics to develop this skillset:
A lot of PyTorch
How to select and critically read a paper / research idea
How to get up to speed in an unfamiliar area of research with tools and tricks
A recipe for breaking down big research papers into achievable, bite-size chunks
Tips and tricks I wish I had known
Understanding and implementing recurrence
Building GPT from scratch
A survey of position embedding research
Building a KNN vector database for memory
How to put it all together into a working model

Links:
Link to Colab Notebook: https://colab.research.google.com/dri...
You can follow me on Twitter: / nickcdryan
Check out the membership site for a full course version of the series (coming soon) and lots of other NLP content and code! https://www.chrismccormick.ai/membership

Chapters:
00:00 introduction and goals of the course
02:08 why this course?
03:14 prerequisites and what you should know beforehand
03:40 how to get the most value out of these videos
05:44 our high level plan for this course
06:29 tips for picking a paper
08:00 reference code implementations
08:56 tip for reading research papers
10:03 guide your reading with checklist questions
11:43 what’s the main idea of memorizing transformers?
12:17 what’s the motivation for memorizing transformers?
14:36 what’s the proposed solution?
17:03 what’s the main contribution of the paper?
19:58 how do they measure success of the model?
21:24 wait: how good is this model?
24:55 more follow-up questions about the paper
26:45 let’s read the paper!
37:28 summary of what we’ll build
38:37 tips for building effectively
39:43 what’s coming next!

