Understanding STaR and how it powers Claude and Gemini/Gemma 2B (and maybe Q* or Strawberry)

STaR is short for Self-Taught Reasoner. It is rumored to power OpenAI's Q* (now Strawberry), and according to Chris it definitely powers Claude 3.5 Sonnet and the Gemma / Gemini models. In this video Chris breaks down how self-taught reasoning works and how it is used in the fine-tuning phase of a model to improve training. Chris also shows how you can use NVIDIA's Nemotron reward model to judge the outputs for STaR. If you want to understand how to use the same techniques that frontier AI models such as Anthropic Claude and Google Gemini / Gemma use to improve their fine-tuning, then check out this video.
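For readers who want a feel for the idea before watching, here is a minimal sketch of the filtering step at the heart of STaR: generate a rationale and answer for each question, keep only the samples whose answer matches the known one, and optionally drop rationales that a judge scores too low. Everything here is an assumption for illustration: `star_filter`, `toy_generate`, `toy_judge` and the 0.5 threshold are hypothetical names and values, the toy functions stand in for a real language model and a real reward model, and the rationalization step from the STaR paper (retrying failed questions with the answer as a hint) is left out. The kept samples are what you would then fine-tune on before repeating the loop.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Example:
    question: str
    answer: str  # known-correct final answer


@dataclass
class Sample:
    question: str
    rationale: str
    answer: str


def star_filter(
    dataset: List[Example],
    generate: Callable[[str], Tuple[str, str]],
    judge: Callable[[str, str], float],
    threshold: float = 0.5,
) -> List[Sample]:
    """Keep (question, rationale, answer) triples worth fine-tuning on."""
    kept = []
    for ex in dataset:
        rationale, predicted = generate(ex.question)
        # Core STaR filter: discard rationales that led to a wrong answer.
        if predicted.strip() != ex.answer.strip():
            continue
        # Extra filter (assumption): a judge scores the rationale itself.
        if judge(ex.question, rationale) < threshold:
            continue
        kept.append(Sample(ex.question, rationale, predicted))
    return kept


# Hypothetical stand-in for a language model: canned (rationale, answer) pairs.
def toy_generate(question: str) -> Tuple[str, str]:
    canned = {
        "2 + 3": ("2 plus 3 is 5.", "5"),
        "4 * 2": ("4 plus 2 is 6.", "6"),  # wrong answer, will be dropped
        "10 - 7": ("", "3"),               # right answer, empty rationale
    }
    return canned[question]


# Hypothetical stand-in for a reward model: scores any non-empty rationale 1.0.
def toy_judge(question: str, rationale: str) -> float:
    return 1.0 if rationale else 0.0


DATASET = [Example("2 + 3", "5"), Example("4 * 2", "8"), Example("10 - 7", "3")]
```

Calling `star_filter(DATASET, toy_generate, toy_judge)` on this toy data keeps only the "2 + 3" sample: the second question is dropped for a wrong answer and the third for a rationale the judge rejects.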
Video by Chris Hay, added 15 July 2024. Viewed 8,137 times and liked by 263 people.