Sasha Rush (Cornell University & Hugging Face)
https://simons.berkeley.edu/talks/sas...
Large Language Models and Transformers
Extrapolating scaling trends suggests that training dataset size for LLMs may soon be limited by the amount of text data available on the internet. In this talk we investigate scaling language models in data-constrained regimes. Specifically, we run a set of empirical experiments varying the extent of data repetition and the compute budget. From these experiments we propose and empirically validate a scaling law for compute optimality that accounts for the decreasing value of repeated tokens and excess parameters. Finally, we discuss and experiment with approaches for mitigating data scarcity.
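To make "the decreasing value of repeated tokens" concrete, here is a minimal sketch. It assumes an exponential-saturation form in which each extra pass over the same data contributes less than the one before. The function name, the functional form as written here, and the decay constant r_star are illustrative assumptions, not values taken from this description; the talk's fitted law may differ.

```python
import math


def effective_tokens(unique_tokens: float, epochs: float, r_star: float = 15.0) -> float:
    """Estimate how many 'fresh-equivalent' tokens a repeated dataset is worth.

    Assumption (illustrative, not from the talk description): the value of
    repeated passes decays exponentially, with r_star controlling how quickly
    repetition stops helping. r_star = 15.0 is a placeholder, not a fitted value.
    """
    if epochs < 1:
        raise ValueError("epochs must be at least 1 (one full pass over the data)")
    repeats = epochs - 1  # passes beyond the first
    # The first pass counts fully; repeated passes saturate toward r_star passes.
    return unique_tokens * (1.0 + r_star * (1.0 - math.exp(-repeats / r_star)))


if __name__ == "__main__":
    unique = 100e9  # hypothetical budget of 100B unique tokens
    for epochs in (1, 2, 4, 16, 64):
        total_seen = unique * epochs
        worth = effective_tokens(unique, epochs)
        print(f"epochs={epochs:>2}  tokens seen={total_seen:.3e}  effective={worth:.3e}")
```

Under this assumed form, a few epochs are worth almost as much as fresh data, while many epochs approach a ceiling of unique_tokens * (1 + r_star), so additional compute spent on repetition yields little.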
Talk title: Scaling Data-Constrained Language Models. Video posted by the Simons Institute.