Vitaly Feldman (Apple ML Research)
https://simons.berkeley.edu/node/22921
Societal Considerations and Applications
Deep learning algorithms that achieve state-of-the-art results on image and text recognition tasks tend to fit the entire training dataset (nearly) perfectly, including mislabeled examples and outliers. This propensity to memorize seemingly useless data, and the resulting large generalization gap, have puzzled many practitioners and are not explained by existing theories of machine learning. We provide a simple conceptual explanation and a theoretical model demonstrating that memorization of outliers and mislabeled examples is necessary for achieving close-to-optimal generalization error when learning from long-tailed data distributions. Image and text data are known to follow such distributions, and therefore our results establish a formal link between these empirical phenomena. We then demonstrate the utility of memorization and support our explanation empirically. These results rely on a new technique for efficiently estimating the memorization and influence of training data points. Our results allow us to quantify the cost of limiting memorization in learning and explain the disparate effects that privacy and model compression have on different subgroups.
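The abstract does not spell out how memorization is estimated, so the following is only a minimal sketch under stated assumptions, not the speaker's implementation. It assumes a subsampling-style estimator: train many models on random subsets of the data, then score each example by how much more often it is classified correctly when it was in the training subset than when it was left out. A 1-nearest-neighbor classifier stands in for a neural network, and every name here (`predict_1nn`, `estimate_memorization`, `n_models`, `subset_frac`) is hypothetical.

```python
import numpy as np

def predict_1nn(X_train, y_train, X_query):
    # Squared Euclidean distance from every query point to every training point.
    d = ((X_query[:, None, :] - X_train[None, :, :]) ** 2).sum(axis=2)
    # Each query takes the label of its nearest training point.
    return y_train[d.argmin(axis=1)]

def estimate_memorization(X, y, n_models=200, subset_frac=0.7, seed=0):
    # Assumed estimator: accuracy on example i over models trained WITH i,
    # minus accuracy on example i over models trained WITHOUT i.
    rng = np.random.default_rng(seed)
    n = len(y)
    m = int(subset_frac * n)
    hits_in, runs_in = np.zeros(n), np.zeros(n)
    hits_out, runs_out = np.zeros(n), np.zeros(n)
    for _ in range(n_models):
        mask = np.zeros(n, dtype=bool)
        mask[rng.choice(n, size=m, replace=False)] = True
        # "Train" on the subset (1-NN just stores it) and predict every example.
        hit = (predict_1nn(X[mask], y[mask], X) == y).astype(float)
        hits_in[mask] += hit[mask]
        runs_in[mask] += 1
        hits_out[~mask] += hit[~mask]
        runs_out[~mask] += 1
    # Guard against division by zero if an example was never in (or never out).
    acc_in = hits_in / np.maximum(runs_in, 1)
    acc_out = hits_out / np.maximum(runs_out, 1)
    return acc_in - acc_out

# Toy data (an assumption for illustration): two well-separated clusters,
# with one deliberately mislabeled point at index 0.
data_rng = np.random.default_rng(1)
X = np.vstack([data_rng.normal(-3.0, 0.5, (30, 2)),
               data_rng.normal(3.0, 0.5, (30, 2))])
y = np.array([0] * 30 + [1] * 30)
y[0] = 1  # flip the label of one point in the first cluster

scores = estimate_memorization(X, y)
print("score of mislabeled point:", scores[0])
print("median score:", np.median(scores))
```

In this toy setup the mislabeled point is only ever predicted "correctly" when the model has seen it, so its score should be at or near 1, while most clean points score near 0. The estimator reuses the same pool of models for every example, which is what makes it cheaper than retraining once per held-out point.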
Talk title: Chasing the Long Tail: What Neural Networks Memorize and Why. Video posted by Simons Institute.