Prediction as Data Compression

Опубликовано: 17 Сентябрь 2019
на канале: Google TechTalks
2,888
50

A Google TechTalk, 2019/08/20, presented by Jeffrey S. Vitter.
ABSTRACT: There is a close relationship between learning (or prediction) and data compression. The motivating intuition is that in order to compress data effectively, you need to have a good predictor for the data remaining to be processed, and thus good data compressors trained on previous events should predict future events well. We can use our predictor for applications such as prefetching pages from slow memory, deciding when to spin down a disk on a mobile computer, and estimating query output sizes in a database system.

We introduce universal predictors that are optimal in the limit in terms of error rate. Our novel approach is to use data compression techniques that are both theoretically optimal and good in practice. For prefetching on a sequence of page requests produced by a Markov source or mth-order Markov source (including those of infinite order), we show analytically that the page fault rates incurred by our prefetching algorithms are optimal in the limit for almost all sequences of page accesses. Simulations using data bases and CAD traces show that moderate use of this prefetching approach can reduce the page fault rate from the 80–100 percent range (with LRU caching only) to around 25–50 percent, achieving better rates than currently used prefetchers.

We can strengthen the result to the worst case and derive a randomized algorithm that we prove converges almost surely to the optimal page fault rate for every sequence, with respect to the class of finite state prefetchers. In particular we make no assumptions at all about how the sequence of page requests is generated. We can achieve the important computational goal of implementing our prefetcher in optimal constant expected time per prefetched page, using a novel application of the optimal and practical dynamic discrete random variate generator of Matias, Vitter, and Ni.

ABOUT THE SPEAKER: Dr. Jeffrey Vitter is Distinguished Professor of Computer and Information Science at the University of Mississippi (a.k.a. Ole Miss). He has 39 years of experience as a passionate advocate for higher education. He served on the faculty and in leadership roles at noted AAU universities Brown, Duke, Purdue, Texas A&M, and Kansas, before leading Ole Miss as its 17th chancellor from January 2016 to January 2019.

Fueled by his belief in the power of higher education to transform lives, communities, and the world, Dr. Vitter has charted the university's momentum through a dynamic strategic plan FlagshipForward.OleMiss.edu to achieve ever greater heights. Ole Miss is continuing its legacy of leadership in academic excellence with a $1 billion building program and capital planning, creation of superlative networks of faculty called Flagship Constellations, major engagement with communities through the M Partner initiative and the McLean Institute, and annual Technology Summits. Dr. Vitter has provided leadership to achieve a healthier Mississippi through greater capacity and reach of the University of Mississippi Medical Center.

Dr. Vitter is an internationally known computer scientist with research expertise in big data and data science, especially the algorithmic aspects of processing, compressing, and communicating massive amounts of information. He has been elected a Fellow of the Guggenheim Foundation, the National Academy of Inventors, the American Association for the Advancement of Science, the Association for Computing Machinery, and the Institute of Electrical and Electronics Engineers. He is a National Science Foundation Presidential Young Investigator, a Fulbright Scholar, and an IBM Faculty Development Awardee.

A native of New Orleans, Vitter graduated in mathematics with highest honors from the University of Notre Dame in 1977 and earned a Ph.D. under Don Knuth in computer science at Stanford University in 1980. He also holds an MBA in 2002 from Duke University.


Смотрите видео Prediction as Data Compression онлайн без регистрации, длительностью часов минут секунд в хорошем качестве. Это видео добавил пользователь Google TechTalks 17 Сентябрь 2019, не забудьте поделиться им ссылкой с друзьями и знакомыми, на нашем сайте его посмотрели 2,888 раз и оно понравилось 50 людям.