FAST '23 - InftyDedup: Scalable and Cost-Effective Cloud Tiering with Deduplication

Published: 02 March 2023
on channel: USENIX

InftyDedup: Scalable and Cost-Effective Cloud Tiering with Deduplication

Iwona Kotlarska, Andrzej Jackowski, Krzysztof Lichota, Michal Welnicki, and Cezary Dubnicki, 9LivesData, LLC; Konrad Iwanicki, University of Warsaw

Cloud tiering, the process of moving selected data from on-premises storage to the cloud, has recently become important for backup solutions. Because successive backups usually contain repeated data, deduplication in cloud tiering can significantly reduce cloud storage utilization, and hence costs.

In this paper, we introduce InftyDedup, a novel system for cloud tiering with deduplication. Unlike existing solutions, it maximizes scalability by utilizing cloud services not only for storage but also for computation. Following a distributed batch approach with dynamically assigned cloud computation resources, InftyDedup can deduplicate multi-petabyte backups from multiple sources at costs on the order of a couple of dollars. Moreover, by selecting between hot and cold cloud storage based on the characteristics of each data chunk, our solution further reduces the overall costs by up to 26%–44%. InftyDedup is implemented in a state-of-the-art commercial backup system and evaluated in the cloud of a hyperscaler.
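To make the hot-versus-cold choice concrete, here is a minimal sketch of a per-chunk cost comparison. It is not the algorithm from the paper: the cost model, the `TierPricing` fields, the `choose_tier` helper, and every price below are hypothetical values chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TierPricing:
    """Hypothetical per-GB prices; real cloud price lists differ."""
    storage_per_gb_month: float  # cost of keeping 1 GB for one month
    retrieval_per_gb: float      # cost of reading 1 GB back
    min_billed_months: float     # minimum storage duration that is billed


def tier_cost(pricing: TierPricing, size_gb: float,
              months_kept: float, expected_reads: float) -> float:
    """Estimated total cost of one chunk in the given tier."""
    # Cold tiers often bill a minimum duration even if data is deleted early.
    billed_months = max(months_kept, pricing.min_billed_months)
    storage = pricing.storage_per_gb_month * size_gb * billed_months
    retrieval = pricing.retrieval_per_gb * size_gb * expected_reads
    return storage + retrieval


def choose_tier(size_gb: float, months_kept: float, expected_reads: float,
                hot: TierPricing, cold: TierPricing) -> str:
    """Return 'cold' only when it is strictly cheaper than 'hot'."""
    hot_cost = tier_cost(hot, size_gb, months_kept, expected_reads)
    cold_cost = tier_cost(cold, size_gb, months_kept, expected_reads)
    return "cold" if cold_cost < hot_cost else "hot"


# Assumed example prices (not taken from any provider or from the paper).
HOT = TierPricing(storage_per_gb_month=0.02, retrieval_per_gb=0.0,
                  min_billed_months=0.0)
COLD = TierPricing(storage_per_gb_month=0.004, retrieval_per_gb=0.03,
                   min_billed_months=3.0)

if __name__ == "__main__":
    # A long-lived, never-read chunk versus a short-lived one.
    print(choose_tier(1.0, months_kept=12, expected_reads=0, hot=HOT, cold=COLD))
    print(choose_tier(1.0, months_kept=0.5, expected_reads=0, hot=HOT, cold=COLD))
```

Under these assumed prices, a chunk that is kept for a long time and rarely read favors cold storage, while a chunk that is deleted soon or restored often favors hot storage, which is the kind of per-chunk characteristic the abstract refers to.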

View the full FAST '23 Technical Sessions at https://www.usenix.org/conference/fast23

