Managing many files with Disk ARchiver (DAR)

Published: 09 May 2019
on channel: WestDRI

Large parallel filesystems found on HPC clusters, such as /home, /scratch and /project, have one weak spot: they were not designed for storing large numbers of small files. Because of this limitation, we always advise our users to reduce the number of files stored in their directories, either by instrumenting their code to write fewer, larger files, or by using an archive tool such as the classic Unix utility `tar` to pack their files into archives.
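As a rough illustration of the packing approach, the sketch below bundles a directory of small files into one compressed `tar` archive, counts its contents without unpacking, and restores a single file. The directory and file names (`results`, `file_42.txt`, `restore`) are made up for the example and are not from the webinar.

```shell
# Create a scratch directory with many small files (hypothetical test data).
workdir=$(mktemp -d)
mkdir "$workdir/results"
for i in $(seq 1 100); do
    echo "sample $i" > "$workdir/results/file_$i.txt"
done

# Pack the whole directory into a single gzip-compressed archive.
tar -czf "$workdir/results.tar.gz" -C "$workdir" results

# Count the archived .txt files without unpacking anything.
count=$(tar -tzf "$workdir/results.tar.gz" | grep -c '\.txt$')
echo "files in archive: $count"

# Restore a single file into a separate directory.
mkdir "$workdir/restore"
tar -xzf "$workdir/results.tar.gz" -C "$workdir/restore" results/file_42.txt
restored=$(cat "$workdir/restore/results/file_42.txt")
echo "restored content: $restored"
```

Once the archive has been verified, the original directory can be removed so that the filesystem holds one file instead of a hundred.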

There is a little-known but incredibly useful open-source tool called `dar` that was developed as a faster, modern replacement for `tar`. DAR stands for Disk ARchiver. It supports file indexing, differential and incremental backups, Linux file Access Control Lists (ACLs), compression, symmetric and public-key encryption, and remote archives, among many other features.

In this webinar, we walk through several use cases for `dar`, both on Compute Canada clusters and on your own laptop with a bash shell. We show you how to manage directories with many files, how to back up and restore your data, and other workflows.
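A basic archive, list and restore cycle with `dar` might look like the sketch below. It is not taken from the slides. The archive basename `full` and the directories `data` and `restore` are hypothetical, and the block skips the `dar` commands when the tool is not installed. The flags used are `-c` (create), `-l` (list), `-x` (extract), `-g` (path to include), `-R` (root directory), `-O` (ignore ownership) and `-w` (do not warn before overwriting); check `man dar` for the version on your system.

```shell
# Set up a small hypothetical directory to archive.
workdir=$(mktemp -d)
mkdir -p "$workdir/data" "$workdir/restore"
for i in 1 2 3; do
    echo "record $i" > "$workdir/data/file_$i.txt"
done
cd "$workdir"

status="dar-not-installed"
if command -v dar >/dev/null 2>&1; then
    # Create an archive with basename "full"; dar writes the slice full.1.dar.
    dar -w -c full -g data

    # List the archive contents using dar's built-in index.
    dar -l full

    # Restore a single file under restore/, ignoring file ownership.
    dar -R restore -O -w -x full -g data/file_2.txt
    status=$(cat restore/data/file_2.txt)
fi
echo "result: $status"
```

Because `dar` keeps an index of the archive, listing or extracting one file should not require reading the archive from start to finish, which is the main practical difference from the `tar` workflow.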

To view or download the slides from this presentation, visit:
https://westgrid.github.io/trainingMa...

For information on other WestGrid events, visit:
https://www.westgrid.ca/events

Connect with WestGrid:
Mailing List - http://eepurl.com/dusEGr
Website - https://www.westgrid.ca
Technical Support - [email protected]
General Enquiries - [email protected]
Twitter - @WestGrid

