While backing up my Linux system with tar -czv, I noticed that the process stalls for a very long time at /var/lib/docker/devicemapper/devicemapper/data (far longer than copying all my images and containers should take), and the output file does not grow in the meantime.

ls -lh reports the size of this file as 100G, yet it sits on a 20G partition. What kind of file is this, and what is tar doing here?

Kamil Maciorowski
Milleu

1 Answer

It looks like you have a sparse file there.

A sparse file is a type of computer file that attempts to use file system space more efficiently when the file itself is mostly empty. This is achieved by writing brief information (metadata) representing the empty blocks to disk instead of the actual "empty" space which makes up the block, using less disk space. The full block size is written to disk as the actual size only when the block contains "real" (non-empty) data.
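To see the difference between the size a file claims and the space it really occupies, here is a minimal sketch. It assumes GNU coreutils (truncate, stat -c) and a filesystem that supports holes; the file name sparse.img and the 100M size are made up for the demonstration.

```shell
# Create a 100 MiB file that is one big hole: no data blocks are written.
tmpdir=$(mktemp -d)
truncate -s 100M "$tmpdir/sparse.img"

# Apparent size in bytes: this is what ls -l reports.
apparent=$(stat -c %s "$tmpdir/sparse.img")

# Allocated space: number of blocks in use times the size of one such block.
blocks=$(stat -c %b "$tmpdir/sparse.img")
block_size=$(stat -c %B "$tmpdir/sparse.img")
allocated=$((blocks * block_size))

echo "apparent size:  $apparent bytes"
echo "allocated size: $allocated bytes"

rm -rf "$tmpdir"
```

On your system, comparing ls -lh with du -h on the data file should show the same kind of gap: 100G apparent, far less actually allocated.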

Another answer says:

The /var/lib/docker/devicemapper/devicemapper directory contains the sparse loop files that contain all the other data that docker mounts.

Your file consists mostly of empty blocks (all zeros), and as a sparse file it can fit into your small partition. Apparently tar simply reads through all the zeros and passes them on to the compressor. Zeros compress extremely well, so the archive barely grows while tar works through them.
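A quick way to convince yourself how little output a long run of zeros produces is to pipe some through gzip, the compressor that tar -z uses. This is only a sketch; the 10 MiB figure is arbitrary.

```shell
# Compress 10 MiB of zeros and count the bytes that come out.
original=$((10 * 1024 * 1024))
compressed=$(head -c "$original" /dev/zero | gzip -c | wc -c)

echo "original:   $original bytes"
echo "compressed: $compressed bytes"
```

The compressed size should be a tiny fraction of the input, which matches an archive that hardly grows while tar is busy reading.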

tar has a command-line option that makes it aware of sparse files. The GNU tar manual describes it:

-S
--sparse

In your case the following is very important, I think:

On extraction (…) any such files have also holes created wherever the holes were found. (…) Consider using --sparse when performing file system backups, to avoid archiving the expanded forms of files stored sparsely in the system.

I guess you didn't use the --sparse option, so when it comes to extraction your 100G file will be created as non-sparse and it won't fit in the 20G partition.
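To try the effect on a small scale before redoing the whole backup, something like the following sketch could help. It assumes GNU tar and GNU coreutils; the file names and the 50M size are invented for the test, and compression is left out so that the archive sizes show the difference directly.

```shell
# Build a small sparse test file in a scratch directory.
work=$(mktemp -d)
truncate -s 50M "$work/sparse.img"

# Archive it once without and once with -S (--sparse).
tar -C "$work" -cf "$work/plain.tar" sparse.img
tar -C "$work" -cSf "$work/sparse.tar" sparse.img

plain_size=$(stat -c %s "$work/plain.tar")
sparse_size=$(stat -c %s "$work/sparse.tar")
echo "archive without -S: $plain_size bytes"
echo "archive with -S:    $sparse_size bytes"

# Extract the sparse-aware archive and look at the restored file.
mkdir "$work/out"
tar -C "$work/out" -xf "$work/sparse.tar"
restored_apparent=$(stat -c %s "$work/out/sparse.img")
restored_blocks=$(stat -c %b "$work/out/sparse.img")
echo "restored apparent size: $restored_apparent bytes"
echo "restored blocks in use: $restored_blocks"

rm -rf "$work"
```

For the real backup this would mean adding S to your options, for example tar -czvS, so that the holes are recorded as holes and recreated on extraction.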

Kamil Maciorowski