84

I wanted to back up a path from one computer on my network to another computer on the same network over a 100 Mbit/s line. For this I did

dd if=/local/path of=/remote/path/in/local/network/backup.img

which gave me a very low network transfer speed of about 50 to 100 kB/s, which would have taken forever. So I stopped it and decided to try gzipping it on the fly, so that there would be much less data to transfer. So I did

dd if=/local/path | gzip > /remote/path/in/local/network/backup.img.gz

But now I get something like 1 MB/s network transfer speed, a factor of 10 to 20 faster. After noticing this, I tested it on several paths and files, and it was always the same.

Why does piping dd through gzip also increase the transfer rate by a large factor, instead of only reducing the byte length of the stream by a large factor? I'd have expected a small decrease in transfer rate instead, due to the higher CPU consumption while compressing, but now I get a double plus. Not that I'm not happy, but I am just wondering. ;)

Foo Bar
  • 1
    512 bytes was the standard block size for file storage in early Unix. Since everything is a file in Unix/Linux, it became the default for just about everything. Newer versions of most utilities have increased that but not dd. – DocSalvager Jun 05 '14 at 20:04
  • The simple answer is that `dd` is outputting at 1MB/s... right into the waiting `gzip` pipe. It's got very little to do with block size. – Tullo_x86 Oct 21 '16 at 04:59
  • Actually, this doesn't have to be the case. I did real HDD imaging using `dd`, and `gzip` (even with `--fast`) ramped the CPU to 100% and slowed down the transfer speed by a factor of at least 7. – Cadoiz Jan 19 '21 at 08:06

4 Answers

107

dd by default uses a very small block size -- 512 bytes (!!). That is, a lot of small reads and writes. It seems that dd, used naively in your first example, was generating a great number of network packets with a very small payload, thus reducing throughput.

On the other hand, gzip is smart enough to do I/O with larger buffers. That is, a smaller number of big writes over the network.

Can you try dd again with a larger bs= parameter and see if it works better this time?
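
For example, a minimal sketch using the paths from the question (`bs=1M` is just a common starting point here, not a tuned value):

dd if=/local/path of=/remote/path/in/local/network/backup.img bs=1M

With 1 MiB blocks, each write to the network filesystem carries a much larger payload, so far fewer network round trips are needed for the same amount of data.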

  • 21
    Thanks, tried a direct copy *without* `gzip` and a block size of `bs=10M` -> fast network transfer of about 3 or 4 MB/s. Higher block size + `gzip` did not change anything compared to small block size + `gzip`. – Foo Bar May 29 '14 at 14:27
  • 7
    If you want to see what large block sizes do, try another `dd` after the `gzip`. – Joshua May 29 '14 at 16:05
  • Is gzip doing its own output buffering, or does it just use stdio? – Barmar May 30 '14 at 19:42
  • @Barmar If I'm reading the source correctly, it simply `write(3)`s to the buffer. – Jun 03 '14 at 12:28
  • @CongMa you can also try using pigz instead of gzip; it will work even faster. – GioMac Jan 28 '16 at 14:19
  • Never mind the fact that the transfer rate is being reported by `dd`, which is no longer responsible for the network bottleneck. – Tullo_x86 Oct 21 '16 at 04:58
5

Bit late to this but might I add...

In an interview I was once asked what would be the quickest possible method for cloning data bit-for-bit, and of course responded with the use of dd or dc3dd (DoD funded). The interviewer confirmed that piping dd to dd is more efficient, as this simply permits simultaneous read/write, or in programmer terms stdin/stdout, thus ultimately doubling write speeds and halving transfer time.

dc3dd verb=on if=/media/backup.img | dc3dd of=/dev/sdb
Get-Tek
  • 1
    I don't think that's true. I just tried it now. `dd status=progress if=/dev/zero count=100000 bs=1M of=/dev/null` was 22.5 GB/s; `dd status=progress if=/dev/zero count=100000 bs=1M | dd of=/dev/null bs=1M` was 2.7 GB/s. So the pipe makes it slower. – falsePockets Feb 25 '19 at 00:27
  • 1
    Yeah, I thought maybe it was true for a real file on disk as opposed to /dev/zero and /dev/null, but I got similar results: `dd if=somefile.tmp of=somefile2.tmp bs=8M` was 1.1 GB/s, while `dd if=somefile.tmp bs=8M | dd of=somefile2.tmp bs=8M` was 317 MB/s. I guess it depends on the filesystem, kernel, and/or version of `dd`? This was on a 6-disk RAID10 of enterprise Intel SSDs, so 1 GB/s is expected. `cp somefile.tmp somefile2.tmp` took about the same amount of time, give or take, as `dd` without a pipe. – s.co.tt Apr 16 '21 at 00:31
  • I would expect that `dd` has the capability for simultaneous read/write without piping. That's why you're getting the numbers you get. On the other hand, if you can read, compress, and write, you should be able to save some time, if the compression algorithm is fast enough. – Nathan Garabedian Jul 22 '23 at 03:56
0

I assume here that the "transfer speed" you're referring to is being reported by dd. This does make sense, because dd really is transferring 10x the amount of data per second! However, dd is not transferring it over the network -- that job is being handled by the gzip process.

Some context: gzip will consume data from its input pipe as fast as it can clear its internal buffer. The speed at which gzip's buffer empties depends on a few factors:

  • The I/O write bandwidth (which is bottlenecked by the network, and has remained constant)
  • The I/O read bandwidth (which is going to be far higher than 1MB/s reading from a local disk on a modern machine, thus is not a likely bottleneck)
  • Its compression ratio (which I will assume by your 10x speedup to be around 10%, indicating that you're compressing some kind of highly-repetitive text like a log file or some XML)

So in this case, the network can handle 100kB/s, and gzip is compressing the data around 10:1 (and isn't being bottlenecked by the CPU). This means that while it is outputting 100kB/s, gzip can consume 1MB/s, and the rate of consumption is what dd can see.
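
To watch both rates at once, here is a rough sketch using `pv` (assuming it is installed; `-c` keeps the two gauges from overwriting each other and `-N` just labels them, and the paths are the ones from the question):

dd if=/local/path | pv -cN raw | gzip | pv -cN gzipped > /remote/path/in/local/network/backup.img.gz

The `raw` gauge should show roughly the rate dd reports, while the `gzipped` gauge shows what actually crosses the network.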

Tullo_x86
0

Cong is correct. You are streaming the blocks off the disk uncompressed to a remote host, so your network interface, the network, and your remote server are the limitation. First you need to get dd's performance up. Specifying a bs= parameter that aligns with the disk's buffer memory will get the most performance from the disk -- say bs=32M, for instance. This will then fill gzip's buffer at SATA or SAS line rate straight from the drive's buffer, and the disk will be more inclined to sequential transfers, giving better throughput. Gzip will compress the data in the stream and send it to your location.

How you send it matters too: if you are using NFS, the NFS transmission will be minimal; if you are using SSH, you incur the SSH encapsulation and encryption overhead; if you use netcat, you have no encryption overhead.
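
A rough sketch of the netcat variant (the hostname `receiver-host` and port `9000` are placeholders; traditional netcat wants `-p` before the port, while BSD netcat takes the port as a plain argument):

# on the receiving host
nc -l -p 9000 > backup.img.gz

# on the sending host
dd if=/local/path bs=32M | gzip | nc receiver-host 9000

Since nothing in this pipeline encrypts the stream, it should only be used on a trusted local network.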

Robert