42

So here's what's happening.

I started a backup of a drive on my server from a Linux live USB. I began copying the first drive with plain dd, just sudo dd if=/dev/sda of=/dev/sdc1, and then remembered that this leaves the console blank until it finishes.

I needed to run a different backup to the same drive anyway, so I started that one as well with sudo dd if=/dev/sdb of=/dev/sdc3 status=progress, which gave me a line of text showing the current transfer rate and the progress in bytes.

I was hoping for a method that shows the backup progress as a percentage, so I don't have to work out how many bytes have been copied out of 1.8 TB. Is there an easier way to do this than status=progress?

Hastur

5 Answers

69

See the answers to this question [1].

pv

For example, you can use pv from the start:

sudo apt-get install pv    # if you do not have it
# run the next two as root: the shell, not pv, opens the redirected devices
pv < /dev/sda > /dev/sdc3  # it is reported to be faster
pv /dev/sda > /dev/sdc3    # it seems to have the same speed as the previous one
# or
sudo dd if=/dev/sda | pv -s 1844G | sudo dd of=/dev/sdc3  # maybe slower

Output [2]:

440MB 0:00:38 [11.6MB/s] [======>                             ] 21% ETA 0:02:19

Notes:
Especially for large transfers, you may want to read man dd and set the options that speed things up on your hardware, e.g. bs=100M to set the block size, oflag=sync so that the byte count reflects what has actually been written, possibly direct...
The -s option only takes integer values, so 1.8T becomes 1844G.
As the first two commands show, you do not need dd at all.
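
The 1844G figure was worked out by hand. As a rough sketch, the size can also be derived from the device itself. This assumes a Linux system where blockdev (from util-linux) is installed; the helper name bytes_to_gib_ceil is made up for this example and is not part of pv or dd.

```shell
# Hypothetical helper: round a byte count up to whole GiB, the unit that
# pv's "G" suffix stands for (1 GiB = 1073741824 bytes).
bytes_to_gib_ceil() {
    echo $(( ($1 + 1073741823) / 1073741824 ))
}

# Example use (not run here; it needs root and real devices):
#   size=$(sudo blockdev --getsize64 /dev/sda)
#   sudo dd if=/dev/sda | pv -s "$(bytes_to_gib_ceil "$size")G" | sudo dd of=/dev/sdc3
```

Since -s also accepts a plain byte count, passing "$size" directly should work as well and avoids the rounding.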


kill -USR1 pid

If you have already launched dd, first find its PID (suspend it with Ctrl-Z, resume it with bg and note the PID, or use pgrep ^dd ...). You can then send it the USR1 signal (also written SIGUSR1; on some systems SIGINFO, see below) and read the output.
If the PID of the program is 1234, then after

kill -USR1 1234

dd will print something like the following to its stderr, i.e. on the terminal it was started from:

4+1 records in
4+0 records out
41943040 bytes (42 MB) copied, 2.90588 s, 14.4 MB/s

Warning: on OpenBSD, check the behaviour of kill in advance [3] and use kill -SIGINFO 1234 instead.
There the status signal is called SIGINFO; SIGUSR1 would terminate the program (dd)...
On Ubuntu, use -SIGUSR1 (signal 10).
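
To avoid sending the wrong signal, the choice can be wrapped in a small function. This is only a sketch: dd_status_signal is a made-up name, and the list of systems is an assumption based on GNU dd using USR1 on Linux and the BSD-derived dd implementations using INFO. Always confirm with man dd on the machine in question.

```shell
# Hypothetical helper: print the signal name that asks dd for a status
# report. The OS name defaults to `uname -s`; pass one to check another OS.
dd_status_signal() {
    os="${1:-$(uname -s)}"
    case "$os" in
        Linux)
            echo USR1 ;;
        OpenBSD|FreeBSD|NetBSD|Darwin)
            echo INFO ;;
        *)
            # Refuse to guess: the wrong signal may kill the copy.
            echo "unknown OS '$os': check 'man dd' first" >&2
            return 1 ;;
    esac
}

# Example use (not run here):
#   sig=$(dd_status_signal) && kill -"$sig" 1234
```

The function deliberately fails on systems it does not know, so a typo or an unusual platform never turns into a terminated backup.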

Hastur
  • Thank you for the help! This will definitely help with the whole process. I will try the pv < /dev/sda > /dev/sdc3 method and hope that it's as fast as reported. I had to cancel the last run of this and turn the server back on today because everyone in my office had been complaining, but this will give me a definite percentage to fall back on when I am not sure how much time is left to tell them. I'm interested to see the ETA when I get it going again this Friday! hahaha. –  Jan 30 '18 at 23:14
  • 9
    you'll almost certainly find that using 'bs' on the dd command hugely speeds it up. Like dd if=/dev/blah of=/tmp/blah bs=100M to transfer 100M blocks at a time – Sirex Jan 31 '18 at 01:49
  • 1
    @Sirex Of course you have to set the bs to optimize the transfer rate in relation with your hardware... In the answer is just repeated the commandline of the OP. :-) – Hastur Jan 31 '18 at 08:05
  • Note that you can also just do `pv /dev/sda` directly if you want. Also, on the output, you *may* want to add `oflag=sync`, otherwise the command completes really quickly, and then sits there silently flushing for ages. The sync flag makes it wait for the data to *actually* write to disk. – MathematicalOrchid Jan 31 '18 at 09:15
  • Excellent answer. Do note that signalling USR1 to dd can take a while to process. I've done it writing to a USB drive, and the answer only appeared after the writes were finished. – Criggie Feb 01 '18 at 01:25
  • @Hastur bs doesn’t matter in OP’s case, only if e.g. he is piping his dd B(uffer)S(size) to something data-mangling, like gz, xz, gpg, foo... bs is internally 64k and there’ll be no extra love for making bs bigger when just writing to a similar drive - if anything, there will be a delay in between reads and writes. – user2497 Feb 01 '18 at 06:03
  • @Sirex: `100M` is *way* too large, especially if writing to a pipe. Pipe buffers are much smaller than 100MB, so there's no point making a `write()` system call with that size; it will return early. `bs=1M` is ok. I often use `bs=128k`, which is half of L2 cache size on my CPU; it's a tradeoff between more system calls and reading memory that's still hot in cache from being written. – Peter Cordes Feb 01 '18 at 10:53
  • 3
    @Criggie: that's maybe because `dd` had already finished all the `write()` system calls, and `fsync` or `close` was blocked waiting for the writes to reach disk. With a slow USB stick, the default Linux I/O buffer thresholds for how large dirty write-buffers can be leads to qualitatively different behaviour than with big files on fast disks, because the buffers are as big as what you're copying and it still takes noticeable time. – Peter Cordes Feb 01 '18 at 11:00
  • 5
    Great answer. However, I do want to note that in OpenBSD the right kill signal is SIGINFO, not SIGUSR1. Using -USR1 in OpenBSD will just kill dd. So before you try this out in a new environment, on a transfer that you don't want to interrupt, you may want to familiarize yourself with how the environment acts (on a safer test). – TOOGAM Feb 02 '18 at 05:17
  • 1
    the signals advice for `dd` is really great info, especially for servers where you can't/don't want to install `pv` – mike Feb 03 '18 at 11:48
39

My go-to tool for this kind of stuff is progress:

This tool can be described as a Tiny, Dirty, Linux-and-OSX-Only C command that looks for coreutils basic commands (cp, mv, dd, tar, gzip/gunzip, cat, etc.) currently running on your system and displays the percentage of copied data. It can also show estimated time and throughput, and provides a "top-like" mode (monitoring).

[Screenshot: progress in action]

It simply scans /proc for interesting commands, and then looks at directories fd and fdinfo to find opened files and seek positions, and reports status for the largest file.
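
The same information can be read by hand. The following is a minimal sketch of the idea, not how progress itself is implemented. fdinfo_pos and percent_of are names invented for this example, and it assumes a Linux /proc where each fdinfo file contains a "pos:" line with the current offset in bytes.

```shell
# Hypothetical helper: pull the current offset out of fdinfo text on stdin.
# A /proc/PID/fdinfo/FD file contains a line like "pos:<TAB>4096".
fdinfo_pos() {
    awk '$1 == "pos:" { print $2; exit }'
}

# Hypothetical helper: whole-number percentage of $1 out of $2.
percent_of() {
    echo $(( $1 * 100 / $2 ))
}

# Example use against a running dd (not run here; needs root and Linux).
# Descriptor 0 is assumed to be dd's input; check with ls -l /proc/$pid/fd.
#   pid=$(pgrep -x dd | head -n 1)
#   pos=$(sudo cat "/proc/$pid/fdinfo/0" | fdinfo_pos)
#   percent_of "$pos" "$(sudo blockdev --getsize64 /dev/sda)"
```

With two dd processes running at once, as in the question, pick the PID explicitly instead of taking the first match.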

It's very light, and compatible with virtually any command.

I find it particularly useful because:

  • compared to pv in pipe or dcfldd, I don't have to remember to run a different command when I start the operation, I can monitor stuff after the fact;
  • compared to kill -USR1, it works on virtually any command, I don't have to always double-check the manpage to make sure I'm not accidentally killing the copy; also, it's nice that, when invoked without parameters, it shows the progress for any common "data transfer" command currently running, so I don't even have to look up the PID;
  • compared to pv -d, again I don't need to look up the PID.
Matteo Italia
27

Run dd, then, in a separate shell, invoke the following command:

pv -d $(pidof dd) # root may be required

This makes pv collect statistics on all the open file descriptors of the dd process. It shows the current position in both the input (read) and the output (write).

sleblanc
  • 3
    Works after the fact!? Amazing!! – jpaugh Jan 31 '18 at 21:16
  • 3
    That's very cool. It avoids the memory-bandwidth + context-switch overhead of actually piping all the data through 3 processes! @jpaugh: I guess it just looks at `/proc/$PID/fdinfo` for file positions, and at `/proc/$PID/fd` to see *which* files (and thus the sizes). So yes, very cool, and good idea for a feature, but I wouldn't call it "amazing" because there are Linux APIs that let it poll the file positions of another process. – Peter Cordes Feb 01 '18 at 10:56
  • @PeterCordes I didn't realize file-position was exposed by the kernel. (I've been spending my life carefully preparing `pv` pipelines in advance.) Of course, I assumed as much once I saw that this does work. – jpaugh Feb 01 '18 at 15:05
10

There's an alternative to dd: dcfldd.

dcfldd is an enhanced version of GNU dd with features useful for forensics and security.

Status output - dcfldd can update the user of its progress in terms of the amount of data transferred and how much longer operation will take.

dcfldd if=/dev/zero of=out bs=2G count=1 # test file
dcfldd if=out of=out2 sizeprobe=if
[80% of 2047Mb] 52736 blocks (1648Mb) written. 00:00:01 remaining.

http://dcfldd.sourceforge.net/
https://linux.die.net/man/1/dcfldd

Antonin Décimo
6

For a percentage you'd have to do some maths, but you can get the progress of a dd in human-readable form, even after it has already started, by running kill -USR1 $(pidof dd)

The running dd process will then print something like:

11117279 bytes (11 MB, 11 MiB) copied, 13.715 s, 811 kB/s
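
The maths can be scripted. As a rough sketch, the function below reads a dd status report and prints a percentage. dd_percent is a made-up name, and it assumes the report contains a line whose first field is the byte count and whose second field is the word "bytes", as in the output above.

```shell
# Hypothetical helper: read a dd status report on stdin and print how far
# along the copy is, given the total size in bytes as the first argument.
dd_percent() {
    awk -v total="$1" '$2 == "bytes" { printf "%.1f%%\n", $1 * 100 / total; exit }'
}

# Example use (not run here): paste the status line that dd printed and
# supply the source size, e.g. from `sudo blockdev --getsize64 /dev/sdb`.
#   echo '11117279 bytes (11 MB, 11 MiB) copied, 13.715 s, 811 kB/s' | dd_percent "$total"
```

Because dd prints the report on its own terminal, the line has to be copied over by hand, unless dd's stderr was redirected to a file when it was started.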

Sirex