ddrescue: proper command to make copy of a copy

Question

I am copying my failing hard drive with ddrescue. After almost 15 days the process is at around 84% and is expected to finish Pass 1 in the next 4 or 5 days, hopefully.

History (with reference to another question I posted previously): the source drive is a 2 TB Seagate Barracuda, used as a second drive in a Windows 10 computer. It had four partitions: the first three 500 GB each and the last one around 400 GB (not quite 500 GB was left for the fourth partition).

The HDD was then plugged into a machine running Ubuntu Linux. The drive and its partitions appeared in Ubuntu, but none of the partitions was actually accessible and the system became very slow to respond. I mistakenly formatted the drive with ext4 while trying to install Ubuntu. The install finished, but the system never booted from this disk.

Then we tried to recover with TestDisk, but after 10 days we stopped. (TestDisk had identified the first 500 GB NTFS partition after a deep scan.)

Finally, I attempted to copy the disk to a new hard disk using ddrescue.

The plan is to make another copy from this intermediate copy. The intermediate drive is a WD 4TB drive.

For the second copy, I have purchased another 4 TB drive, a Seagate. The plan is to try to recover the NTFS partitions using TestDisk later. The files of interest are probably in the second or third 500 GB partition.

For the first copy, I used this command:

ddrescue -d -f -r3 /dev/sdb /dev/sdc sdc.log

The capacity of the original failing drive is 2,000,398,934,016 bytes [2.00 TB], according to SMARTTools.

I understand adding the -s parameter and make the target size slightly higher than the original size. So I think adding around 10 MB on top of the original size in bytes is enough, am I right? 10 MB = 10,000,000 bytes. Shall I increase it a bit more, say to 100 MB?

Secondly, I think we don't need to use the -r3 parameter as well. So, the command would be something like this:

ddrescue -d -f -s 2000408934016 /dev/sdb /dev/sdc sdc.log

What else should I add to the command (or remove) to start ddrescue for the second copy?

First of all you should copy `sdc` (not `sdb`) to `sdd` or something *else*; and you should not use the old mapfile. — Kamil Maciorowski, Mar 05 '23 at 04:44
Thanks, and yes. The drive names will change: I will remove the first drive from the system and put the intermediate drive and the third drive in place, so the drive names will be in order accordingly. The same goes for the mapfile; I should rename it accordingly. — Irfan, Mar 05 '23 at 04:59

Kamil Maciorowski · Answer 1 · 2023-03-16T09:21:46.900

First of all you should copy sdc (not sdb) to sdd or something else; and you should not use the old mapfile sdc.log as the mapfile again.

Even if you're going to reconnect the drives, so their names change, in this answer I will strictly use:
- sdb for the original drive only,
- sdc for the intermediate drive (first copy) only,
- sdx for the new drive (second copy) only,
- sdc.log for the old mapfile only.

"I understand adding the -s parameter and make the target size slightly higher than the original size. […] Shall I increase it a bit more […]?"

There is no need. In my answer to your first question I wrote "if there is any doubt, use a larger number". If you know the exact size of sdb (and it seems you do) then you can use the exact number.
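
As a sanity check on that number, here is a minimal sketch of the arithmetic. The size is the value quoted in the question; the 512-byte logical sector size is an assumption (it is common, but check your drive). On the real system, blockdev --getsize64 /dev/sdb should report the size in bytes directly.

```shell
# Size of sdb in bytes, as quoted in the question.
size=2000398934016
# Assumed logical sector size; verify for your drive.
sector=512

# A whole-disk size should be an exact multiple of the sector size.
echo "sectors: $(( size / sector ))"
echo "remainder: $(( size % sector ))"
```

A remainder of 0 suggests the number is a plausible whole-disk size, so it can be passed to -s as is.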

"I think we don't need to use the -r3 parameter as well."

I agree. The premise is that sdc is healthy and every sector can be read on the first try. Just in case, you may add -r3 anyway; if sdc is healthy then the option won't matter at all. But even if you don't add the option and sdc turns out to be somewhat faulty, it will be possible to repeat the command with -r… and ddrescue will try to read bad sectors again. (And you can repeat again and again, at will.)

-d (direct access) is also not needed. While reading from a healthy disk you may use any enhancement the OS gives you, and this should only make things better.

The command will be like:

ddrescue -f -s 2000398934016 /dev/sdc /dev/sdx sdx.log

Frankly you don't even need ddrescue with its advanced engine designed to deal with read errors. If sdc is healthy then the following should work as well:

head -c 2000398934016 /dev/sdc >/dev/sdx

(Note you need an elevated shell for the redirection to work, sudo head … is not enough; sudo head … | sudo tee /dev/sdx >/dev/null should work, see this answer).
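
The shape of that pipeline can be tried safely on ordinary files first. This is a minimal sketch, assuming temporary files as stand-ins for /dev/sdc and /dev/sdx and a tiny byte count in place of 2000398934016; none of these names come from your system.

```shell
# Stand-in files instead of /dev/sdc and /dev/sdx (assumption, for a safe demo).
src=$(mktemp)
dst=$(mktemp)
n=1000   # stand-in for 2000398934016

# 1500 bytes of source data; only the first $n bytes should be copied.
head -c 1500 /dev/urandom > "$src"

# Same shape as: sudo head -c … /dev/sdc | sudo tee /dev/sdx >/dev/null
head -c "$n" "$src" | tee "$dst" >/dev/null

# Check the size of the copy and compare it with the first $n bytes of the source.
copied=$(wc -c < "$dst" | tr -d ' ')
if head -c "$n" "$src" | cmp -s - "$dst"; then same=yes; else same=no; fi
echo "copied $copied bytes, identical: $same"

rm -f "$src" "$dst"
```

With real block devices the same comparison (head -c … piped to cmp) can serve as a verification pass after the copy, at the cost of reading both drives once more.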

head will not create sdx.log, but if sdc is healthy then you don't need this file anyway (you should treat sdc.log as relevant for both copies). On the other hand even if you consider sdc healthy now, you can never really be sure it manages to stay healthy during the operation, so using ddrescue is a good idea. ddrescue will be an advantage if sdc develops read errors. We hope it won't, it probably won't.

There is one method you should consider. I did not include it in the already linked answer, now it's time to introduce it to you. You can use sdc.log as a domain mapfile. From the manual of GNU ddrescue:

-m file
--domain-mapfile=file
Restrict the rescue domain to the blocks marked as finished in the mapfile file. […]

The command will be like:

ddrescue -f -m sdc.log /dev/sdc /dev/sdx sdx.log

Advantages:

- You don't need to know the size of the original drive; there is no need for -s. The point is that blocks marked as finished in sdc.log obviously cannot reach beyond the size of sdb.
- Thanks to reading only what your first ddrescue (from sdb to sdc) copied successfully, you won't copy irrelevant data in vain. This will improve performance if non-finished (in practice: erroneous) fragments indicated in sdc.log are large; our ddrescue -m … will simply skip them.
- If there are read errors (i.e. sdc not really healthy), sdx.log will reflect them. Ultimately it will also reflect non-finished fragments indicated in sdc.log. In any case sdx.log will describe the actual state of the copy on sdx. (Side note: anticipating read errors, you may add -r… at the first try; or later, because the remarks about -r (above in this answer) apply here as well).

In case of read errors from sdc, the fact that sdx.log describes the actual state of the copy on sdx will allow you to (try to) re-read the missing parts from sdb. Just replace sdc with sdb in the command (keep -m … and sdx.log, add -r… at will).

Disadvantage:

- If non-finished (in practice: erroneous) fragments indicated in sdc.log are small but in a great number, skipping them while reading may actually worsen the performance. As sdc is most likely healthy, reading it continuously from the beginning to the point indicated by -s 2000398934016 may be a better idea.

Without seeing the complete sdc.log, it's hard to guess which variant (-m or -s) will perform better for you. In some cases knowing the mapfile is not really enough to tell for sure, there will be some guessing (hopefully educated guessing) anyway.
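
One way to make that guess a bit more educated is to count the non-finished blocks in the mapfile and add up their sizes. This is a minimal sketch, assuming the usual mapfile layout: comment lines starting with #, one status line, then lines of position, size and status, with + meaning finished. The sample mapfile below is a hypothetical toy made up for illustration, not output from a real run; point the mapfile variable at your sdc.log instead.

```shell
# Hypothetical toy mapfile, made up for illustration; NOT the real sdc.log.
mapfile=$(mktemp)
cat > "$mapfile" <<'EOF'
# current_pos  current_status  current_pass
0x00030000     +               1
#      pos        size  status
0x00000000  0x00010000  +
0x00010000  0x00000200  -
0x00010200  0x0000FE00  +
0x00020000  0x00001000  -
0x00021000  0x0000F000  +
EOF

bad_count=0      # number of non-finished blocks
bad_bytes=0      # total size of non-finished blocks
finished_end=0   # end offset of the last finished block
seen_status=0    # becomes 1 once the status line has been skipped

while read -r pos size status; do
  # Skip comments and empty lines.
  case $pos in '#'*|'') continue ;; esac
  # The first non-comment line is the status line, not a data block.
  if [ "$seen_status" -eq 0 ]; then seen_status=1; continue; fi
  if [ "$status" = "+" ]; then
    finished_end=$(( $pos + $size ))
  else
    bad_count=$(( bad_count + 1 ))
    bad_bytes=$(( bad_bytes + $size ))
  fi
done < "$mapfile"

echo "non-finished blocks: $bad_count ($bad_bytes bytes)"
echo "last finished block ends at byte: $finished_end"
rm -f "$mapfile"
```

Many small non-finished blocks point towards -s and a continuous read; a few large ones point towards -m. The end of the last finished block also shows how far the rescue domain would reach with -m.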

If I were you and if I reasonably trusted sdc, I would probably go with -s and read from sdc continuously. Assuming this worked without errors, in the end I would consider sdc.log as relevant for both copies.

Thanks a lot. The first-copy drive sdc is a brand new drive, purchased just a month ago. At the moment, 354 read errors are showing while ddrescue is still going through Pass 1. — Irfan, Mar 05 '23 at 11:46
