
I have a failing hard drive. After first trying Testdisk, which was running very slowly, I am now using ddrescue to copy the disk. The original disk is a 2TB drive that had 4 NTFS partitions (which I had mistakenly formatted as ext4 from a live Ubuntu system).

I ran ddrescue from the live Ubuntu session with `sudo ddrescue -d -f -r3 /dev/sdb /dev/sdc sdc.log`.

I used a new 4TB hard drive as the target, and now I want to make a second copy. For the second copy I have another new 4TB drive, which is different from the first copy drive. I read that the target drive must be the same size or bigger. My intermediate drive (4TB) is bigger than the failing drive (2TB). What if my second target drive is few MBs smaller than the intermediate drive? Will ddrescue fail to write the data on the second drive? Or will it stop writing after the original 2TB data ends on the first copy drive?

What shall I do in either case?

Thanks in advance.

UPDATE 21 Feb 2023: Below is the text of the log file (I copied sdc.log to a USB stick and opened it separately):

# Mapfile. Created by GNU ddrescue version 1.26
# Command line: ddrescue -d -f -r3 /dev/sdb /dev/sdc sdc.log
# Start time:   2023-02-18 05:03:10
# Current time: 2023-02-21 10:47:14
# Copying non-tried blocks... Pass 1 (forwards)
# current_pos  current_status  current_pass
0x610FA80000     ?               1
#      pos        size  status
0x00000000  0x47AC630000  +
0x47AC630000  0x00010000  *
0x47AC640000  0x01320000  ?
0x47AD960000  0x479140000  +
0x4C26AA0000  0x00010000  *
0x4C26AB0000  0x01320000  ?
0x4C27DD0000  0x2FA860000  +
0x4F22630000  0x00010000  *
0x4F22640000  0x01320000  ?
0x4F23960000  0x1F7230000  +
0x511AB90000  0x00010000  *
0x511ABA0000  0x01320000  ?
0x511BEC0000  0x32780000  +
0x514E640000  0x00010000  *
0x514E650000  0x01320000  ?
0x514F970000  0xC5D00000  +
0x5215670000  0x00010000  *
----------------------------------------------------
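For anyone reading a mapfile like the one above: each data line is a start position, a size, and a status character (`+` for rescued, `?` for not yet tried, and so on). A minimal sketch for adding up the sizes per status is below. The function name `mapfile_totals` is made up for this example, and it assumes a shell with 64-bit arithmetic.

```shell
# Sum block sizes in a ddrescue mapfile by status.
# Reads the mapfile on stdin and prints three "name bytes" lines.
mapfile_totals() {
    rescued=0
    nontried=0
    other=0
    header_seen=0
    while read -r pos size status rest; do
        case "$pos" in
            ''|'#'*) continue ;;           # skip blank lines and comments
        esac
        if [ "$header_seen" -eq 0 ]; then  # first data line holds current_pos etc.
            header_seen=1
            continue
        fi
        case "$status" in
            '+') rescued=$((rescued + $size)) ;;    # sizes are hex, e.g. 0x00010000
            '?') nontried=$((nontried + $size)) ;;
            *)   other=$((other + $size)) ;;        # '*', '/', '-' and anything else
        esac
    done
    echo "rescued $rescued"
    echo "non-tried $nontried"
    echo "other $other"
}

# Example call (assumes the mapfile is in the current directory):
# mapfile_totals < sdc.log
```

The `ddrescuelog` tool that ships with GNU ddrescue has a `-t` (`--show-status`) option that prints a similar summary, so the sketch is mainly useful for seeing how the file is laid out.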

Here is the display in the terminal: [screenshot]

Secondly, here is the report from smartmontools (run with `sudo smartctl -a /dev/sdb >myreport`):

smartctl 7.3 2022-02-28 r5338 [x86_64-linux-5.19.0-21-generic] (local build)
Copyright (C) 2002-22, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Model Family:     Seagate Barracuda 7200.14 (AF)
Device Model:     ST2000DM001-1ER164
Serial Number:    W4Z3P9PN
LU WWN Device Id: 5 000c50 09b8366dd
Firmware Version: CC26
User Capacity:    2,000,398,934,016 bytes [2.00 TB]
Sector Sizes:     512 bytes logical, 4096 bytes physical
Rotation Rate:    7200 rpm
Form Factor:      3.5 inches
Device is:        In smartctl database 7.3/5319
ATA Version is:   ACS-2, ACS-3 T13/2161-D revision 3b
SATA Version is:  SATA 3.1, 6.0 Gb/s (current: 6.0 Gb/s)
Local Time is:    Tue Feb 21 11:10:57 2023 UTC
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED
See vendor-specific Attribute list for marginal Attributes.

General SMART Values:
Offline data collection status:  (0x82) Offline data collection activity
                    was completed without error.
                    Auto Offline Data Collection: Enabled.
Self-test execution status:      (   0) The previous self-test routine completed
                    without error or no self-test has ever 
                    been run.
Total time to complete Offline 
data collection:        (   80) seconds.
Offline data collection
capabilities:            (0x7b) SMART execute Offline immediate.
                    Auto Offline data collection on/off support.
                    Suspend Offline collection upon new
                    command.
                    Offline surface scan supported.
                    Self-test supported.
                    Conveyance Self-test supported.
                    Selective Self-test supported.
SMART capabilities:            (0x0003) Saves SMART data before entering
                    power-saving mode.
                    Supports SMART auto save timer.
Error logging capability:        (0x01) Error logging supported.
                    General Purpose Logging supported.
Short self-test routine 
recommended polling time:    (   1) minutes.
Extended self-test routine
recommended polling time:    ( 209) minutes.
Conveyance self-test routine
recommended polling time:    (   2) minutes.
SCT capabilities:          (0x1085) SCT Status supported.

SMART Attributes Data Structure revision number: 10
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x000f   110   080   006    Pre-fail  Always       -       215372854
  3 Spin_Up_Time            0x0003   096   095   000    Pre-fail  Always       -       0
  4 Start_Stop_Count        0x0032   096   096   020    Old_age   Always       -       4686
  5 Reallocated_Sector_Ct   0x0033   079   079   010    Pre-fail  Always       -       26840
  7 Seek_Error_Rate         0x000f   082   060   030    Pre-fail  Always       -       168010836
  9 Power_On_Hours          0x0032   064   064   000    Old_age   Always       -       32234
 10 Spin_Retry_Count        0x0013   100   100   097    Pre-fail  Always       -       0
 12 Power_Cycle_Count       0x0032   096   096   020    Old_age   Always       -       4690
183 Runtime_Bad_Block       0x0032   091   091   000    Old_age   Always       -       9
184 End-to-End_Error        0x0032   100   100   099    Old_age   Always       -       0
187 Reported_Uncorrect      0x0032   001   001   000    Old_age   Always       -       3934
188 Command_Timeout         0x0032   100   052   000    Old_age   Always       -       112 301 483
189 High_Fly_Writes         0x003a   098   098   000    Old_age   Always       -       2
190 Airflow_Temperature_Cel 0x0022   062   034   045    Old_age   Always   In_the_past 38 (Min/Max 26/41 #1798)
191 G-Sense_Error_Rate      0x0032   100   100   000    Old_age   Always       -       0
192 Power-Off_Retract_Count 0x0032   100   100   000    Old_age   Always       -       191
193 Load_Cycle_Count        0x0032   001   001   000    Old_age   Always       -       224369
194 Temperature_Celsius     0x0022   038   066   000    Old_age   Always       -       38 (0 8 0 0 0)
197 Current_Pending_Sector  0x0012   001   001   000    Old_age   Always       -       37904
198 Offline_Uncorrectable   0x0010   001   001   000    Old_age   Offline      -       37904
199 UDMA_CRC_Error_Count    0x003e   200   199   000    Old_age   Always       -       68
240 Head_Flying_Hours       0x0000   100   253   000    Old_age   Offline      -       19597h+41m+10.668s
241 Total_LBAs_Written      0x0000   100   253   000    Old_age   Offline      -       57999941039
242 Total_LBAs_Read         0x0000   100   253   000    Old_age   Offline      -       1327164476473

SMART Error Log Version: 1
ATA Error Count: 3937 (device log contains only the most recent five errors)
    CR = Command Register [HEX]
    FR = Features Register [HEX]
    SC = Sector Count Register [HEX]
    SN = Sector Number Register [HEX]
    CL = Cylinder Low Register [HEX]
    CH = Cylinder High Register [HEX]
    DH = Device/Head Register [HEX]
    DC = Device Command Register [HEX]
    ER = Error register [HEX]
    ST = Status register [HEX]
Powered_Up_Time is measured from power on, and printed as
DDd+hh:mm:SS.sss where DD=days, hh=hours, mm=minutes,
SS=sec, and sss=millisec. It "wraps" after 49.710 days.

Error 3937 occurred at disk power-on lifetime: 32223 hours (1342 days + 15 hours)
  When the command that caused the error occurred, the device was active or idle.

  After command completion occurred, registers were:
  ER ST SC SN CL CH DH
  -- -- -- -- -- -- --
  40 53 00 ff ff ff 0f  Error: UNC at LBA = 0x0fffffff = 268435455

  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
  -- -- -- -- -- -- -- --  ----------------  --------------------
  60 00 80 ff ff ff 4f 00   2d+19:53:57.102  READ FPDMA QUEUED
  60 00 80 ff ff ff 4f 00   2d+19:53:57.101  READ FPDMA QUEUED
  60 00 80 ff ff ff 4f 00   2d+19:53:57.101  READ FPDMA QUEUED
  60 00 80 ff ff ff 4f 00   2d+19:53:57.101  READ FPDMA QUEUED
  60 00 80 ff ff ff 4f 00   2d+19:53:57.100  READ FPDMA QUEUED

Error 3936 occurred at disk power-on lifetime: 32222 hours (1342 days + 14 hours)
  When the command that caused the error occurred, the device was active or idle.

  After command completion occurred, registers were:
  ER ST SC SN CL CH DH
  -- -- -- -- -- -- --
  40 53 00 ff ff ff 0f  Error: UNC at LBA = 0x0fffffff = 268435455

  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
  -- -- -- -- -- -- -- --  ----------------  --------------------
  60 00 80 ff ff ff 4f 00   2d+18:50:34.589  READ FPDMA QUEUED
  60 00 80 ff ff ff 4f 00   2d+18:50:34.588  READ FPDMA QUEUED
  60 00 80 ff ff ff 4f 00   2d+18:50:34.588  READ FPDMA QUEUED
  60 00 80 ff ff ff 4f 00   2d+18:50:34.587  READ FPDMA QUEUED
  60 00 80 ff ff ff 4f 00   2d+18:50:34.587  READ FPDMA QUEUED

Error 3935 occurred at disk power-on lifetime: 32221 hours (1342 days + 13 hours)
  When the command that caused the error occurred, the device was active or idle.

  After command completion occurred, registers were:
  ER ST SC SN CL CH DH
  -- -- -- -- -- -- --
  40 53 00 ff ff ff 0f  Error: UNC at LBA = 0x0fffffff = 268435455

  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
  -- -- -- -- -- -- -- --  ----------------  --------------------
  60 00 80 ff ff ff 4f 00   2d+18:04:18.861  READ FPDMA QUEUED
  60 00 80 ff ff ff 4f 00   2d+18:04:18.861  READ FPDMA QUEUED
  60 00 80 ff ff ff 4f 00   2d+18:04:18.861  READ FPDMA QUEUED
  60 00 80 ff ff ff 4f 00   2d+18:04:18.860  READ FPDMA QUEUED
  60 00 80 ff ff ff 4f 00   2d+18:04:18.860  READ FPDMA QUEUED

Error 3934 occurred at disk power-on lifetime: 32221 hours (1342 days + 13 hours)
  When the command that caused the error occurred, the device was active or idle.

  After command completion occurred, registers were:
  ER ST SC SN CL CH DH
  -- -- -- -- -- -- --
  40 53 00 ff ff ff 0f  Error: UNC at LBA = 0x0fffffff = 268435455

  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
  -- -- -- -- -- -- -- --  ----------------  --------------------
  60 00 80 ff ff ff 4f 00   2d+17:50:52.906  READ FPDMA QUEUED
  60 00 80 ff ff ff 4f 00   2d+17:50:52.906  READ FPDMA QUEUED
  60 00 80 ff ff ff 4f 00   2d+17:50:52.905  READ FPDMA QUEUED
  60 00 80 ff ff ff 4f 00   2d+17:50:52.905  READ FPDMA QUEUED
  60 00 80 ff ff ff 4f 00   2d+17:50:52.905  READ FPDMA QUEUED

Error 3933 occurred at disk power-on lifetime: 32219 hours (1342 days + 11 hours)
  When the command that caused the error occurred, the device was active or idle.

  After command completion occurred, registers were:
  ER ST SC SN CL CH DH
  -- -- -- -- -- -- --
  40 53 00 ff ff ff 0f  Error: UNC at LBA = 0x0fffffff = 268435455

  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
  -- -- -- -- -- -- -- --  ----------------  --------------------
  60 00 80 ff ff ff 4f 00   2d+15:56:40.578  READ FPDMA QUEUED
  60 00 80 ff ff ff 4f 00   2d+15:56:40.016  READ FPDMA QUEUED
  60 00 80 ff ff ff 4f 00   2d+15:56:39.728  READ FPDMA QUEUED
  60 00 80 ff ff ff 4f 00   2d+15:56:36.988  READ FPDMA QUEUED
  60 00 80 ff ff ff 4f 00   2d+15:56:36.988  READ FPDMA QUEUED

SMART Self-test log structure revision number 1
No self-tests have been logged.  [To run self-tests, use: smartctl -t]

SMART Selective self-test log data structure revision number 1
 SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
    1        0        0  Not_testing
    2        0        0  Not_testing
    3        0        0  Not_testing
    4        0        0  Not_testing
    5        0        0  Not_testing
Selective self-test flags (0x0):
  After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.

-----------------------------------------------
Kamil Maciorowski
Irfan
  • If you already made a copy, why not just... copy the copy? – Journeyman Geek Feb 20 '23 at 13:29
  • Yes, that's what I mean to do. Copy the copy. The first copy attempt with ddrescue is still running, it has been 2d 8 h and it has done 300 gb (15.53%) so far. When it finishes, I hope to make a second copy from the 1st copy. – Irfan Feb 20 '23 at 13:36
  • Ah! I assumed you'd set up ddrescue to do an *image* not a disk to disk copy. Would a file level copy suffice? – Journeyman Geek Feb 20 '23 at 13:47
  • ddrescue to an image file is much more useful. – harrymc Feb 20 '23 at 13:55
  • I have no idea at the moment, and doing this for the first time. My files (in original NTFS partition) are buried hidden under the ext4 partition (no data was written over, just partitioned and formatted once). One of the previous NTFS partitions (probably second partition) carries the photos which we want to recover. – Irfan Feb 20 '23 at 13:56
  • @Harrymc, disk manufacturers like your idea because it requires you to buy a disk that is bigger than the defective source. – r2d3 Feb 20 '23 at 15:13
  • "_it has been 2d 8 h and it has done 300 gb_" ... that averages about 1.5 MB/sec, which is incredibly low. Please make sure you're keeping an eye on `dmesg` and the disk's SMART data. It might also be worth looking into `--skip-size` and `--reverse` to see if skipping or approaching the problem area from the other side helps. – Attie Feb 20 '23 at 17:26
  • @Attie Thanks, and yes, the rate goes up and down but on average it is still slow. The output on screen in terminal does not indicate any bad areas (the number is still zero) but as I understand ddrescue skips bad sectors, so maybe that is why. Earlier, I have used TestDisk on my first try, and the process ran for over a week and covered only 30% so I stopped it. Compared to that, ddrescue looks faster to me. I will post the SMART data etc when I get home. – Irfan Feb 21 '23 at 06:49
  • @r2d3 well yes and no. You can still use the 'rest' of the disk with an image, and once recovery is done compress it to save space for copies, – Journeyman Geek Feb 21 '23 at 11:32
  • @JourneymanGeek Please tell me how can I rearrange all that text, just as you did. – Irfan Feb 21 '23 at 12:41
  • There's a few ways to - but I used 'code fences' - basically ``` before and after each code block. – Journeyman Geek Feb 21 '23 at 12:54
  • Please have a look at the logfile update. I am anxious if it is going ok. – Irfan Feb 23 '23 at 11:36
  • Please do not change the scope of the question after you got answers. Your original questions have been answered. "How is it going so far?" is a new distinct question. The site is not an interactive support service where threads may evolve. – Kamil Maciorowski Feb 23 '23 at 11:39
  • ok, Sorry, I am new to these forums. – Irfan Feb 23 '23 at 11:44
  • No harm done. If you need help beyond the original scope, you can ask a new question; but "how is it going so far?" is not a good question. In general try to make your questions useful for future users with similar problems. The question above is fine in this matter; "how is it going?" is not. A specific concern like "is it normal that `ddrescue` takes so much time?" may be, but note that [similar questions already exist](https://superuser.com/q/1713757/432690), check them first. – Kamil Maciorowski Feb 23 '23 at 12:07

2 Answers


What if my second target drive is few MBs smaller than the intermediate drive? Will ddrescue fail to write the data on the second drive?

It will, but only when it tries to write beyond the size of the second drive. Normally the first pass is done in the forward direction, so if there are no read errors then at the moment of the write error the whole second drive will have been rewritten.

Or will it stop writing after the original 2TB data ends on the first copy drive?

It will not. As far as ddrescue is concerned, on the first copy drive there will be no indication where the copy ends. You can treat each drive as a linear sequence of bytes; each sequence has its own length, which is the size of the respective device. Copying (a part of) one sequence to (a part of) another sequence does not change the length of the latter. Drives are not like regular files in this respect: you cannot truncate them easily.
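To make the "length of each sequence" point concrete, here is a small sketch that compares sizes. The 2 TB figure is the "User Capacity" from the smartctl report in the question; the two 4 TB figures are hypothetical values I picked for illustration, and `fits` is just a helper name for this example. On a real system you would read each size with `sudo blockdev --getsize64 /dev/sdX`.

```shell
# Return success if a copy of WANT bytes fits on a target of DST bytes.
fits() {  # usage: fits WANT_BYTES DST_BYTES
    [ "$1" -le "$2" ]
}

ORIG_BYTES=2000398934016        # original 2 TB drive, from the smartctl report
FIRST_COPY_BYTES=4000787030016  # hypothetical size of the intermediate 4 TB drive
SECOND_BYTES=4000752599040      # hypothetical size of a slightly smaller 4 TB drive

if fits "$ORIG_BYTES" "$SECOND_BYTES"; then
    echo "the original 2 TB of data fits on the second drive"
fi
if ! fits "$FIRST_COPY_BYTES" "$SECOND_BYTES"; then
    echo "a whole-device copy would run past the end of the second drive"
fi
```

With these example numbers both messages are printed: the data you care about fits, but copying the whole intermediate device does not.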

What shall I do in either case?

Something else in the first place. Copy only the part you need. E.g. you can use -s when copying with ddrescue from the intermediate drive to the second drive:

-s bytes
--size=bytes
Maximum size of the rescue domain in bytes. It limits the amount of input data to be copied. […] If ddrescue can't determine the size of the input file, you may need to specify it with this option. […]

(source)

You should specify the exact size of the original drive or a larger number. If there is any doubt, use a larger number. If you use a larger number then ddrescue will try to copy some garbage beyond the 2TB data. The point is you want this garbage to be reasonably small. Copying without -s will get almost all the garbage from the intermediate drive, totally in vain.
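As a sketch of what such a command could look like: the byte count is the "User Capacity" from the smartctl report in the question, and `/dev/sdc` is the intermediate drive as in the question. `/dev/sdX` and `second.log` are placeholder names I made up, so check the real device name (e.g. with `lsblk`) before running anything. The snippet only prints the command; it does not run it.

```shell
ORIG_BYTES=2000398934016  # size of the original drive in bytes
                          # (also obtainable with: sudo blockdev --getsize64 /dev/sdb)
SRC=/dev/sdc              # intermediate drive holding the first copy
DST=/dev/sdX              # placeholder for the second target drive
MAPFILE=second.log        # placeholder mapfile name for this second copy

# -f is needed because the output is a block device; -s limits the rescue domain.
CMD="ddrescue -f -s $ORIG_BYTES $SRC $DST $MAPFILE"
echo "sudo $CMD"
```

The mapfile for this run is separate from `sdc.log`, because it describes reads from the intermediate drive, not from the failing one.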

Even if you provide the exact number, then after ddrescue finishes, the second drive will contain its own garbage beyond the 2TB data, because also in it there will be no indication where the copy ends.

Hopefully there will be no read errors at this stage. If so then you should treat the mapfile from the first stage (i.e. your sdc.log) as relevant for both copies.

If the original drive uses GPT then please see this answer to learn what to do to fix GPT on a copy, in case you choose to do this.


For future reference: in similar cases consider creating a filesystem and using ddrescue to write to a regular file inside that filesystem. If you did this for the intermediate drive and for the second drive then copying the copy would mean copying the regular file from one filesystem to the other; you could do this with cp, without worrying about sizes at all.

It's not too late to create a filesystem on the second drive and to copy from the intermediate drive to a regular file there. You will still need to use -s though, because on the intermediate drive there will be no indication where the copy ends.
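A rough sketch of that variant follows. It assumes the new filesystem on the second drive is already mounted; the mount point `/mnt/second` and the file names are hypothetical, and `free_bytes` / `image_fits` are helper names invented for this example. The free-space check matters because the image file will take up to the full size of the original drive.

```shell
ORIG_BYTES=2000398934016   # size of the original drive, from the smartctl report
MNT=/mnt/second            # hypothetical mount point of the filesystem on the second drive

# Free space under a directory, in bytes (df -Pk reports 1024-byte units).
free_bytes() {
    kib=$(df -Pk "$1" | awk 'NR == 2 { print $4 }')
    echo $((kib * 1024))
}

# Succeed if DIR has at least NEED_BYTES free.
image_fits() {  # usage: image_fits DIR NEED_BYTES
    [ "$(free_bytes "$1")" -ge "$2" ]
}

# No -f here: the output is a regular file, not a block device.
echo "sudo ddrescue -s $ORIG_BYTES /dev/sdc $MNT/sdb.img $MNT/sdb.img.log"
```

Before running the printed command you could call `image_fits "$MNT" "$ORIG_BYTES"` and stop if it fails.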

Personally I prefer ddrescue-ing to a regular file in a filesystem that supports CoW (e.g. Btrfs). Then I can make a non-CoW copy (cp --reflink=never) for redundancy (to another filesystem if I want or need) and any number of CoW copies within the filesystem (cp --reflink=always). Among these CoW copies I treat one as immutable (chattr +i) and work with others, possibly with tools that modify data. This way, if anything goes wrong with modifications, I can always create a new CoW copy of the immutable one without straining the disk(s) and virtually immediately.
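The workflow above could be sketched like this. File names, the backup directory, and the helper names `run` and `cow_copies` are all hypothetical. By default the script only prints the commands (dry run); set `DRY_RUN=0` to execute them, keeping in mind that `cp --reflink=always` fails on filesystems without CoW support and `chattr +i` needs root.

```shell
DRY_RUN=${DRY_RUN:-1}

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = 1 ]; then echo "$*"; else "$@"; fi
}

# usage: cow_copies IMAGE BACKUP_DIR
cow_copies() {
    img=$1
    backup_dir=$2
    run cp --reflink=never "$img" "$backup_dir/"          # full copy for redundancy
    run cp --reflink=always "$img" "$img.master"          # CoW copy kept pristine
    run chattr +i "$img.master"                           # make the pristine copy immutable
    run cp --reflink=always "$img.master" "$img.work"     # CoW copy to experiment on
}

cow_copies disk.img /mnt/backup
```

If a tool damages `disk.img.work`, another `cp --reflink=always` from the immutable copy gives a fresh working copy almost instantly.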

(Side note: Btrfs supports compression, and a few times I have managed to store an image of a disk as a regular file on a smaller disk, and to work with it.)

The downside of regular files is you need some knowledge and tools to get to partitions stored within. I mean if your copy is e.g. on /dev/sdz then the OS will create sdz1, sdz2 etc. automatically when the disk is connected or upon partprobe (unless logical sector sizes don't match between the original and the copy, see this question and my answer there to see what the problem is); but if your copy is in a regular_file then you won't get regular_file1, regular_file2 etc. as partitions. Useful tools:

  • mount -o offset=… …
  • losetup -o … --sizelimit … …
  • kpartx …
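For the offset-based tools, the number you need is the partition's start sector multiplied by the logical sector size (512 bytes for the drive in the question, according to its smartctl report). A minimal sketch, where the image name, the start sector, the mount point, and the helper name `part_offset` are hypothetical:

```shell
# Byte offset of a partition inside an image: start sector times logical sector size.
part_offset() {  # usage: part_offset START_SECTOR [SECTOR_SIZE]
    echo $(( $1 * ${2:-512} ))
}

IMG=disk.img   # hypothetical image file name
START=2048     # hypothetical start sector, as listed by: fdisk -l "$IMG"

# Print (not run) a read-only loop mount of that partition.
echo "sudo mount -o ro,loop,offset=$(part_offset "$START") $IMG /mnt/part"

# Alternative that lets the kernel scan the partition table of the image:
# sudo losetup --find --show --read-only --partscan "$IMG"
```

With `--partscan`, losetup prints the loop device it picked and the partitions then appear as that device name followed by `p1`, `p2`, and so on.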

Another downside is you cannot boot from a regular file. If you plan to try to boot from a copy then copying directly to a block device is way more reasonable.

Kamil Maciorowski
  • Thanks a lot for elaborating. I understood the first part to some extent, but the second part just went above my head totally. I do need to learn a lot, but it is fun doing it myself. Apparently, I would have another 10 to 12 days (while it is still busy with making the first copy) to search through and use proper syntax while making the second drive from the first copy. – Irfan Feb 21 '23 at 06:46

What if my second target drive is few MBs smaller than the intermediate drive?

It doesn't matter in your case as the defective source is 2TB and your sector by sector copy fits completely on your first 4 TB disk. If the second 4 TB disk is a little bit (a few MBs) smaller than your first one, your defective source will nevertheless be completely contained on your second duplicate.

By the way, the software is called Testdisk, not "test disk". When writing important keywords incorrectly you diminish the search abilities in this forum. I have corrected your posting.

Or will it stop writing after the original 2TB data ends on the first copy drive?

No. ddrescue does not know how you created your first copy. It does not try to interpret the data on the first copy. ddrescue will copy all 4 TB.

r2d3