
In the coming months, I'm going to need to zero out a lot of disks. After wiping each drive, I need a quick way of making sure that the drive has been completely filled with zeroes.

I could open each one in a hex editor, but that only shows that the parts I happen to look at have been zeroed. The bigger the drive, the less this tells me, because it can't confirm that no non-zero bytes exist anywhere on it.

I decided to run some benchmarks to test a few tools that I came across. I timed each tool in a series of 3 separate runs verifying the wipe of the same 1TB disk, with each run executing overnight at the same system load. To reduce caching effects, each run executed the tools in a randomised order, with a sleep of at least 500 seconds between tools.

Below is each tool's average run across the 3 tests, sorted from slowest to fastest.

From myself:

time hexdump /dev/sda

0000000 0000 0000 0000 0000 0000 0000 0000 0000
*
e8e0db6000

real    284m35.474s
user    223m4.261s
sys     2m49.729s

From Gordon Davisson:

time od /dev/sda

0000000 000000 000000 000000 000000 000000 000000 000000 000000
*
16434066660000

real    148m34.707s
user    77m10.749s
sys     2m54.611s

From Neal:

time cmp /dev/zero /dev/sda 

cmp: EOF on /dev/sda

real    137m55.505s
user    8m9.031s
sys     3m53.127s

From Beardy:

time badblocks -sv -t 0x00 /dev/sda

Checking blocks 0 to 976762583
Checking for bad blocks in read-only mode
Testing with pattern 0x00: done
Pass completed, 0 bad blocks found. (0/0/0 errors)

real    137m50.213s
user    5m19.287s
sys     4m49.803s

From Hennes:

time dd if=/dev/sda status=progress bs=4M | tr --squeeze-repeats "\000" "D"

1000156954624 bytes (1.0 TB, 931 GiB) copied, 8269.01 s, 121 MB/s
238467+1 records in
238467+1 records out
1000204886016 bytes (1.0 TB, 932 GiB) copied, 8269.65 s, 121 MB/s
D
real    137m49.868s
user    27m5.841s
sys     28m3.609s

From Bob [1]:

time iszero < /dev/sda

1000204886016 bytes processed
0 nonzero characters encountered.

real    137m49.400s
user    15m9.189s
sys     3m28.042s

Even the fastest of the tools tested seem to top out at around the 137 minute mark, which is 2 hours and 17 minutes, whereas a full wipe of the disk averages just 2 hours and 30 minutes.

This is what prompted me to ask this question: it seems like it should be possible for such a tool to take no more than half the time needed to wipe a drive, given that the disk only needs to be read from and not written to.
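
To put that expectation into numbers, here is a small Python sketch of the arithmetic. It uses only the byte count and elapsed time printed by the dd run above; the function names are my own, and the 75-minute target is simply half of the 150-minute wipe. It says nothing about whether the drive can actually sustain the higher rate.

```python
# Figures copied from the dd output above.
TOTAL_BYTES = 1_000_204_886_016  # bytes dd reported copying
DD_SECONDS = 8269.65             # elapsed seconds dd reported


def throughput_mb_per_s(total_bytes: int, seconds: float) -> float:
    """Average throughput in decimal megabytes per second."""
    return total_bytes / seconds / 1e6


def minutes_at(total_bytes: int, mb_per_s: float) -> float:
    """Minutes needed to read total_bytes at a sustained rate in MB/s."""
    return total_bytes / (mb_per_s * 1e6) / 60


# Rate the dd run actually achieved over the whole disk.
rate = throughput_mb_per_s(TOTAL_BYTES, DD_SECONDS)
# Rate a verification pass would need to finish in half the wipe time.
needed = throughput_mb_per_s(TOTAL_BYTES, 75 * 60)

print(f"observed read rate: {rate:.1f} MB/s")
print(f"time at that rate: {minutes_at(TOTAL_BYTES, rate):.1f} minutes")
print(f"rate needed for a 75-minute pass: {needed:.1f} MB/s")
```

The observed rate comes out at roughly 121 MB/s, matching what dd printed, and a 75-minute pass would need a little over 222 MB/s sustained across the whole disk.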

Does an alternative, faster solution to the above exist?

In an ideal world, the solution I'm looking for would read the entire disk and print any non-zero bytes it finds, just like Bob's C++ program. This would allow me to go back and selectively wipe any non-zero bytes rather than the entire disk. However, this wouldn't be a strict requirement if the tool were very fast at reading the disk.
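
As an illustration of the behaviour described here, below is a minimal Python sketch. It is not Bob's C++ program, and the names `find_nonzero`, `CHUNK_SIZE` and `max_report` are made up for this example. It reads the device in 4 MiB chunks and records the offset and value of each non-zero byte, up to a limit. I would not expect it to beat the tools above, since all of them are bounded by how fast the disk can be read.

```python
CHUNK_SIZE = 4 * 1024 * 1024  # 4 MiB, the same buffer size used for iszero


def find_nonzero(path, chunk_size=CHUNK_SIZE, max_report=100):
    """Scan path and return (bytes_read, nonzero_count, hits).

    hits holds up to max_report (offset, value) pairs for non-zero bytes.
    """
    bytes_read = 0
    nonzero_count = 0
    hits = []
    # buffering=0 gives raw reads, so no second buffer sits in front of the disk.
    with open(path, "rb", buffering=0) as dev:
        while True:
            chunk = dev.read(chunk_size)
            if not chunk:  # empty read means end of device or file
                break
            # Counting zero bytes runs in C; an all-zero chunk ends here.
            zeros = chunk.count(0)
            if zeros != len(chunk):
                nonzero_count += len(chunk) - zeros
                if len(hits) < max_report:
                    # Slow path: walk the chunk only to collect offsets.
                    for i, value in enumerate(chunk):
                        if value:
                            hits.append((bytes_read + i, value))
                            if len(hits) >= max_report:
                                break
            bytes_read += len(chunk)
    return bytes_read, nonzero_count, hits
```

Called as `find_nonzero("/dev/sda")` with read permission on the device, it would return the total bytes read, the number of non-zero bytes, and the first offsets to go back and re-wipe.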


1. This is a C++ program written by Bob, with the buffer size increased to 4194304 (4 MiB) and compiled with:

g++ -Wl,--stack,16777216 -O3 -march=native -o iszero iszero.cpp
Hashim Aziz
  • Why do you think reading is faster than writing? – Daniel B Dec 15 '19 at 20:55
  • @DanielB Because that's usually the case for storage hardware, whether it's disk or RAM. Reading from is always faster than writing to. – Hashim Aziz Dec 15 '19 at 20:59
  • `I'm going to need to zero out a lot of disks` – The best way to make this process fast is to work with as many disks as possible in parallel. – `I need a quick way of making sure that the drive has been completely filled with zeroes` – After you write zeros? Then I understand you don't trust the firmware/hardware/software/OS. Normally when you write zeros and there is no error, you *do* write zeros. Can you elaborate? – Kamil Maciorowski Dec 15 '19 at 21:13
  • I'm unsure why this question is generating so much confusion. I understand how the process of wiping drives works and I'm aware of how to make that process faster. My current hardware limits me to connecting one drive at a time, upgrading is out of the question, and most importantly, that's not what this question is asking. Verification that a drive was properly wiped is a standard stage of wiping for both security and regulation purposes, because there are many things that can potentially go wrong with software/hardware/firmware, and that is the question being asked here. – Hashim Aziz Dec 15 '19 at 21:23
  • Getting data from a device with mechanical parts will always be comparatively slow (an SSD is a lot faster). Your 137 minutes for reading with dd is probably among the fastest speeds you can get; it might depend to some extent on the type of disk you are accessing (e.g. 5k, 7k and 10k rpm disks are likely to differ somewhat, as are 3 and 6 Gb/s SATA), provided the disks involved can SUSTAIN the indicated speed, not just deliver correctly predicted buffer contents in bursts. – Hannu Dec 15 '19 at 23:07
  • What are the specs of the disks, and what kind of connection is used? It looks like your transfer speed is 121 MB/s. This is most likely a hardware limit, and your choice of tool (among the faster ones) can't make a difference. – Hans-Martin Mosner Dec 16 '19 at 06:47
  • @grawity I'm curious, how come you deleted your answer? It seemed to come closer to answering the question than anything anyone else had posted. – Hashim Aziz Dec 16 '19 at 17:09
  • @Hans-MartinMosner It's via SATA 3 Gb/s (i.e. a 375 MB/s theoretical limit), so I doubt the interface is the issue. – Hashim Aziz Dec 16 '19 at 17:11
  • You're reading a full 1TB of data (every byte) across a SATA 3Gb/s bus, which has its own signaling overhead, so you're not going to get 3Gb/s of data. And if they're mechanical hard drives, you probably can't even saturate the data bus due to real-world physical limits. Since you really want to read every byte rather than use a statistical sampling of bytes, the 180 Mb/s you seem to be getting is around the limit for older mechanical drives (which I assume these are, because you suggest you're taking them out of service). – simpleuser Dec 17 '19 at 00:30
  • Use dd without count, which lets dd write until the disk is full, e.g. "sudo dd if=/dev/zero of=/dev/hda1 bs=1024k", and use killall to show its progress, e.g. "sudo watch -n 5 killall -USR1 dd". When dd completes, check that it ends with a "disk full"-like message and that the amount written, as reported via killall, is about the size of the drive. With all this information, the chance that the disk is not all zeroes is almost zero. There is no need to read back at all. If you really care about security that much, destroy the disk and buy a new one, which is very cheap. – jw_ Dec 18 '19 at 02:53
  • Are you sure you want to be zeroing out the drive? It's not a lot more work to write pseudo-random data to it (i.e. shred -n1), and it makes some theoretical attacks on the data much harder. See Derek's answer: https://askubuntu.com/questions/21501/possibility-of-recovering-files-from-a-dd-zero-filled-hard-disk – davidgo Jun 07 '20 at 00:26
  • @davidgo I've seen Derek's answer before. It's from 2012, and more importantly I've seen no evidence in the extensive research I've done on this topic to indicate it's true. To date no one has been shown to have recovered a single bit from a modern (post-2000s) zeroed drive, let alone anything of value. Equally important for this question, a drive that's been randomly wiped (pseudo or not) can't be verified, and a failed wipe command (especially when automated) is a much more real possibility than recovering data from a zeroed drive. – Hashim Aziz Jun 07 '20 at 00:33
  • @Prometheus Fair enough. (My thinking was along a different line, starting from wiping a drive by removing encryption keys in an FDE device. Of course none of this helps where there are bad sectors / overpartitioning if you didn't start off with FDE.) – davidgo Jun 07 '20 at 01:32

1 Answer


The read and write speeds of magnetic hard disks are approximately the same. The same is true of tape drives, RAM, CD-/DVD-/BD-R, and even floppy disks. With spinning media, it's mainly a function of how fast the data moves under the heads (or laser assemblies for optical drives). If read and write didn't go at the same speed, you'd have to spin up (or down) the media to change from read to write and back.

Significantly faster read than write is a flash memory thing.

derobert
  • This seems to be the gist of what I found from my research, and based on that I don't think I'll be able to find a tool to do this any faster than 137 mins. Thanks for confirming this by putting it into clear terms. – Hashim Aziz Dec 20 '19 at 02:28