Halt the 'shred' command in bash script if it encounters errors

Question

I have a workstation that we have set up to sanitize multiple hard drives. I run a script that detects the hard drives and then runs the 'shred' command on each one. The problem is, if any of the hard drives fail (where Ubuntu no longer sees the drive) while 'shred' is running, rather than halting, 'shred' will output into infinity with line after line of this:

shred: /dev/sdd: error writing at offset 103456287104: Input/Output error

I don't see any options for 'shred' to enable it to exit if it encounters errors, and I obviously don't want the script to just run forever with the I/O error. Since 'shred' won't halt on its own when it encounters this error, something else would have to be running in parallel to do some kind of error-checking. In my script, I have the verbose output of 'shred' redirected to a log file, and I actually use that log file to check for successful completion of 'shred' in another part of the script. But I'm not sure how to continuously check that log file while 'shred' is still running.

Anyone have any ideas for how I can accomplish this kind of "parallel error-checking?"

I know the 'wipe' command exits when it detects I/O errors, but for reasons beyond our control, we are limited to using 'shred'. It's kind of frustrating that 'shred' doesn't do the same. It would seem like a no-brainer to have it halt upon error, but.....it doesn't.

This is the "shredding" part of my script:

#!/bin/bash
log=/root/sanilog.txt
## get disks list
drives=$(lsblk --nodeps -n -o name | grep "sd")
for d in $drives; do
     shred -n 3 -v "/dev/$d" >> "$log" 2>&1
done

Would it be feasible for you to have a[nother] program watching the output via `tee`, and when it detects errors stop the process? - Have you decided what to do with the drive, when you encounter errors (mark the bad sectors and try again with shred or some other software tool, or make the drive unreadable by physical means)? — sudodus, Dec 12 '19 at 18:36
Since shred can take MANY hours to run, depending on the size of the drive, all I'm really concerned with at the moment is being able to detect a drive failure in a timely fashion so it can be removed and the script re-run. The failed drive can be troubleshot offline. I'm running this "sanitizing" workstation from a Clonezilla Live CD running custom scripts from the command line. I'm not entirely sure how to run a[nother] program in parallel with the shred command. — Knightshift, Dec 12 '19 at 20:00
Maybe like this: With the text screen of Clonezilla I think you can start the other program in the background with `&` and have it check maybe every tenth second for the tail of the logfile, and if there is an error output, stop. Then start your script with shred. — sudodus, Dec 12 '19 at 21:11
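
The `tee` idea from the comments above could look roughly like the sketch below. It is only an illustration: `watch_for_errors` is a made-up helper name, and it assumes that the watched command keeps writing after the first error (as the endless 'shred' messages in the question do) and that SIGPIPE has its default behavior, so the command is terminated on its next write once `grep` has exited.

```shell
#!/bin/bash
# Sketch only: watch_for_errors is a hypothetical helper, not part of shred.
# Usage: watch_for_errors LOGFILE COMMAND [ARGS...]
# Runs COMMAND, appends its stdout and stderr to LOGFILE via tee, and
# returns 1 if a line containing "error" is seen, 0 otherwise.
# Note: the exit status of COMMAND itself is not reported.
watch_for_errors() {
    local log=$1
    shift
    # grep -q exits at the first match. tee and COMMAND then receive SIGPIPE
    # on their next write, which terminates them by default.
    if "$@" 2>&1 | tee -a "$log" | grep -qi 'error'; then
        return 1    # an error line was logged
    fi
    return 0        # COMMAND finished without logging an error
}
```

In the question's loop this might be called as `watch_for_errors "$log" shred -n 3 -v "/dev/$d" || echo "shred failed on /dev/$d"`.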

score 1 · Answer 1 · edited Oct 26 '20 at 13:12

I'm making a script to use shred.

I had the same problem as you, but it is possible to do it like this:

# The device to wipe is stored in the variable "$dev"
dev=/dev/sdd

# Run shred on that device, send both stdout and stderr (2>&1) to a file, and use & to launch it in the background
# The log file will be shred_sdd.log
# ${dev:5} means /dev/sdd without the first 5 characters, because '/' is not allowed in a file name
shred -fvzn1 "$dev" >"shred_${dev:5}.log" 2>&1 &

# While the shred process for sdd is running, look for the word 'error' in shred_sdd.log; if it is found, print a message and kill the PID, otherwise wait 60 seconds before checking again
while ps aux|grep "shred.*${dev}"|grep -v 'grep' >/dev/null; do
  if grep -qi 'error' "shred_${dev:5}.log"; then
    echo "Too many defective sectors on device ${dev}, shred cannot continue"
    PID=$( ps aux|grep "shred.*${dev}"|grep -v 'grep'|awk '{print $2}' )
    kill -9 "$PID"
    break
  else
    sleep 60
  fi
done

You can use a function to do the same task for all devices

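# List of my devices (you can use lsblk instead, depending on your configuration)
devices=$(lsscsi -t | grep disk | grep sas | awk '{print $NF}')

function_shred() {
  # Put the code that I wrote previously here
  # (without the dev=/dev/sdd line, because the loop below sets $dev)
  :
}

for dev in $devices; do
  function_shred &
done
wait  # wait until every background function_shred has finished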

If you have a question do not hesitate (sorry for my english xD) — MaTTiMoT, Oct 26 '20 at 12:22
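
One possible way to fill in that function is sketched below. This is a variant, not the answer's exact code: it takes the device as an argument and uses `$!` and `kill -0` instead of searching `ps` output. The names `shred_with_watchdog`, `SHRED_CMD` and `POLL_SECONDS` are invented for this example, and the log file is written to the current directory.

```shell
#!/bin/bash
# Sketch only: shred_with_watchdog, SHRED_CMD and POLL_SECONDS are
# hypothetical names chosen for this example.
SHRED_CMD=${SHRED_CMD:-shred}        # command to run (assumed to be shred)
POLL_SECONDS=${POLL_SECONDS:-60}     # how often to check the log

shred_with_watchdog() {
    local dev=$1
    local log="shred_${dev##*/}.log"   # /dev/sdd -> shred_sdd.log

    "$SHRED_CMD" -fvzn1 "$dev" >"$log" 2>&1 &
    local pid=$!                       # PID of the job just started

    # kill -0 sends no signal; it only tests whether the process still exists.
    while kill -0 "$pid" 2>/dev/null; do
        if grep -qi 'error' "$log" 2>/dev/null; then
            echo "Errors logged for ${dev}, stopping shred (PID ${pid})" >&2
            kill "$pid" 2>/dev/null
            wait "$pid" 2>/dev/null    # reap the killed process
            return 1
        fi
        sleep "$POLL_SECONDS"
    done
    wait "$pid"                        # return shred's own exit status
}
```

To handle several drives in parallel, the loop above could call `shred_with_watchdog "$dev" &` for each device and finish with `wait`.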

score 0 · Answer 2 · RASG · 2020-02-18T16:05:17.547

set -e at the top of a bash script will cause the script to exit if any commands return a non-zero exit code.

you can also try an EXIT trap to make the script clean up after itself
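
As a rough sketch of such a trap (the function name `cleanup` is just an example): it does not make 'shred' stop on an I/O error by itself, but it can make sure that no background 'shred' started by the script keeps running once the script exits for any reason.

```shell
#!/bin/bash
# Sketch only: cleanup is a hypothetical function name.
cleanup() {
    local pids
    pids=$(jobs -p)                    # PIDs of this shell's background jobs
    if [ -n "$pids" ]; then
        # $pids is left unquoted on purpose so each PID is a separate argument
        kill $pids 2>/dev/null || true
        wait 2>/dev/null               # reap the terminated jobs
    fi
}
trap cleanup EXIT                      # run cleanup whenever the script exits
```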

if you don't mind trying an alternative to shred, dd can do a similar job:

dd if=/dev/urandom of=/dev/sdd bs=4096

give it two passes if you're paranoid :)

and Ubuntu also has wipe

wipe /dev/sdd

wipe repeatedly overwrites special patterns to the files to be destroyed, using the fsync() call and/or the O_SYNC bit to force disk access. In normal mode, 34 patterns are used (of which 8 are random).

edited Feb 18 '20 at 16:05

answered Feb 05 '20 at 18:18

RASG


Thanks for the suggestion, but I tried using set -e, and it didn't make 'shred' halt when it encountered the I/O error. – Knightshift Feb 17 '20 at 17:17
@Knightshift updated my answer with alternatives that you could try. maybe one will work for you. – RASG Feb 18 '20 at 16:06
`set -e` does not work here, because `shred` itself does not exit. – pLumo Apr 12 '20 at 09:39
