
I'm the IT person at my small firm, and despite my dire warnings, everyone puts files on the server with awful names, including leading and trailing spaces and bad characters (including \ ; / + . < > - etc.!)

They do this by accessing the (FreeBSD/FreeNAS) server via AFP on Macs, so no part of the system complains.

Is there a script I can use to go through an entire directory tree and fix bad filenames?

Basically replace all spaces & bad ASCII with _ ... and if a file already exists, just slap a _2 or something on the end.

I don't suppose there's a way to get the system to enforce good filenaming conventions, is there?

Thanks!

Dan

1 Answer


I'd use bash and find. I'm sure there's a simpler option but here's what I came up with:

  1. This can deal with file names containing "/" (find will give a warning, ignore it), but it will only work on files in the current directory (no subdirectories). I couldn't figure out how to tell bash or find to differentiate between a "/" in a file name and a "/" that is part of the path.

    for i in $(find . -maxdepth 1 -type f  -name "*[\:\;><\@\$\#\&\(\)\?\\\/\%]*" | sed 's/\.\///'); do mv "$i" "${i//[\:\;><\@\$\#\&\(\)\?\\\/\%]/_}"; done
    
  2. This one cannot deal with file names containing "/" but it will work on all files in the current directory and its subdirectories:

    for i in $(find . -type f  -name "*[\:\;\>\<\@\$\#\&\(\)\?\\\%]*"); do mv "$i" "${i//[\:\;><\@\$\#\&\(\)\?\\\%]/_}"; done
    

Make sure to test these before running. They worked fine in the few tests I ran, but I was not exhaustive. Also bear in mind that I am on a Linux system; the particular implementation of find, and perhaps bash, may differ on yours.


EDIT: Changing `mv` to `mv -i` will cause mv to prompt you before overwriting an existing file.
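If, instead of being prompted, you want the behaviour you described (appending a _2 or similar when the name is taken), here is a rough, untested sketch of that idea; the `target` and `n` names are just placeholders, and the suffix lands after the extension (e.g. foo.txt_2). It would replace the plain `mv` inside the loop:

    # sketch only: compute the cleaned-up name, then bump a counter until it is free
    target=${i//[\:\;><\@\$\#\&\(\)\?\\\%]/_}
    if [ -e "$target" ]; then
        n=2
        while [ -e "${target}_$n" ]; do n=$((n+1)); done
        target="${target}_$n"
    fi
    mv "$i" "$target"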

EDIT2: To deal with filenames with spaces, you can change the bash IFS (Internal Field Separator) variable like so (adapted from here):

SAVEIFS=$IFS; IFS=$(echo -en "\n\b"); for i in $(find . -type f  -name "*[\:\;\>\<\@\$\#\&\(\)\?\\\%\ ]*"); do mv "$i" "${i//[\:\;><\@\$\#\&\(\)\?\\\%\ ]/_}"; done; IFS=$SAVEIFS

I also modified the regular expression to match/replace spaces with underscores. The SAVEIFS bit just returns the IFS variable to its original configuration.
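If you would rather not touch IFS at all, a common alternative (again, just a sketch; test it before trusting it) is to have find print the names null-delimited and read them back one at a time, which is safe for names containing spaces:

    # null-delimited loop: handles spaces in names without changing IFS
    find . -type f -name "*[\:\;\>\<\@\$\#\&\(\)\?\\\%\ ]*" -print0 |
    while IFS= read -r -d '' i; do
        new=${i//[\:\;><\@\$\#\&\(\)\?\\\%\ ]/_}   # same substitution as above
        mv "$i" "$new"
    done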


EXPLANATION:

for i in $(command); do something $i; done

This is a generic bash loop. It goes through the command's output, setting the variable $i to each of the values returned by the command in turn, and runs something on each one.
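For example, with a toy command:

    # $i takes the values one, two and three in turn
    for i in $(echo one two three); do echo "got: $i"; done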


find . -maxdepth 1 -type f  -name "*[\:\;><\@\$\#\&\(\)\?\\\/\%]*"

Find all files in the current directory whose name contains one of the following characters: :;><@$#&()\/%. To add more, just escape them with "\" (e.g. "\¿") and add them to the list within the brackets ([ ]). Probably not all of these characters need to be escaped, but I can never remember which ones are special in which environment, so I escape everything just in case.
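As a quick sanity check (the file name is just an example), you can create a file with a bad character and confirm the pattern catches it:

    touch './report;draft.txt'   # example file with a ";" in its name
    find . -maxdepth 1 -type f -name "*[\:\;><\@\$\#\&\(\)\?\\\/\%]*"
    # prints ./report;draft.txt (find may warn about the "/" in the pattern; ignore it)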

sed 's/\.\///'

Remove the leading "./" (the current directory prefix) from find's output: print "foo" instead of "./foo".
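For instance:

    echo './foo' | sed 's/\.\///'   # prints: foo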

mv "$i" ${i//[\;><\@\$\#\&\(\)\?\\\/\%]/_}

Every time this little script loops, $i will be the name of a badly named file. This command will move (rename) that file, changing all unwanted characters to "_". Look up bash substitution for more information.
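A quick illustration of that substitution, with a made-up name:

    i='report;(v2).txt'
    new=${i//[\:\;><\@\$\#\&\(\)\?\\\/\%]/_}
    echo "$new"   # prints: report__v2_.txt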


terdon
  • This looks good, but (a) I had to change the `"..."` in the find expression to `'...'`; and (b) when I try to run the loop, it says `do: command not found.` Maybe a FreeBSD limitation? Or csh? – Dan Aug 20 '12 at 19:23
  • It was csh ... works better after I `exec bash`! – Dan Aug 20 '12 at 19:54
  • @Ze'ev Yup, you definitely need bash for this. The substitution command is also bash specific. You can try and automate the file renaming process by introducing a variable that will be incremented every time you have 2 files with the same name. – terdon Aug 20 '12 at 20:06
  • Using bash is fine; almost working. But there is a problem with filenames with a space in them. If I have a file called `./a/b/c/xxx$ 111`, then the find command returns `./a/b/c/xxx$` and `111` as two separate items in `$i`, and then the `mv` fails. – Dan Aug 20 '12 at 20:09
  • The strange thing is that `find` on its own returns just the one file, including the space; but the `find` within the loop breaks it into 2 results. – Dan Aug 20 '12 at 20:12
  • @Ze'ev Yeah, the problem is that bash, by default uses a space to separate entries. Bash does not like spaces in filenames. Hang on, I'll update the answer in a second. – terdon Aug 20 '12 at 20:13
  • Hey, couldn't I use the `-exec` option of `find` ? – Dan Aug 20 '12 at 20:14
  • Something like: `Find . -name "*.model" -exec mv "{}" ``echo "{}" | sed 's/[^A-Za-z0-9_./]/_/g'`` \;` – Dan Aug 20 '12 at 20:17
  • 1
    @Ze'ev I played around with that for this answer and couldn't get it to work. The problem was making find differentiate between slashes in a filename and actual paths. I also had some other complication I can't remember. If you can make it work, more power to you :) In any case, the updated answer should work. – terdon Aug 20 '12 at 20:27