
I'm the IT person at my small firm, and despite my dire warnings, everyone puts files on the server with awful names, including leading and trailing spaces and bad characters (including \ ; / + . < > - etc.!)

They do this by accessing the (FreeBSD/FreeNAS) server via AFP on Macs, so no part of the system complains.

Is there a script I can use to go through an entire directory tree and fix bad filenames?

Basically replace all spaces & bad ASCII with _ ... and if a file already exists, just slap a _2 or something on the end.

I don't suppose there's a way to get the system to enforce good filenaming conventions, is there?

Thanks!

Dan

1 Answer


I'd use bash and find. I'm sure there's a simpler option but here's what I came up with:

  1. This can deal with file names containing "/" (find will give a warning, ignore it), but it will only work on files in the current directory (no subdirectories). I couldn't figure out how to tell bash or find to differentiate between a "/" in a file name and a "/" that is part of the path.

    for i in $(find . -maxdepth 1 -type f  -name "*[\:\;><\@\$\#\&\(\)\?\\\/\%]*" | sed 's/\.\///'); do mv "$i" "${i//[\:\;><\@\$\#\&\(\)\?\\\/\%]/_}"; done
    
  2. This one cannot deal with file names containing "/" but it will work on all files in the current directory and its subdirectories:

    for i in $(find . -type f  -name "*[\:\;\>\<\@\$\#\&\(\)\?\\\%]*"); do mv "$i" "${i//[\:\;><\@\$\#\&\(\)\?\\\%]/_}"; done
    

Make sure to test these before running. They worked fine in the few tests I ran, but I was not exhaustive. Also bear in mind that I am on a Linux system; the particular implementation of find, and perhaps bash, may differ on yours.


EDIT: Changing `mv` to `mv -i` will cause mv to prompt you before overwriting an existing file.
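If, instead of being prompted, you want the behaviour you described (appending a _2 or similar when the name is taken), here is a rough, untested sketch of that idea; the `target` and `n` names are just placeholders, and the suffix lands after the extension (e.g. foo.txt_2). It would replace the plain `mv` inside the loop:

    # sketch only: compute the cleaned-up name, then bump a counter until it is free
    target=${i//[\:\;><\@\$\#\&\(\)\?\\\%]/_}
    if [ -e "$target" ]; then
        n=2
        while [ -e "${target}_$n" ]; do n=$((n+1)); done
        target="${target}_$n"
    fi
    mv "$i" "$target"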

EDIT2: To deal with filenames with spaces, you can change the bash IFS (Internal Field Separator) variable like so (adapted from here):

SAVEIFS=$IFS; IFS=$(echo -en "\n\b"); for i in $(find . -type f  -name "*[\:\;\>\<\@\$\#\&\(\)\?\\\%\ ]*"); do mv "$i" "${i//[\:\;><\@\$\#\&\(\)\?\\\%\ ]/_}"; done; IFS=$SAVEIFS

I also modified the regular expression to match/replace spaces with underscores. The SAVEIFS bit just returns the IFS variable to its original configuration.
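If you would rather not touch IFS at all, a common alternative (again, just a sketch; test it before trusting it) is to have find print the names null-delimited and read them back one at a time, which is safe for names containing spaces:

    # null-delimited loop: handles spaces in names without changing IFS
    find . -type f -name "*[\:\;\>\<\@\$\#\&\(\)\?\\\%\ ]*" -print0 |
    while IFS= read -r -d '' i; do
        new=${i//[\:\;><\@\$\#\&\(\)\?\\\%\ ]/_}   # same substitution as above
        mv "$i" "$new"
    done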


EXPLANATION:

for i in $(command); do something $i; done

This is a generic bash loop. It goes through the command's output, setting the variable $i to each of the values returned by the command in turn, and runs something on each one.
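For example, with a toy command:

    # $i takes the values one, two and three in turn
    for i in $(echo one two three); do echo "got: $i"; done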


find . -maxdepth 1 -type f  -name "*[\:\;><\@\$\#\&\(\)\?\\\/\%]*"

Find all files in the current directory whose name contains one of the following characters: :;><@$#&()\/%. To add more, just escape them with "\" (e.g. "\¿") and add them to the list within the brackets ([ ]). Probably not all of these characters need to be escaped, but I can never remember which ones are special in which environment, so I escape everything just in case.
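As a quick sanity check (the file name is just an example), you can create a file with a bad character and confirm the pattern catches it:

    touch './report;draft.txt'   # example file with a ";" in its name
    find . -maxdepth 1 -type f -name "*[\:\;><\@\$\#\&\(\)\?\\\/\%]*"
    # prints ./report;draft.txt (find may warn about the "/" in the pattern; ignore it)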

sed 's/\.\///'

Remove the leading "./" (the current directory prefix) from find's output: print "foo" instead of "./foo".
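For instance:

    echo './foo' | sed 's/\.\///'   # prints: foo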

mv "$i" ${i//[\;><\@\$\#\&\(\)\?\\\/\%]/_}

Every time this little script loops, $i will be the name of a badly named file. This command will move (rename) that file, changing all unwanted characters to "_". Look up bash substitution for more information.
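A quick illustration of that substitution, with a made-up name:

    i='report;(v2).txt'
    new=${i//[\:\;><\@\$\#\&\(\)\?\\\/\%]/_}
    echo "$new"   # prints: report__v2_.txt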


terdon
  • This looks good, but (a) I had to change the `"..."` in the find expression to `'...'`; and (b) when I try to run the loop, it says `do: command not found.` Maybe a FreeBSD limitation? Or csh? – Dan Aug 20 '12 at 19:23
  • It was csh ... works better after I `exec bash`! – Dan Aug 20 '12 at 19:54
  • @Ze'ev Yup, you definitely need bash for this. The substitution command is also bash specific. You can try and automate the file renaming process by introducing a variable that will be incremented every time you have 2 files with the same name. – terdon Aug 20 '12 at 20:06
  • Using bash is fine; almost working. But there is a problem with filenames with a space in them. If I have a file called `./a/b/c/xxx$ 111`, then the find command returns `./a/b/c/xxx$` and `111` as two separate items in `$i`, and then the `mv` fails. – Dan Aug 20 '12 at 20:09
  • The strange thing is that `find` on its own returns just the one file, including the space; but the `find` within the loop breaks it into 2 results. – Dan Aug 20 '12 at 20:12
  • @Ze'ev Yeah, the problem is that bash, by default uses a space to separate entries. Bash does not like spaces in filenames. Hang on, I'll update the answer in a second. – terdon Aug 20 '12 at 20:13
  • Hey, couldn't I use the `-exec` option of `find` ? – Dan Aug 20 '12 at 20:14
  • Something like: `Find . -name "*.model" -exec mv "{}" ``echo "{}" | sed 's/[^A-Za-z0-9_./]/_/g'`` \;` – Dan Aug 20 '12 at 20:17
  • 1
    @Ze'ev I played around with that for this answer and couldn't get it to work. The problem was making find differentiate between slashes in a filename and actual paths. I also had some other complication I can't remember. If you can make it work, more power to you :) In any case, the updated answer should work. – terdon Aug 20 '12 at 20:27