3

This is a follow-up question to this one.

So far, I've cobbled together the following find command, which is intended to print (and once properly tested, -delete) all JPGs that are 200 x 200px or lower:

find . -iname "*.jpg" -type f -exec bash -c 'for i; do size=($(identify -format "%wx%h" "$i")); (( size[1] < 200 && size[2] < 200 )); done;' \; -print

However, piping the command through wc -l indicates that it's selecting every image in the target set.

Breaking it down to the for loop itself shows that it's looping through images much larger than 200px:

for i in *.jpg; do size=($(identify -format "%wx%h" "$i")); (( size[1] < 200 || size[2] < 200 )); echo $size; done;

210x163    
1920x1200
1920x1200
240x240
246x138
215x215
1920x1200
1920x1200
240x240
240x240
1920x1200

To me this seems to indicate identify is probably the culprit here in failing to match only those images that are lower than the specified dimensions, but as far as I've been able to tell, the syntax for the matching is correct.

Anyone have an idea what could be causing this?

Hashim Aziz
  • 11,898
  • 35
  • 98
  • 166

1 Answers1

3

The find command derives from this answer. Now I see there is a bug there, you replicated it.

This is the bug: the fragment (( size[1] < 200 && size[2] < 200 )) should be (( size[0] < 200 && size[1] < 200 )) because arrays in Bash are indexed from 0. Note you need to use the space-separated format (-format "%w %h") used in the linked answer, otherwise the approach won't work.

Additionally you added few more bugs:

  • You need arbitrary-name {} before \;. (In the linked answer remove-files is the name and + is used instead of \;).
  • If you choose to use \; instead of +, there is no point in for i. If you choose +, there's little point in -print because -exec … {} + always returns true. So either

    • -exec … {} \;, no loop, -print makes sense; or
    • -exec … {} +, with loop, and what -print was supposed to do should be moved to the inside of the loop.

The for loop itself, as you presented it, seems syntactically valid; it suffers from the main bug though.

it's looping through images much larger than 200px

How can it know if the given image is larger, before it examines the image? It needs to loop through all images. You can run something conditionally, depending on whether the image is smaller or not. Your echo $size is not run conditionally.


Fixed code:

  • With -exec … \;

    find . -iname "*.jpg" -type f -exec bash -c '
       size=($(identify -format "%w %h" "$1"))
       (( size[0] < 200 && size[1] < 200 ))
    ' arbitrary-name {} \; -print
    
  • With -exec … +

    find . -iname "*.jpg" -type f -exec bash -c '
       for i; do
          size=($(identify -format "%w %h" "$i"))
          (( size[0] < 200 && size[1] < 200 )) && printf "%s\n" "$i"
       done
    ' arbitrary-name {} +
    

Note the former variant spawns one shell per file; the latter one is more economical.

Kamil Maciorowski
  • 69,815
  • 22
  • 136
  • 202
  • A great answer as always, but the given code doesn't seem to work - the identify process is running, but nothing is being printed. I would also prefer to use `\;` instead of `+` so it enables me to be able to use `-print` and then `-delete` when I need to delete the images. – Hashim Aziz Jan 02 '20 at 00:53
  • @Hashim Answer expanded. – Kamil Maciorowski Jan 02 '20 at 00:59
  • 1
    @Hashim `nothing is being printed` – Are you sure there are files that qualify? – Kamil Maciorowski Jan 02 '20 at 01:01
  • There are a few references to "the original answer"/"original bug" here and you seem to have new stuff at the top and old stuff at the bottom... it makes it quite hard to follow. Could you edit your answer into one single, cohesive narrative? – Lightness Races in Orbit Jan 02 '20 at 10:54
  • @LightnessRaceswithMonica Reasonable request, thank you. Answer edited. – Kamil Maciorowski Jan 02 '20 at 11:41
  • Apologies for the late reply, I had to do some cleaning up of all the corrupted images in the folder to properly test, can now confirm that the code works perfectly. Two clarifications: a) if I wanted to check for images that had a height *or* width of 250px or less, would simply changing to `size[0] < 200 || size[1] < 200` work? b) What exactly is `arbitrary-name` doing here? I'm having trouble understanding which command requires it as a parameter. – Hashim Aziz Jan 02 '20 at 19:29
  • @Hashim (a) If you're asking about changing `&&` to `||` then yes; but `size[0] < 200` is not "250px or less". (b) See `man 1 bash` where it describes `-c`. `arbitrary-name` becomes `$0` "used in warning and error messages". The POSIX shell (`sh`) also works this way. Therefore names like `find-bash-1`, `find-sh-2` or `special-shell` may be useful: when you get an error with such name, you immediately can tell which "inner" shell generated it. – Kamil Maciorowski Jan 02 '20 at 19:37
  • Yes, my bad, the 250px was a mistake. So arbitrary-name is a requirement when using using `bash -c`? – Hashim Aziz Jan 02 '20 at 19:51
  • 1
    @Hashim Not a requirement per se. But if you want `find` to pass expanded `{}` to a shell invoked with `-exec` then [you should pass it as a separate argument](https://unix.stackexchange.com/a/156010/108618) and refer to it as `$1` or so in the shell command. Technically you can omit arbitrary name, let `find` pass expanded `{}` right away and refer to it as `$0` in the shell command. But then in case of any error or warning the expanded `{}` will be used as the name of the shell, it can be very misleading. For this reason it's good to use any fixed name and then `{}` (which will become `$1`). – Kamil Maciorowski Jan 02 '20 at 19:52
  • 1
    @Hashim When I say `{}` becomes `$1`, I mean the `-exec sh 'shell-code' foo {} \;` case. If we use `-exec sh 'shell-code' foo {} +` then `find` will expand `{}` to possibly *multiple* arguments. They will become `$1`, `$2`, … of the inner shell. – Kamil Maciorowski Jan 02 '20 at 19:57
  • Coming back to clarify something regarding the last two snippets and their differences - if the latter snippet uses less resources, does that mean it's also slower than the former snippet? – Hashim Aziz Jan 14 '20 at 00:10
  • @Hashim I expect the latter snippet to be faster in almost all cases, because it doesn't spawn `bash` so many times. If there is just one file to test with `-exec`, then the former snippet may be little (negligibly?) faster. Why won't you test them? – Kamil Maciorowski Jan 14 '20 at 00:40
  • @Hashim About `arbitrary-name`: I created a separate [question](https://superuser.com/q/1526229/432690) and answered it decently. – Kamil Maciorowski Feb 17 '20 at 22:56