
As part of my Bash routine, I am trying to count the subdirectories located in the directory $storage and assign the count to a variable that will be used later in the same script.

number_dirs=$(ls -ld "${storage}"/* | wc -l)
  printf >&2 '%s is the number of the directories... ' "${number_dirs}" ;sleep 0.2
  printf >&2 "Keep calm!\n"

This works fine when the number of directories is around 2-4K, but it does not work with a huge number. How could I use the find command in the same way instead?

Hot JAMS
    In [your earlier question](https://superuser.com/q/1649889/332907) you seemed to be wanting to search directories in [the entire tree](https://superuser.com/questions/1649889/awk-compute-min-max-values-for-multi-column-data-using-big-number-of-input-file#comment2526657_1649895) underneath `$storage` (you were using `$storage/**/` type constructs). The question you've asked here (and the answers) are only counting directories immediately underneath `$storage` rather than directories underneath directories (etc.) underneath `$storage`. Is that intentional? – roaima May 25 '21 at 15:04
    Yes, this is correct. This time I just need to count the directories located directly within $storage, ignoring any subdirectories below them, so the answer proposed here works very well :-) – Hot JAMS May 26 '21 at 12:49

3 Answers


A simple scan with find would be:

number_dirs=$(find "${storage}" -maxdepth 1 -mindepth 1 -type d | wc -l)
DuncG
# store all the dir names in an array
dirs=( "${storage}"/*/ )

num_dirs=${#dirs[@]}

The trailing slash in the glob pattern restricts the results to only directories.
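
One caveat is worth a quick check: with default Bash options, a pattern that matches nothing is kept as a literal word, so an empty directory would be counted as 1. The following is a minimal sketch of that behaviour, assuming Bash with default options; the temporary directory and the names a, b and file.txt are only for demonstration.

```shell
#!/usr/bin/env bash
# Demo directory (hypothetical); it starts out empty.
storage=$(mktemp -d)

# Default behaviour: an unmatched pattern is kept as a literal word.
dirs=( "${storage}"/*/ )
count_default=${#dirs[@]}

# With nullglob an unmatched pattern expands to nothing.
shopt -s nullglob
dirs=( "${storage}"/*/ )
count_nullglob=${#dirs[@]}
shopt -u nullglob

# Add two directories and one regular file, then count again.
mkdir "${storage}/a" "${storage}/b"
touch "${storage}/file.txt"
dirs=( "${storage}"/*/ )
count_after=${#dirs[@]}

printf '%s %s %s\n' "$count_default" "$count_nullglob" "$count_after"
rm -rf -- "$storage"
```

The three numbers are the count for the empty directory without nullglob, the count with nullglob, and the count once two directories and a file exist.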

glenn jackman

Analysis

Note that your ls -ld "${storage}"/* | wc -l is not restricted to directories: ls will list non-directories in $storage as well, and wc will count them. Consider "${storage}"/*/ (note the trailing slash), which will list directories and symlinks to directories, unless there is no match. If there is no match, the /*/ part will stay literal. Research what shopt -s failglob and shopt -s nullglob do.
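
To see the difference the trailing slash makes, here is a small sketch on a throwaway tree. The directory from mktemp -d and the entry names are assumptions made up for the demonstration.

```shell
# Demo tree (hypothetical names): two directories, one file,
# and one symlink to a directory.
storage=$(mktemp -d)
mkdir "$storage/a" "$storage/b"
touch "$storage/file.txt"
ln -s "$storage/a" "$storage/link"

# Without a trailing slash every non-hidden entry is listed.
all_entries=$(ls -ld "$storage"/* | wc -l)

# With a trailing slash only directories and symlinks to directories match.
dirs_only=$(ls -ld "$storage"/*/ | wc -l)

echo "all entries: $all_entries, directories: $dirs_only"
rm -rf -- "$storage"
```

The first count includes file.txt; the second one drops it but still includes the symlink.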

Your command does not list deeper subdirectories; I assume this behavior is what you want.

Your command does not normally list hidden directories (with names starting with a dot). If you are interested in them, research shopt -s dotglob.
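
As a rough illustration of what dotglob changes, assuming Bash and a made-up test directory with one visible and one hidden subdirectory:

```shell
#!/usr/bin/env bash
# Demo tree (hypothetical names).
storage=$(mktemp -d)
mkdir "$storage/visible" "$storage/.hidden"

# Each count runs in a command substitution (a subshell), so the
# shopt settings do not leak into the calling shell.
without_hidden=$(shopt -s nullglob; set -- "$storage"/*/; echo "$#")
with_hidden=$(shopt -s nullglob dotglob; set -- "$storage"/*/; echo "$#")

echo "without dotglob: $without_hidden, with dotglob: $with_hidden"
rm -rf -- "$storage"
```

Even with dotglob set, * does not match . and .., so they are not counted.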

If "${storage}"/*/ expands to a huge number of words, then you will get an "argument list too long" error. This is because ls is an external executable that needs to be called with an array of arguments, and there is a limit on the size of that array.
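
If you are curious about the limit on your system, getconf can report it. This is only a sketch: the value is system-dependent, and the effective limit may differ somewhat from this single number.

```shell
# Print the upper bound (in bytes) on the combined size of the arguments
# and environment passed to an external command such as ls.
arg_max=$(getconf ARG_MAX)
echo "ARG_MAX: $arg_max bytes"
```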


Shell can do this

You can let the shell count directories for you:

(shopt -s nullglob && set -- "${storage}"/*/ && echo "$#")

set -- will store all the names as positional parameters, then echo "$#" will print their number. set is a builtin, so it should not suffer from "argument list too long". I deliberately used a subshell so the positional parameters of the current shell are not affected. I could use a separate array (like this other answer does), but there's a potential problem. Bash allows you to set a huge number of positional parameters. On the one hand that's good, since you want to count a huge number of directories. On the other hand the parameters need to be stored in memory as strings, even if you only want their count.

Note if you do this:

number_dirs=$(shopt -s nullglob && set -- "${storage}"/*/ && echo "$#")

then the inside of $() is executed in a subshell anyway, so you don't need additional parentheses. You may choose to use an array inside the subshell. In any case the subshell will exit immediately after echo and the memory will be released as soon as possible. But if you do this:

dirs=( "${storage}"/*/ )
num_dirs=${#dirs[@]}

and if you don't unset dirs then the array will become a burden. A subshell that dies right away and releases the memory is better than a memory-consuming array you may forget to unset.

One way or another all the expanded words need to be stored (if only temporarily in a subshell) before you get their count. For this reason you may prefer find that allows wc to count on the fly.
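
A quick sketch of the subshell isolation described above, assuming Bash; the test directory and the placeholder parameters first and second are invented for the demonstration.

```shell
#!/usr/bin/env bash
# Demo tree (hypothetical names) with three subdirectories.
storage=$(mktemp -d)
mkdir "$storage/a" "$storage/b" "$storage/c"

# Positional parameters of the current shell before counting.
set -- first second

# The command substitution runs in a subshell: its set -- and shopt
# changes vanish when it exits.
number_dirs=$(shopt -s nullglob && set -- "${storage}"/*/ && echo "$#")

echo "directories: $number_dirs, parameters still set here: $#"
rm -rf -- "$storage"
```

After the substitution the current shell still has its own two parameters, and nullglob is still off there.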


Or find + wc

A basic solution is in yet another answer where the main point is -type d. The idea (with proper quoting):

find "$storage" -maxdepth 1 -mindepth 1 -type d | wc -l

may fail because:

  • -maxdepth and -mindepth are not portable;
  • directory names may contain newline characters that will make wc -l miscount.

If your find supports -printf (so it most likely supports -maxdepth and -mindepth) then you can solve the latter problem by printing single bytes and counting them:

find "$storage" -maxdepth 1 -mindepth 1 -type d -printf a | wc -c

where a is an arbitrary one-byte character.
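
The miscount is easy to reproduce. The sketch below assumes a find that supports -maxdepth, -mindepth and -printf (GNU find, for example); the directory names are made up.

```shell
# Demo tree (hypothetical names): one ordinary directory and one
# directory whose name contains a newline character.
storage=$(mktemp -d)
mkdir "$storage/plain"
mkdir "$storage/$(printf 'with\nnewline')"

# Counting lines: the embedded newline produces an extra line.
by_lines=$(find "$storage" -maxdepth 1 -mindepth 1 -type d | wc -l)

# Counting one byte per directory (needs -printf, e.g. GNU find).
by_bytes=$(find "$storage" -maxdepth 1 -mindepth 1 -type d -printf a | wc -c)

echo "wc -l: $by_lines, wc -c: $by_bytes"
rm -rf -- "$storage"
```

With two directories, the line-based count reports 3 while the byte-based count reports 2.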

A portable (but somewhat slower) solution is like this:

( cd -- "$storage"/ && find . -type d ! -name . -prune -exec sh -c '
    for i do printf a; done
' find-sh {} + | wc -c )

The subshell makes cd not affect the current working directory of the current shell. I used cd -- "$storage"/ first because it allows me to refer to $storage as . later and therefore easily exclude it without pruning. Without cd I would need to use something like ! -path "$storage", but then find would interpret $storage as a pattern; so this could fail in general.

On the other hand if you are permitted to read the directory but not to execute (cd into, see this answer) then cd will not work, while solutions without cd may.
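
For use in a script, the portable command can be wrapped in a function. This is only a sketch: count_dirs is a hypothetical helper name, and the test tree is invented to show that nested directories and regular files are not counted.

```shell
# count_dirs is a hypothetical helper wrapping the portable command.
# The function body is a subshell, so cd does not affect the caller.
count_dirs() (
    cd -- "$1"/ && find . -type d ! -name . -prune -exec sh -c '
        for i do printf a; done
    ' find-sh {} + | wc -c
)

# Demo tree (hypothetical names): three immediate subdirectories
# (one of them hidden), deeper nesting, and a regular file.
storage=$(mktemp -d)
mkdir -p "$storage/a/nested/deeper" "$storage/b" "$storage/.hidden"
touch "$storage/file.txt"

count=$(count_dirs "$storage")
echo "immediate subdirectories: $count"
rm -rf -- "$storage"
```

Only a, b and .hidden are counted; -prune stops find from descending into a/nested.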


Notes

  • In cd -- "$storage"/ the double dash and the trailing slash make paths like -foo or even - be interpreted as actual pathnames.

  • find-sh is explained here: What is the second sh in sh -c 'some shell code' sh?.

  • If you don't want to count hidden directories, add ! -name '.*' after -prune.

  • -type d does not match symlinks to directories. For comparison: shell globbing patterns like */ do match them.

  • If you want to descend into subdirectories and count the entire subtree, omit -prune.

  • shopt is not portable. I know no straightforward way to do in pure sh what shopt -s nullglob does in Bash. Your question is about Bash, so this should not be a problem.
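
The notes about hidden directories and symlinks can be checked with a sketch like the one below. The test tree and its names (real, .hidden, link) are assumptions for illustration only.

```shell
# Demo tree (hypothetical names): a real directory, a hidden directory,
# and a symlink pointing to the real directory.
storage=$(mktemp -d)
mkdir "$storage/real" "$storage/.hidden"
ln -s "$storage/real" "$storage/link"

# -type d counts real and .hidden, but not the symlink.
find_all=$(cd -- "$storage"/ && find . -type d ! -name . -prune \
    -exec sh -c 'for i do printf a; done' find-sh {} + | wc -c)

# Adding ! -name '.*' after -prune drops the hidden directory.
find_visible=$(cd -- "$storage"/ && find . -type d ! -name . -prune ! -name '.*' \
    -exec sh -c 'for i do printf a; done' find-sh {} + | wc -c)

# The glob */ matches the symlink but, by default, not hidden names.
glob_count=$(set -- "$storage"/*/; echo "$#")

echo "find: $find_all, find without hidden: $find_visible, glob: $glob_count"
rm -rf -- "$storage"
```

Note that the find count and the glob count are both 2 here, but for different reasons: find counts real and .hidden, while the glob counts real and link.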

Kamil Maciorowski