When I deploy my software, I ship a zipped file to the target server and extract its contents. At the same time, I also place a metadata file in the directory, detailing what was deployed.

If I want to find any files that have been changed since I deployed the software, I can simply find files that have a newer modification time than the metadata file:

find . -newer deployment_metadata.txt

That's nice and straightforward.

Now, I'd like to also find files that are older than the deployment metadata file. One would assume you could use the bang symbol to negate the "newer" check:

find . ! -newer deployment_metadata.txt

But 'not newer' is not quite equivalent to 'older', as any files with the same timestamp are also "not newer" — so the command also includes all the files that I just deployed!

So, I was wondering if I was missing a trick when it comes to finding (strictly) older files?

My current solution is to create a new file (in the temp dir) using touch which has a modification time of one minute before the deployment_metadata.txt file. Then I am able to use the following arguments: ! -newer /var/tmp/metadata_minus_1.

This works, but it seems like a waste of time to have to create, and then clean up, the file in the temp dir, especially as different users may be using my script to check for this (I don't want file ownership problems, so I actually go as far as appending ${USER} to the filename).
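
For illustration, the workaround described above could be sketched roughly like this. It is only a sketch under assumptions: it relies on GNU `stat -c %Y` and GNU `touch -d @EPOCH`, it uses `mktemp` (not mentioned in the question) so that each user gets a private scratch file instead of a `${USER}`-suffixed name, and `find_older_than_ref` is a hypothetical helper name.

```shell
# Hypothetical helper: list paths under the current directory whose mtime is
# not newer than (mtime of REF, in whole seconds) minus OFFSET seconds.
# Assumes GNU stat and GNU touch; using mktemp is an assumption as well.
find_older_than_ref() {
    ref=$1
    offset=${2:-60}                  # default: one minute, as in the question
    tmp=$(mktemp) || return 1        # private scratch file owned by the caller
    # %Y is REF's mtime in seconds since the epoch; touch -d "@N" sets that time.
    touch -d "@$(( $(stat -c %Y "$ref") - offset ))" "$tmp"
    find . ! -newer "$tmp"
    rm -f "$tmp"                     # clean up the scratch file
}
```

It would be called as `find_older_than_ref deployment_metadata.txt 60` from the deployment directory.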

Ender
jwa
    Not sure if it's going to make much difference, but I should really have included the version of `find` I'm using: `find --version` reports `GNU find version 4.2.27 Features enabled: D_TYPE O_NOFOLLOW(enabled) LEAF_OPTIMISATION SELINUX` – jwa Aug 22 '13 at 10:16

4 Answers

Maybe pipe the find output into a loop of the test command which will let you use an "older-than" test:

find ... | while IFS= read -r file
do
  # -ot: true if the left-hand file has an older modification time
  [ "$file" -ot deployment_metadata.txt ] && printf '%s\n' "$file"
done
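
A variant of the same idea that avoids parsing find's output line by line might look like the sketch below. It still depends on the non-POSIX `-ot` operator of `[`, which is common in practice (bash has it, for example) but not guaranteed. `find_older_ot` is a hypothetical helper name.

```shell
# Hypothetical helper: print paths under "." that are strictly older than REF.
# Pathnames are handed to sh as arguments, so unusual names survive intact.
find_older_ot() {
    find . -exec sh -c '
        ref=$1; shift                     # first argument is the reference file
        for f do
            if [ "$f" -ot "$ref" ]; then  # -ot is an extension to POSIX test
                printf "%s\n" "$f"
            fi
        done
    ' find-sh "$1" {} +
}
```

Usage would be `find_older_ot deployment_metadata.txt`.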
njd

One way is to (ab)use epoch time. Here is a test run where I first create seven files in sequence in an empty directory, where the c# files get "the same" ctime as far as find will be concerned:

$ for i in a b "c1 c2 c3" d e; do touch $i; sleep 1; done
$ find -newer c2
.
./d
./e
$ find -not -newer c2
./c3
./c2
./a
./b
./c1
$ find -newerct @$(($(stat -c %Z c2)-1))
.
./c3
./d
./c2
./e
./c1
$ find -not -newerct @$(($(stat -c %Z c2)-1))
./a
./b

This should represent all possible sets of ctime relative to c2:

  1. ctime > c2
  2. ctime ≤ c2
  3. ctime ≥ c2
  4. ctime < c2

with somewhat fuzzy matching, at least.

The third command gets the epoch ctime for the file c2, subtracts 1 via shell arithmetic and feeds this as the reference to -newerct (the @ is needed for find to interpret it as such a timestamp) to find all files with a ctime newer than this interpreted timestamp (see -newerXY in man find). The fourth command negates this match, and should in practice do what you want if I've understood the question correctly, if you put your reference file as c2 in my example.

Note that the "1 second" offset is somewhat arbitrary (which is what I meant by "fuzzy matching"), and one could imagine a situation where a bug could be constructed. However, timestamps of files are not "definite" anyway and can not be trusted to be, so I can't imagine it to generate either security or practical problems in real situations.

Actually, in practice you might even want to increase the 1 second offset (I see in your question that you use 1 minute right now), but that is an implementation detail.
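
Applied to the reference file from the question, the approach might be sketched as follows. Note the swap: this sketch compares modification times (`-newermt` with `stat -c %Y`) rather than the ctime used in the test run above, since the question is about mtime. It assumes GNU `stat` and a find that supports `-newerXY`; `find_older_epoch` is a hypothetical helper name.

```shell
# Hypothetical helper: print paths under "." whose mtime is not newer than
# REF's mtime (in whole seconds) minus one second.
find_older_epoch() {
    ref_mtime=$(stat -c %Y "$1") || return 1   # seconds since the epoch (GNU stat)
    # The leading @ makes find read the number as an epoch timestamp.
    find . ! -newermt "@$(( ref_mtime - 1 ))"
}
```

Usage would be `find_older_epoch deployment_metadata.txt`; the 1 could be raised to widen the offset.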

Daniel Andersson

I wanted to find all files older than an existing file, and following the accepted solution:

find . -type f -not -newer spec-file

This includes 'spec-file' itself in the results, so it is not safe to use as-is for removing the matched files.

I used the following to find the newest file I wanted to remove:

find . -type f -not -newer spec-file -printf '%T+  %p\n' | sort | tail -3

I then used the second-to-last file from that list as the reference for removing them:

find . -type f -not -newer result-file -exec rm {} +
codeDr

Preliminary note

The purpose of this answer is to provide code that is strict and portable (for comparison: find … -newerct … from this answer and [ … -ot … from this answer are not portable).


Basic solution

For every file that passes your ! -newer deployment_metadata.txt, check if deployment_metadata.txt is -newer:

find . ! -newer deployment_metadata.txt -exec sh -c '
   [ -n "$(find deployment_metadata.txt -prune -newer "$1")" ]
' find-sh {} \; -print

Notes

  • find-sh is explained here: What is the second sh in sh -c 'some shell code' sh?

  • -prune is in case deployment_metadata.txt is a directory.

  • [ -n "$(find …)" ] converts the non-empty or empty output from the inner find into exit status 0 or 1 respectively; this becomes the exit status of sh, and then -exec of the outer find evaluates as true or false respectively.

  • We want to know whether the resulting string is empty; it can only be empty or be exactly the name of the reference file (deployment_metadata.txt in our case). Remember that $(…) strips all trailing newlines, so if the name of the reference file consisted of newline characters only, our test would not be able to tell the two possibilities apart. To solve this problem you should supply a broader path, e.g. ./…. The problem occurs only for a basename consisting of nothing but newline character(s); in practice you don't need to care, unless you deliberately use such a name instead of deployment_metadata.txt.

  • In [, -n is the default. I used -n explicitly in case the name of the reference file starts with - and could be interpreted by [ as an option. The point is that not every implementation of [ is smart enough to recognize the default case early by the fact there is exactly one argument before ]. The name of your reference file is deployment_metadata.txt and it's safe anyway, but if you ever want to use -whatever then with the explicit -n it should still work.

  • We need a shell to make $(…) work. There will be one sh and one find for every file that passes your original test. Creating a new process is relatively slow, so the solution may perform poorly.

  • Our -exec is enough to test what we want. This means ! -newer deployment_metadata.txt is not strictly needed. It's useful though, it improves the performance. Each time ! -newer evaluates to false, our costly -exec is not evaluated. Without this preliminary test the -exec would be evaluated for every file tested by the outer find.

  • You can add more tests/actions before our -exec or/and directly before (or instead of) -print. Our whole -exec … \; is equivalent to hypothetical -older deployment_metadata.txt you'd like to have. This is the beauty of find: with -exec you can build virtually any test.
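
To see the core test in isolation, it could be wrapped in a small function like the sketch below. `is_strictly_older` is a hypothetical name, not something from the answer; the body is the same inner find with the two pathnames as parameters.

```shell
# Hypothetical helper: exit status 0 only if FILE is strictly older than REF,
# i.e. if REF is -newer than FILE. Uses only POSIX find primaries.
# Caveat: a REF starting with "-" would be parsed by find as part of the
# expression, so pass such a name as ./REF.
is_strictly_older() {
    file=$1 ref=$2
    # The inner find prints REF only when REF is newer than FILE;
    # [ -n ... ] turns "printed something" into exit status 0.
    [ -n "$(find "$ref" -prune -newer "$file")" ]
}
```

For example, `is_strictly_older some_file deployment_metadata.txt && echo older` would print "older" only when some_file has a strictly older modification time; a file with an identical timestamp does not pass.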


Possible improvement

If all you want is to -print the result, the method can be optimized slightly:

find . ! -newer deployment_metadata.txt -exec sh -c '
   for f do
      [ -n "$(find deployment_metadata.txt -prune -newer "$f")" ] \
      && printf "%s\n" "$f"
   done
' find-sh {} +

In this approach one sh serves multiple pathnames supplied by the outer find. find may still run several sh processes in sequence if passing all the pathnames at once would trigger an "argument list too long" error; find is that smart. This way we lower the number of sh processes. We added printf, but printf is a builtin in virtually any implementation of sh, so it runs as part of the sh process, not as a separate process.

You can still add more find-specific tests before our -exec, but not after (well, technically you can, but this won't work as expected because -exec … + always evaluates as true). Properly crafted shell code just before (or instead of) printf may work as further tests, but because it will probably involve spawning additional processes, it will defeat the purpose.
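
The difference between the two forms of -exec as a test can be seen with a deliberately failing command. This is only a small demonstration with hypothetical function names:

```shell
# With "+", -exec always evaluates as true, so -print fires for every path
# even though the shell snippet exits with status 1.
demo_exec_plus() {
    # "|| :" only hides find's own non-zero exit status, which is caused by
    # the failing command.
    find . -exec sh -c 'exit 1' find-sh {} + -print || :
}

# With "\;", -exec evaluates as false when the command fails,
# so -print is never reached.
demo_exec_semicolon() {
    find . -exec sh -c 'exit 1' find-sh {} \; -print
}
```

Run in any directory, `demo_exec_plus` lists every path while `demo_exec_semicolon` lists nothing.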

Kamil Maciorowski