99

I'm looking for a shell one-liner to find the oldest file in a directory tree.

Marius Gedminas
  • By combining answers from this and other questions, I am using this: `ls -ltd $(find . -type f) | tail -1` – meolic Oct 01 '20 at 13:13

8 Answers

96

This works (updated to incorporate Daniel Andersson's suggestion):

find -type f -printf '%T+ %p\n' | sort | head -n 1
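As a quick sanity check, the one-liner can be exercised against a throwaway tree (a sketch assuming GNU find and touch, which support -printf and -d; the directory and file names are made up for the demo):

```shell
#!/bin/sh
# Build a tiny tree with known mtimes and confirm the oldest file sorts first.
dir=$(mktemp -d)
mkdir "$dir/sub"
touch -d '2001-01-01' "$dir/sub/old.txt"   # oldest modification time
touch -d '2020-06-15' "$dir/new.txt"
oldest=$(cd "$dir" && find -type f -printf '%T+ %p\n' | sort | head -n 1)
echo "$oldest"
rm -rf "$dir"
```

The %T+ timestamp sorts lexically, which is why a plain sort suffices here.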
Marius Gedminas
  • Less typing: `find -type f -printf '%T+ %p\n' | sort | head -1` – Daniel Andersson Feb 15 '13 at 19:37
  • I get an empty first line from this `find` because one of my file names contains a newline. – 林果皞 Apr 19 '16 at 12:50
  • Can I ask if this uses the created or modification date? – MrMesees Nov 27 '16 at 13:07
  • Linux doesn't store the file creation date anywhere[*]. This uses the modification date. ([*] This is actually not quite true: ext4 stores the inode creation date, but it's not exposed via any system calls and you need debugfs to see it.) – Marius Gedminas Nov 28 '16 at 07:14
  • I get a useless answer with this one with the head limit set to 1 because a whole bunch of flatpak files have the modification date set to zero, so I'm getting tens of thousands of files with Jan 1, 1970. Then I get a few thousand more for 1980-01-01 from google cloud sdk and docker. Then thousands more from 1985-10-26 from nodejs. Then a bunch from over a decade ago from various git repos and steam files (cloning source/build files takes remote timestamps). Then I get my actual files around line 150k – theferrit32 Jan 03 '20 at 22:13
22

This one's a little more portable because it doesn't rely on the GNU find extension -printf, so it works on BSD / OS X as well:

find . -type f -print0 | xargs -0 ls -ltr | head -n 1

The only downside here is that it's somewhat limited by the size of ARG_MAX (which should be irrelevant for most newer kernels). So, if there are more than getconf ARG_MAX characters returned (262,144 on my system), it doesn't give you the correct result. It's also not POSIX-compliant, because -print0 and xargs -0 aren't.

Some more solutions to this problem are outlined here: How can I find the latest (newest, earliest, oldest) file in a directory? – Greg's Wiki
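The batching pitfall discussed in the comments below can be simulated by telling xargs to pass one file per ls invocation, which is effectively what happens once the argument list exceeds ARG_MAX (a sketch assuming GNU tools; -n 1 just forces the split artificially):

```shell
#!/bin/sh
# Force xargs to run ls once per file: each invocation sorts only its own
# batch, so the combined output is concatenated, not globally sorted.
dir=$(mktemp -d)
touch -t 199901010000 "$dir/ancient"
touch -t 202101010000 "$dir/recent"
lines=$(find "$dir" -type f -print0 | xargs -0 -n 1 ls -ltr | wc -l)
echo "ls printed $lines separately-sorted batches of 1 file each"
rm -rf "$dir"
```

With real trees the split happens silently, which is exactly why the failure mode is hard to notice.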

slhck
  • This works too, but it also emits an `xargs: ls: terminated by signal 13` error as a side effect. I'm guessing that's SIGPIPE. I've no idea why I don't get a similar error when I pipe sort's output to head in my solution. – Marius Gedminas Feb 15 '13 at 16:29
  • Your version is also easier to type from memory. :-) – Marius Gedminas Feb 15 '13 at 16:29
  • Yes, that's a broken pipe. I don't get this with both GNU and BSD versions of all those commands, but it's the `head` command that quits once it has read a line and thus "breaks" the pipe, I think. You don't get the error because `sort` doesn't seem to complain about it, but `ls` does in the other case. – slhck Feb 15 '13 at 16:32
  • This breaks if there are so many filenames that `xargs` needs to invoke `ls` more than once. In that case, the sorted outputs of those multiple invocations end up concatenated when they should be merged. – Nicole Hamilton Feb 15 '13 at 17:00
  • @Nicole You're right, and I was implying this with my hint to `ARG_MAX`, because this is the number of files that can be passed, e.g. 262144 on my OS X. (Maybe I should have been more explicit on this?) – slhck Feb 15 '13 at 17:09
  • I think this is worse than posting a script that assumes filenames never contain spaces. A lot of the time, those will work because the filenames don't have spaces. And when they fail, you get an error. But this is unlikely to work in real cases and failure will go undiscovered. On any directory tree big enough that you can't just `ls` it and eyeball the oldest file, your solution probably _will_ overrun the command line length limit, causing `ls` to be invoked multiple times. You'll get the wrong answer but you'll never know. – Nicole Hamilton Feb 15 '13 at 17:15
  • Well, if in the real case there are less than `ARG_MAX` files, it works. If not, then not, and I explicitly mentioned this as a drawback when posting the answer. I made that point a little clearer though. (Also upvoted your comment.) The main reason I posted this is that `printf` doesn't exist in non-GNU `find` and therefore an alternative is needed. – slhck Feb 15 '13 at 17:22
  • If it even failed with an error, I would be okay with this. But it fails silently. That's unacceptable. – Nicole Hamilton Feb 15 '13 at 17:27
  • It's also worth clarifying that this fails if the sum of the lengths of the filenames _in characters_ is greater than `ARG_MAX`, not if there are more than `ARG_MAX` files. – Nicole Hamilton Feb 15 '13 at 18:00
  • @Nicole You're right, I missed that. – slhck Feb 15 '13 at 19:21
  • I just found out that I never fully understood how xargs works. *ARG_MAX* is 2,097,152 on Ubuntu 12.10, by the way. – Dennis Feb 16 '13 at 04:12
  • On reflection, I don't actually want to seem harsh, so I'm undoing my downvote. But I do think it's important that software should never be designed to fail silently. – Nicole Hamilton Feb 17 '13 at 18:11
  • And an even more portable approach is `find . -type f -exec ls -ltr {} + | head -n1`. No need for `xargs` and `-print0`. This is [supported in POSIX](http://pubs.opengroup.org/onlinepubs/9699919799//utilities/find.html), unlike `-print0`. – Ruslan Sep 09 '18 at 17:30
12

The following commands are guaranteed to work with any kind of strange file names:

find -type f -printf "%T+ %p\0" | sort -z | grep -zom 1 ".*" | cat

find -type f -printf "%T@ %T+ %p\0" | \
    sort -nz | grep -zom 1 ".*" | sed 's/[^ ]* //'

stat -c "%y %n" "$(find -type f -printf "%T@ %p\0" | \
    sort -nz | grep -zom 1 ".*" | sed 's/[^ ]* //')"

Using a null byte (\0) instead of a linefeed character (\n) makes sure the output of find will still be understandable in case one of the file names contains a linefeed character.

The -z switch makes both sort and grep interpret only null bytes as end-of-line characters. Since there's no such switch for head, we use grep -m 1 instead (only one occurrence).

The commands are ordered by execution time (measured on my machine).

  • The first command will be the slowest since it has to convert every file's mtime into a human readable format first and then sort those strings. Piping to cat avoids coloring the output.

  • The second command is slightly faster. While it still performs the date conversion, numerically sorting (sort -n) the seconds elapsed since Unix epoch is a little quicker. sed deletes the seconds since Unix epoch.

  • The last command does no conversion at all and should be significantly faster than the first two. The find command itself will not display the mtime of the oldest file, so stat is needed.
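To see the NUL-terminated variant cope with a hostile name, one can plant a file whose name contains a literal newline (a sketch assuming GNU find, sort -z, and grep -z; the tr at the end strips the trailing NUL so the result fits in a shell variable):

```shell
#!/bin/sh
# A newline inside a file name would break line-based sorting, but the
# NUL-delimited pipeline still returns the oldest file as one record.
dir=$(mktemp -d)
touch -d '2000-01-01' "$dir/plain.txt"
touch -d '1995-05-05' "$dir/with
newline.txt"
oldest=$(cd "$dir" && find -type f -printf '%T+ %p\0' | sort -z | grep -zom 1 '.*' | tr -d '\0')
echo "$oldest"
rm -rf "$dir"
```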

Related man pages: find, grep, sed, sort, stat

Dennis
7

Although the accepted answer and others here do the job, if you have a very large tree, all of them will sort the entire list of files.

Better would be if we could just list them and keep track of the oldest, without the need to sort at all.

That's why I came up with this alternative solution:

ls -lRU $PWD/* | awk 'BEGIN {count=0; oldd=strftime("%Y%m%d"); } { gsub(/-/,"",$6); if (substr($1,1,1)=="/") { pat=substr($1,1,length($0)-1)"/"; }; if( $6 != "") {if ( $6 < oldd ) { oldd=$6; oldf=pat$8; }; print $6, pat$8; count++;}} END { print "Oldest date: ", oldd, "\nFile:", oldf, "\nTotal compared: ", count}'

I hope it might be of some help, even if the question is a bit old.


Edit 1: these changes allow parsing files and directories with spaces. It is fast enough to run it from the root / and find the oldest file ever.

ls -lRU --time-style=long-iso "$PWD"/* | awk 'BEGIN {count=0; oldd=strftime("%Y%m%d"); } { gsub(/-/,"",$6); if (substr($0,1,1)=="/") { pat=substr($0,1,length($0)-1)"/"; $6="" }; if( $6 ~ /^[0-9]+$/) {if ( $6 < oldd ) { oldd=$6; oldf=$8; for(i=9; i<=NF; i++) oldf=oldf $i; oldf=pat oldf; }; count++;}} END { print "Oldest date: ", oldd, "\nFile:", oldf, "\nTotal compared: ", count}'

Command explained:

  • ls -lRU --time-style=long-iso "$PWD"/* lists all files (*) in long format (l), recursively (R), without sorting (U) for speed, and pipes the result to awk
  • awk then BEGINs by zeroing the counter (optional for this question) and setting the oldest date oldd to today, in YearMonthDay format.
  • The main loop:
    • Grabs the 6th field, the date, in Year-Month-Day format, and changes it to YearMonthDay (if your ls doesn't output dates this way, you may need to fine-tune this).
    • Because the listing is recursive, there are header lines for every directory, of the form /directory/here:. Each such line is captured into the pat variable (substituting the trailing ":" with "/"), and $6 is set to empty so the header line isn't treated as a valid file line.
    • If field $6 holds a valid number, it's a date; it's compared with the oldest date oldd so far.
    • Is it older? Then the new values for the oldest date oldd and oldest filename oldf are saved. Note that oldf is not only the 8th field but everything from the 8th field to the end, hence the loop concatenating fields 8 through NF.
    • The counter advances by one.
  • END prints the result.

Running it:

$ time ls -lRU "$PWD"/* | awk ... etc.

Oldest date:  19691231
File: /home/.../.../backupold/.../EXAMPLES/how-to-program.txt
Total compared:  111438
real    0m1.135s
user    0m0.872s
sys     0m0.760s

Edit 2: same concept, a better solution using find to look at the access time (use %T with the first printf for modification time or %C for status change instead).

find . -wholename "*" -type f -printf "%AY%Am%Ad %h/%f\n" | awk 'BEGIN {count=0; oldd=strftime("%Y%m%d"); } { if ($1 < oldd) { oldd=$1; oldf=$2; for(i=3; i<=NF; i++) oldf=oldf " " $i; }; count++; } END { print "Oldest date: ", oldd, "\nFile:", oldf, "\nTotal compared: ", count}'

Edit 3: the command below uses modification time and also prints incremental progress as it finds older and older files, which is useful when you have some incorrect timestamps (like 1970-01-01):

find . -wholename "*" -type f -printf "%TY%Tm%Td %h/%f\n" | awk 'BEGIN {count=0; oldd=strftime("%Y%m%d"); } { if ($1 < oldd) { oldd=$1; oldf=$2; for(i=3; i<=NF; i++) oldf=oldf " " $i; print oldd " " oldf; }; count++; } END { print "Oldest date: ", oldd, "\nFile:", oldf, "\nTotal compared: ", count}'
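Under the same assumptions (GNU find; names may contain spaces but not newlines), the single-pass idea boils down to a running minimum in awk over raw epoch timestamps, with no sort at all, e.g.:

```shell
#!/bin/sh
# One pass: compare raw epoch timestamps and remember the smallest.
dir=$(mktemp -d)
touch -d '1990-03-03' "$dir/oldest file.txt"   # spaces survive
touch -d '2015-08-08' "$dir/newer.txt"
oldest=$(find "$dir" -type f -printf '%T@ %p\n' |
    awk 'min == "" || $1 < min { min = $1; sub(/^[^ ]+ /, ""); file = $0 }
         END { print file }')
echo "$oldest"
rm -rf "$dir"
```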
Pablo A
DrBeco
  • It still needs tweaking to accept files with spaces. I'll do that soon. – DrBeco Jun 19 '15 at 15:50
  • I think parsing ls for files with spaces isn't a good idea. Maybe using find. – DrBeco Jun 19 '15 at 16:35
  • Just run it in the entire tree "/". Time spent: Total compared: 585744 real 2m14.017s user 0m8.181s sys 0m8.473s – DrBeco Jun 19 '15 at 18:26
  • Using `ls` is bad for scripting as its output is not meant for machines, output formatting varies across implementations. As you already stated `find` is good for scripting but it might also be good to add that info before telling about `ls` solutions. – Sampo Sarrala - codidact.org Jan 06 '16 at 12:34
4

Please use ls - the man page tells you how to sort the directory listing.

ls -cltr | head -n 2

The -n 2 is needed because the first line of the long-format output is the "total" line. If you only want the name of the file:

ls -tr | head -n 1

And if you need the list in the other order (getting the newest file):

ls -t | head -n 1

Much easier than using find, much faster, and more robust - don't have to worry about file naming formats. It should work on nearly all systems too.

user1363990
  • This works only if the files are in a single directory, while my question was about a directory tree. – Marius Gedminas Sep 02 '14 at 06:02
  • Add -Ral to the ls command. There's a number of combinations that work well, and it provides very simple and clean output. See the ls man page (as mentioned). – user1363990 Nov 25 '20 at 03:48
2
find ! -type d -printf "%T@ %p\n" | sort -n | head -n1
Dennis
Okki
  • This won't work properly if there are files older than 9 Sep 2001 (1000000000 seconds since Unix epoch). To enable numeric sorting, use `sort -n`. – Dennis Feb 16 '13 at 03:11
  • This helps find me the file, but it's hard to see how old it is without running a second command :) – Marius Gedminas Feb 16 '13 at 09:38
0

It seems that by "oldest" most people have assumed that you meant "oldest modification time." That's probably correct, according to the strictest interpretation of "oldest", but in case you wanted the one with the oldest access time, I would modify the best answer thus:

find -type f -printf '%A+ %p\n' | sort | head -n 1

Notice the %A+.
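Since the access time is easy to backdate with touch -a, the variant can be checked the same way (a sketch assuming GNU find and touch; note that on relatime mounts real access times are often stale):

```shell
#!/bin/sh
# Backdate one file's access time and confirm %A+ picks it as the oldest.
dir=$(mktemp -d)
touch "$dir/a.txt" "$dir/b.txt"
touch -a -d '2002-02-02' "$dir/b.txt"   # only the atime is moved back
oldest=$(cd "$dir" && find -type f -printf '%A+ %p\n' | sort | head -n 1)
echo "$oldest"
rm -rf "$dir"
```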

PenguinLust
-1
set $(find /search/dirname -type f -printf '%T+ %h/%f\n' | sort | head -n 1) && echo $2
  • find /search/dirname -type f -printf '%T+ %h/%f\n' prints dates and file names in two columns.
  • sort | head -n1 keeps the line corresponding to the oldest file.
  • echo $2 displays the second column, i.e. the file name.
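A small demo of the set trick (a sketch with made-up paths; note that the unquoted $(...) relies on word splitting, so it misbehaves if the winning file name contains spaces):

```shell
#!/bin/sh
# set -- splits "timestamp path" into $1 and $2; $2 is then the file name.
dir=$(mktemp -d)
touch -d '1998-07-07' "$dir/first"
touch -d '2018-07-07' "$dir/second"
set -- $(find "$dir" -type f -printf '%T+ %h/%f\n' | sort | head -n 1)
name=$2
echo "$name"
rm -rf "$dir"
```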
Dmitry Grigoryev
Dima
  • Welcome to Super User! While this may answer the question, it would be a better answer if you could provide some explanation **why** it does so. – DavidPostill Jun 08 '15 at 13:10
  • Note, several people also asked for some explanation of your previous (identical) deleted answer. – DavidPostill Jun 08 '15 at 13:12
  • What is difficult to answer? `find /search/dirname -type f -printf '%T+ %h/%f\n' | sort | head -n 1` shows two columns, the time and the path of the file. It is necessary to remove the first column, using set and echo $2. – Dima Jun 08 '15 at 13:16
  • You should provide explanations instead of just pasting a command line, as requested by several other users. – Ob1lan Jun 08 '15 at 13:20
  • How is this different from the accepted answer? – Ramhound Jun 08 '15 at 14:35
  • @Ramhound this answer caters for only keeping the file name instead of a string containing both timestamp and filename. 'Twas useful for me anyway... – Geert Jul 11 '17 at 20:02