7

I have about 17k files in a directory. When I run ls directory, I have to wait for about 15-20 seconds before the results are displayed. On the other hand, when I run ls directory | wc -l or ls directory | grep .xyz, the results are displayed immediately.

Why does this happen and is there a way to fix this?

tytywin

4 Answers

11

I'm going to guess that you're using Linux.

  1. If your ls command is aliased so that it shows files and folders in colour, then it needs to find out each item's permissions (a stat() call) and whether it has any "file capabilities" set (a getxattr() call) in order to choose the right colour. Depending on the file system, these calls can be fairly slow if the required metadata hasn't been cached in RAM yet. [Extended attributes often live in the data area, so each getxattr() results in HDD seeks.]

    On the other hand, when its output is redirected to a pipe, ls automatically disables colouring, so it no longer needs to do any extra checks: just a straightforward readdir() loop which returns the file name and type, and the kernel likely even implements read-ahead for that.

  2. nonsense

Use strace or perf trace to check which system calls, if any, are taking a long time.
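
If you would rather measure the effect from point 1 than trace it, here is a rough Python sketch that times a names-only listing against one that also fetches per-entry metadata. The function names are made up for illustration, the getxattr() part only runs where Python exposes it (Linux), and the numbers only mean much when the metadata is not already cached.

```python
import os
import time


def time_names_only(path):
    """Read and sort the names only, roughly what an uncoloured ls does."""
    start = time.perf_counter()
    names = sorted(os.listdir(path))
    return len(names), time.perf_counter() - start


def time_with_metadata(path):
    """Also fetch per-entry metadata, roughly what a coloured ls needs."""
    start = time.perf_counter()
    names = sorted(os.listdir(path))
    for name in names:
        full = os.path.join(path, name)
        os.lstat(full)  # permissions and file type
        if hasattr(os, "getxattr"):  # only available on Linux
            try:
                # File capabilities are stored in this extended attribute.
                os.getxattr(full, "security.capability", follow_symlinks=False)
            except OSError:
                pass  # attribute not set, or not supported here
    return len(names), time.perf_counter() - start
```

Call both functions on the slow directory and compare the elapsed times. If the second one is much slower on the first run and about equal on a repeat run, the per-file metadata lookups are the likely culprit.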

u1686_grawity
  • But `ls directory | wc -l` won’t produce any output until after the `ls` completes.   And, even when writing to something other than a terminal, `ls` *sorts* the directory by default, so it still can’t output anything before it has read the entire directory. – Scott - Слава Україні Jul 28 '19 at 18:50
  • 1
    Yes, the 2nd part of the post is mostly nonsense. The main difference comes from needing to call lstat()/getxattr(). – u1686_grawity Jul 28 '19 at 19:24
0

Two things:

  1. If you run ls first and ls | wc -l later, it's possible the former will read from your HDD and the latter will read cached data. If so, the first ls "stalls" and prints nothing for a few seconds, while any later ls starts printing almost immediately, as long as the cached data is still there. If you had started with ls | wc -l instead, that command would have been the one waiting for the HDD to supply data.
  2. Every terminal works at its own speed. Formally, stty speed will show you some value, but I think it doesn't matter for a virtual terminal. Still, displaying characters and scrolling takes time (see this question). Passing the same data through a pipe is faster.
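
To check how much of the delay is the terminal itself (point 2), you can time only the writing step. The sketch below is a hypothetical helper, not part of any tool; time_output is a name I made up.

```python
import time


def time_output(names, stream):
    """Write one name per line to stream and return the seconds it took."""
    start = time.perf_counter()
    for name in names:
        stream.write(name + "\n")
    stream.flush()  # make sure everything has actually been handed over
    return time.perf_counter() - start
```

Pass it the result of os.listdir() and sys.stdout, print the returned time to sys.stderr, and run the script once directly in the terminal and once piped into cat. The directory data is read before the timer starts, so any difference between the two runs comes from the output side.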
Kamil Maciorowski
  • You suggest that *writing to the display* is a significant factor in the timing.  This can be tested with some experiments: Run the following commands: ``ls -C directory > tmp`` (adding ``--color=always`` if `ls` is normally aliased to use `--color`), ``cat tmp``, and ``ls directory | cat``, and look at how long they take. – Scott - Слава Україні Jul 28 '19 at 18:50
  • @Scott I suggest it can be, especially if one uses like `/dev/tty2`. I have a directory with 200k files and I don't even need to test thoroughly. In `/dev/tty2` the command scrolls and scrolls and scrolls… And in `konsole` it prints in no time. Straightforward `time ls` yielded 30s and 1.4s respectively; for `ls | wc -l` this was 0,5s in both terminals. All tests were done after initial `ls` that actually obtained data from my not-so-fast HDD and allowed the OS to cache it. The results are repeatable. – Kamil Maciorowski Jul 30 '19 at 20:03
0

If you want a faster alternative to ls that only distinguishes directories from files, try this:

You can create a simple executable with name ls_fast or whatever name you prefer in ~/.local/bin with the following content:

#!/usr/bin/env python3
import os
import sys

import colorama  # third-party package: pip install colorama

# Default to the current directory when no argument is given.
target = sys.argv[1] if len(sys.argv) > 1 else "."

for name in os.listdir(target):
    # os.path.isdir() makes one stat() call per entry.
    if os.path.isdir(os.path.join(target, name)):
        print(colorama.Fore.BLUE + name + colorama.Fore.RESET)
    else:
        print(name)

The output might not look very good, because it uses only two colours (blue for directories and the default colour for files), but it should run much faster than ls.

Now after writing the above file change its mode to executable:

chmod +x ~/.local/bin/ls_fast

or whatever you named it. Now restart the terminal and, provided ~/.local/bin is on your PATH, you should have a simple command named ls_fast. It does not have many features, but it works.
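
One caveat: the script above calls os.path.isdir() for every name, which costs a stat() per entry. A variant based on os.scandir() can usually take the entry type straight from the directory listing, so it needs an extra system call only for symlinks or on file systems that do not report the type. This is a sketch of that idea, not a drop-in ls replacement; list_entries and format_entry are illustrative names, and it uses plain ANSI escape codes instead of colorama.

```python
import os

BLUE = "\033[34m"   # ANSI escape: blue foreground
RESET = "\033[0m"   # ANSI escape: reset attributes


def list_entries(path="."):
    """Return (name, is_directory) pairs sorted by name."""
    with os.scandir(path) as it:
        entries = sorted(it, key=lambda entry: entry.name)
        return [(entry.name, entry.is_dir()) for entry in entries]


def format_entry(name, is_directory):
    """Colour directory names blue and leave other names unchanged."""
    return f"{BLUE}{name}{RESET}" if is_directory else name


def print_listing(path="."):
    """Print one entry per line."""
    for name, is_directory in list_entries(path):
        print(format_entry(name, is_directory))
```

Calling print_listing() with the slow directory as its argument gives the same two-colour output as the script above.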

AmaanK
-1

I think the question of why it happens has already been answered. A quick hack to get around the problem is using the command:

python -c "import os; print(os.listdir('.'))"

This doesn't do all the fancy extra stuff ls does; it just prints the file names to the screen. For slightly nicer output, you could use something like:

python -c "import os; print('\n'.join(sorted(os.listdir('.'))))"
Matt Ellis