
I am not sure how to run grep when I have 500k JSON files. It worked when I had 200k, but now it seems I have too many files. Is there a way to do this with grep, or with any other tool?

[jalal@ivcgpu1 tweets]$ grep -wirnE 'Wed Oct 19 2(1:[0-5][0-9]:[0-5][0-9]|2:([0-2][0-9]:[0-5][0-9]|30:00)) .* 2016' *
-bash: /usr/bin/grep: Argument list too long
[jalal@ivcgpu1 tweets]$ ls -1 | wc -l
554472
Mona Jalal

1 Answer


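When you use an asterisk on the command line, your shell expands it before the application ever sees it. If that asterisk matches 554,472 files, you are actually passing 554,472 arguments to grep. Passing a lot of arguments is not a problem in itself, but the kernel caps the total size, in bytes, of the argument list plus the environment for a single command (ARG_MAX, which you can check with getconf ARG_MAX). Because the cap is on bytes rather than on the number of files, that would explain why 200k file names still fit and 554,472 do not.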
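To see the limit behind "Argument list too long" on a given machine, something like the following should work. It is only a sketch: the variable names arg_max and env_bytes are made up for illustration, and the exact usable headroom also depends on details not measured here.

```shell
# Upper bound, in bytes, on the combined size of the arguments and the
# environment that a single command may be started with.
arg_max=$(getconf ARG_MAX)
echo "ARG_MAX: $arg_max bytes"

# Approximate size of the current environment, which counts against that bound.
env_bytes=$(env | wc -c | tr -d ' ')
echo "environment: $env_bytes bytes"
```

Comparing ARG_MAX with the total length of the file names that * expands to gives a rough idea of how close a command is to the limit.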

Since you're already using -r (recursive), could you rewrite the grep call to specify only the directory you want to search?

# recursive
grep -R <options> <pattern> <directory>

For instance in your case you could go:

grep -wirnE \
  'Wed Oct 19 2(1:[0-5][0-9]:[0-5][0-9]|2:([0-2][0-9]:[0-5][0-9]|30:00)) .* 2016' .

(* changed to .).

That way, instead of grep being handed a list of hundreds of thousands of files, it's just given one directory, and it uses its recursive processing to find the files itself.
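If you need to restrict the search to certain file names, another option is to let find collect the files and hand them to grep in batches. The sketch below is self-contained so it can be tried safely: the temporary directory and the two sample JSON lines are invented for the demonstration and merely assume a created_at field in the usual tweet date format. In the real case you would point find at the tweets directory instead.

```shell
# Hypothetical demo data; replace "$dir" with the real directory in practice.
dir=$(mktemp -d)
printf '%s\n' '{"created_at": "Wed Oct 19 21:15:42 +0000 2016"}' > "$dir/a.json"
printf '%s\n' '{"created_at": "Wed Oct 19 23:01:00 +0000 2016"}' > "$dir/b.json"

# Same pattern as in the question.
pattern='Wed Oct 19 2(1:[0-5][0-9]:[0-5][0-9]|2:([0-2][0-9]:[0-5][0-9]|30:00)) .* 2016'

# find builds the file list itself and, because of the trailing "+",
# runs grep with as many file names per invocation as will fit.
# -H forces the file name to be printed even if a batch holds one file.
matches=$(find "$dir" -type f -name '*.json' -exec grep -wiHnE "$pattern" {} +)
echo "$matches"

# Remove the demo directory again.
rm -rf "$dir"
```

Here only a.json is reported, since 23:01:00 falls outside the 21:00:00 to 22:30:00 window the pattern describes. Ending -exec with + rather than \; is what keeps each grep invocation under the argument limit without starting one grep per file.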

thomasrutter
Can you please have a look at this: https://askubuntu.com/questions/996335/searching-for-specialized-patterns-using-grep-in-a-json-file – Mona Jalal, Jan 15 '18 at 23:27