I have over 100,000 PDF files and need to find out which of them are corrupted.

Is there a way to list the files that are corrupted – or, conversely, those that still work – in an automated way, rather than manually examining the files one at a time?

I searched a lot but could not find anything. All the results pointed to software for fixing broken PDFs.

fixer1234
user1917830

1 Answer

You could use something like Ghostscript to read each file and render it to bitmap images that are discarded rather than written to a file (e.g. on Linux, redirect the output to /dev/null). A script could then check the return code and any error messages for each file, and flag the ones that fail.
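As a rough sketch of such a script in Python: it assumes Ghostscript is installed and callable as `gs`, and it swaps the "bitmap to /dev/null" idea for Ghostscript's `nullpage` device, which interprets the file and discards the output. The function names, the timeout, and the rule "non-zero exit status or the word 'error' in the output means corrupt" are my own assumptions, not anything Ghostscript defines, so expect some false positives and negatives.

```python
import subprocess
import sys
from pathlib import Path


def looks_corrupt(returncode: int, output: str) -> bool:
    """Heuristic (an assumption, not a Ghostscript rule): treat a non-zero
    exit status, or the word 'error' anywhere in the output, as a failure."""
    return returncode != 0 or "error" in output.lower()


def check_pdf(path: Path, gs: str = "gs", timeout: float = 120.0) -> bool:
    """Return True if Ghostscript reads `path` without complaint."""
    # -sDEVICE=nullpage interprets the file but throws the pages away,
    # so nothing is written to disk.
    cmd = [gs, "-q", "-dNOPAUSE", "-dBATCH", "-sDEVICE=nullpage", str(path)]
    try:
        proc = subprocess.run(
            cmd,
            stdin=subprocess.DEVNULL,
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,  # merge messages into one stream
            text=True,
            errors="replace",
            timeout=timeout,
        )
    except subprocess.TimeoutExpired:
        # A file that hangs the interpreter is reported as suspect too.
        return False
    return not looks_corrupt(proc.returncode, proc.stdout)


def main(root: str) -> None:
    # Print the path of every PDF under `root` that fails the check.
    for pdf in sorted(Path(root).rglob("*.pdf")):
        if not check_pdf(pdf):
            print(pdf)


if __name__ == "__main__":
    if len(sys.argv) == 2:
        main(sys.argv[1])
    else:
        print("usage: python check_pdfs.py DIRECTORY", file=sys.stderr)
```

Saved under a name such as `check_pdfs.py` (hypothetical), it could be run as `python check_pdfs.py /path/to/pdfs > corrupted.txt`. Checking the output text as well as the return code matters because Ghostscript often tries to repair a damaged file and carries on, so the exit status alone may not tell you much. Lines that only say "Warning" are not flagged by this sketch. With 100,000 files you would probably want to run several checks in parallel, for example with `concurrent.futures`.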

RedGrittyBrick