4

Since I don't have a copier or scanner, I'm using an 8 megapixel camera to copy documents. This works pretty well, except that the photos need a lot of processing afterward. I'd like to get from a photo to a bitmap, but using

djpeg -grayscale -pnm photo.jpg |
pgmtopbm -threshold -value XXX

does not work so well, for two reasons:

  1. It's hard to guess what XXX should be, and XXX is different for different photos.

  2. Illumination varies, and sometimes a single threshold isn't what's right for the image.

How can I do better? The ideal solution will be a fully automatic command-line program that I can run on Linux. (I have already written a program to remove dark pixels from the edges of images.)

NOTE: I really want a bitmap, that is, just black and white pixels. No grayscale, no dithering.

Norman Ramsey
  • Similar question here: http://superuser.com/questions/107313/software-to-clean-up-photos-of-whiteboards-and-documents/ – Simon E. Aug 02 '10 at 04:02
  • http://unix.stackexchange.com/questions/108613/how-do-you-binarize-a-colored-image || Requiring ImageMagick: http://superuser.com/questions/405686/how-to-convert-a-photo-to-a-black-and-white-image-by-imagemagick – Ciro Santilli OurBigBook.com Sep 26 '15 at 08:21

5 Answers

4

-monochrome

This option uses some smart dithering and produces output in which the subject stays clearly recognizable:

convert -monochrome in.png out.png

Documentation: http://www.imagemagick.org/Usage/quantize/#monochrome

Compare that to a simpler -threshold 50 transform (without a % sign, ImageMagick reads 50 as an absolute channel value rather than as 50%):

convert -threshold 50 in.png out.png

which loses most of the image.

Concrete example from: https://www.nasa.gov/mission_pages/galex/pia15416.html

wget -O orig.jpg http://www.nasa.gov/images/content/650137main_pia15416b-43_full.jpg
# Downsize to 400 height to have a reasonable file size for upload here.
convert orig.jpg -resize x400 in.jpg
convert -monochrome in.jpg out.jpg
convert -threshold 50 in.jpg threshold-50.jpg

[Images omitted: in.jpg (the resized input), out.jpg (the -monochrome result), threshold-50.jpg (the -threshold 50 result).]

Tested in Ubuntu 19.10, ImageMagick 6.9.10.

3

The best thing I've found in three years is the mkbitmap program that ships with potrace.
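
As a rough sketch of how mkbitmap might slot into the pipeline from the question: it reads a PNM image on standard input when no file name is given, applies a highpass filter (which is what helps with uneven illumination), scales the image up, and only then thresholds. The helper name photo_to_pbm is hypothetical, and the numeric values are only illustrative starting points that would likely need tuning for a given camera and lighting.

```shell
# Hypothetical helper (not part of potrace): photo_to_pbm INPUT.jpg OUTPUT.pbm
photo_to_pbm() {
    # djpeg decodes the JPEG to a grayscale PGM on stdout.
    # mkbitmap then:
    #   -f 4     highpass-filters with a 4-pixel radius to flatten slow
    #            brightness changes across the page
    #   -s 2     scales the image up by a factor of 2 before thresholding
    #   -t 0.48  thresholds at 0.48 on a 0..1 scale (illustrative value)
    # The result is a pure black-and-white PBM written to the second argument.
    djpeg -grayscale -pnm "$1" | mkbitmap -f 4 -s 2 -t 0.48 > "$2"
}
```

It could then be called as photo_to_pbm photo.jpg photo.pbm. Because the highpass filter runs before the threshold, one fixed -t value has a better chance of working across photos than a bare pgmtopbm threshold does.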

Norman Ramsey
0

Apparently, GIMP supports some command-line batch processing. You might be able to give that a shot, since desaturating will probably behave like you'd expect with varying brightness in your images.

keithjgrant
0

Check out your camera. Many modern digital cameras have the ability to take B&W photos directly.

Gcoupe
0

Converting to grayscale / desaturating will preserve most of the noise too. The GIMP has a Threshold filter (under the Color menu) that eliminates the noise, and works very well for line-art and plain black scanned text.

I'm not too clued up on the batch scripting myself, but it sounds like a good idea to use the Threshold with it.

Edit: Since you have Linux as a tag, have a look at Phatch, a batch photo-manipulation tool. It has filters to adjust the contrast and brightness too. It's in the Ubuntu repos (if you use that distro).

invert
  • OK, I checked out Threshold, and it does exactly what `pgmtopbm` does. If I wanted to adjust each page by hand, it would be great, but I really don't. And it completely doesn't solve the problem that I really need different thresholds in different parts of the image. Still, yours was the answer that most closely identified what GIMP can and can't do, so +1. P.S. It took me several *minutes* to find the thing among the goddamned menus. – Norman Ramsey Nov 27 '09 at 22:48
  • Apart from eyeballing the image, I can't say how to calculate the threshold values per image. Wow I'm stumped. Perhaps auto-adjusting the light levels first will put all images on the 'same level', and a common threshold value will then work? – invert Nov 30 '09 at 11:11
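
One standard way to calculate a threshold per image without eyeballing it is Otsu's method, which picks the gray level that best separates the histogram into a dark class and a light class. The sketch below is an illustration rather than anything proposed in this thread: otsu_threshold is a hypothetical helper, it assumes a plain (ASCII, "P2") PGM on standard input, and it still yields only one global threshold per image, so it does not address uneven illumination.

```shell
# Hypothetical helper: read a plain (ASCII "P2") PGM on stdin and print an
# Otsu threshold as a fraction of maxval, on the 0..1 scale.
otsu_threshold() {
    awk '
    { sub(/#.*/, "") }                    # strip PNM comments
    {
        for (i = 1; i <= NF; i++) {
            tok++
            if (tok == 1 && $i != "P2") { bad = 1; exit }
            if (tok == 4) maxval = $i + 0  # tokens 2 and 3 are width, height
            if (tok > 4) { hist[$i + 0]++; total++; sum += $i }
        }
    }
    END {
        if (bad || total == 0 || maxval == 0) exit 1
        best = -1; t = 0
        for (v = 0; v <= maxval; v++) {
            w0 += hist[v]                 # pixels at or below level v
            s0 += v * hist[v]             # sum of their gray values
            w1 = total - w0               # pixels above level v
            if (w0 == 0 || w1 == 0) continue
            d = s0 / w0 - (sum - s0) / w1 # difference of the class means
            between = w0 * w1 * d * d     # between-class variance (scaled)
            if (between > best) { best = between; t = v }
        }
        # Place the threshold halfway between level t and level t + 1.
        printf "%.3f\n", (t + 0.5) / maxval
    }'
}
```

Assuming Netpbm's pnmtoplainpnm is available to produce the ASCII form, the printed value could be captured with t=$(djpeg -grayscale -pnm photo.jpg | pnmtoplainpnm | otsu_threshold) and then passed to pgmtopbm -threshold -value "$t" in place of XXX.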