Similarly to this question:

Convert a PDF to greyscale on the command line in FLOSS?

I have a PDF document and want to convert it to pure black and white, discarding halftones. To convert to grayscale with Ghostscript I can use this command:

gs \
 -sOutputFile=output.PDF \
 -sDEVICE=pdfwrite \
 -sColorConversionStrategy=Gray \
 -dProcessColorModel=/DeviceGray \
 -dCompatibilityLevel=1.4 \
  input.PDF < /dev/null

What do I have to change to get monochrome output, i.e. only the colors black and white and no halftones?

niklasfi

10 Answers

8

The last suggestion indeed only converts to grayscale, and even then only works if the underlying document uses setrgbcolor. This did not work for me, since my document used setcolor.

I had success with redefining setcolor to always set the color to 0,0,0:

gs -o <output-file.pdf> -sDEVICE=pdfwrite \
-c "/osetcolor {/setcolor} bind def /setcolor {pop [0 0 0] osetcolor} def" \
-f <input-file.ps>

It has been 15+ years since I did any PostScript hacking, so the above may be lame, incorrect or even accidental - if you know how to do better, please suggest.

Surge
  • It should be `{setcolor}` rather than `{/setcolor}` since PostScript uses no slash when procedures are called during bind. Other than that: Great answer – thank you. – Hermann Aug 17 '20 at 17:21
  • It did not work for me with gs 9.26. The output was in color, regardless of whether the argument was `{setcolor}` or `{/setcolor}` as per Hermann's comment above. – XavierStuvw Mar 06 '21 at 17:44
  • I concur with @XavierStuvw It seems the behavior of gs has changed since 2011. The solution by @KurtPfeifle below that converts a ps to a black-white pdf with `gs ... -c "/setrgbcolor{0 ...` still works, however. – 0range Dec 28 '21 at 19:27
  • Neither `{setcolor}` nor `{/setcolor}` worked for me. – Geremia Jul 24 '22 at 22:45
  • No longer works. – pigeonburger Nov 30 '22 at 21:46
4

I am not sure whether the following suggestion will work, but it may be worth trying:

  1. convert the PDF to PostScript using the simple pdf2ps utility
  2. convert that PostScript back to PDF while using a re-defined /setrgbcolor PostScript operator

These are the commands:

First

  pdf2ps color.pdf color.ps

This gives you color.ps as output.

Second

gs \
-o bw-from-color.pdf \
-sDEVICE=pdfwrite \
-c "/setrgbcolor{0 mul 3 1 roll 0 mul 3 1 roll 0 mul 3 1 roll 0 mul add add setgray}def" \
-f color.ps
Kurt Pfeifle
4

It's not Ghostscript, but with ImageMagick this is quite simple:

 convert -monochrome input.pdf output.pdf
o-town
3

I could not find out which procedure for color selection is used in the PDFs I am dealing with. This is why I convert to grayscale PostScript first:

gs -o gray.ps -sDEVICE=ps2write -sColorConversionStrategy=Gray -dProcessColorModel=/DeviceGray -dCompatibilityLevel=1.4 -f colored.pdf

As the PDFs I struggle to print may contain confidential information which is cleverly "redacted" by having the color set to white, I need to employ some sort of thresholding. This is what I came up with:

gs -o thresholded.pdf -sDEVICE=pdfwrite -c "/osetgray {setgray} bind def /setgray {0.5 lt {0} {1} ifelse osetgray} def" -f gray.ps

For those (like me) unfamiliar with PostScript's stack programming style, this re-defines setgray as:

setgray(value) {
   original_setgray(value < 0.5 ? 0 : 1)
}
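
The two commands above can be chained into one helper. The following is a minimal sketch, assuming a POSIX shell and a `gs` build that still provides the `ps2write` and `pdfwrite` devices; the function name `pdf_to_bw` and its optional threshold argument are hypothetical and not part of Ghostscript. Other comments on this page suggest that operator redefinitions behave differently across gs versions, so treat it as a starting point rather than a guaranteed fix.

```shell
#!/bin/sh
# Sketch only: pdf_to_bw is a hypothetical wrapper around the two gs calls above.
# Usage: pdf_to_bw input.pdf output.pdf [threshold]
pdf_to_bw() {
  in=$1
  out=$2
  threshold=${3:-0.5}   # gray values below this become black, all others white
  tmp_ps=$(mktemp) || return 1

  # Pass 1: convert every color to gray and write intermediate PostScript.
  gs -q -o "$tmp_ps" -sDEVICE=ps2write \
     -sColorConversionStrategy=Gray -dProcessColorModel=/DeviceGray \
     -f "$in" || { rm -f "$tmp_ps"; return 1; }

  # Pass 2: redefine setgray so each gray value snaps to 0 (black) or 1 (white).
  gs -q -o "$out" -sDEVICE=pdfwrite \
     -c "/osetgray {setgray} bind def /setgray {$threshold lt {0} {1} ifelse osetgray} def" \
     -f "$tmp_ps"
  status=$?

  rm -f "$tmp_ps"
  return $status
}
```

For example, `pdf_to_bw colored.pdf thresholded.pdf 0.7` would turn light grays black as well, which may help with faint text. Content that sets its color through other operators (such as setcolor) would presumably pass through unchanged.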
Hermann
1

Just wanted to chime in that this was a handy post. I have been using k2pdfopt to format PDFs for Kindle usage. For years I used gImageReader to edit the PDF's brightness and contrast and export it to an image file. The big problem was that I had to right-click manually for each image of the PDF, which is tedious to say the least and takes a ton of tinkering. Anyway, I found with a little trial and error that the post above was helpful, but I would definitely add the line below: -colorspace Gray is very important, and used together with -posterize it seems to clear up a lot of garbage. I will be using this handy command with pdfarranger if necessary, and with k2pdfopt!

convert -density 300 -colorspace Gray -posterize 2 -deskew 80% input.pdf output.pdf

The only other thing is that with ImageMagick I had to change the policy file to allow read and write access for PDF. There is a ton of documentation elsewhere for that - ty internet!
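
For reference, the policy change usually amounts to relaxing one line in ImageMagick's policy.xml. Below is a minimal sketch, assuming the file contains a line like `<policy domain="coder" rights="none" pattern="PDF" />` (as shipped by some Linux distributions); the function name `allow_pdf_in_policy` is hypothetical, and the path shown in the usage note is an assumption that varies by distribution and ImageMagick version.

```shell
#!/bin/sh
# Sketch only: allow_pdf_in_policy is a hypothetical helper, not an ImageMagick tool.
# It rewrites rights="none" to rights="read|write" on the PDF coder line only.
allow_pdf_in_policy() {
  policy_file=$1
  tmp=$(mktemp) || return 1
  # Restrict the substitution to lines that mention pattern="PDF".
  sed '/pattern="PDF"/ s/rights="none"/rights="read|write"/' "$policy_file" > "$tmp" \
    && cat "$tmp" > "$policy_file"   # cat keeps the original file's permissions
  status=$?
  rm -f "$tmp"
  return $status
}
```

It could be called as, for instance, `sudo sh -c '. ./policy.sh; allow_pdf_in_policy /etc/ImageMagick-6/policy.xml'` (both the script name and the path are assumptions). Keep in mind that distributions set this restriction for security reasons, so relaxing it is a trade-off.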

Phil
1

This looks like it would work:

1) Convert the file to monochrome with gs

gs -sDEVICE=psmono \
  -dNOPAUSE -dBATCH -dSAFER \
  -sOutputFile=combined.ps \
  first.pdf \
  second.ps \
  third.eps [...]

2) Convert the PostScript file back to a PDF with ps2pdf or gs

(credit to: http://www.linuxjournal.com/content/tech-tip-using-ghostscript-convert-and-combine-files)

Ed L
  • Note: Recent versions of Ghostscript no longer include the now-deprecated pswrite device (and variants like psmono): https://ghostscript.com/doc/9.26/VectorDevices.htm – Hermann Aug 17 '20 at 18:08
1

For a grayscale PDF, use Ghostscript.

In PHP code, use this call:

exec("'gs' '-sOutputFile=outputfilename.pdf' '-sDEVICE=pdfwrite' '-sColorConversionStrategy=Gray' '-dProcessColorModel=/DeviceGray' '-dCompatibilityLevel=1.4'  'inputfilename.pdf'",$output);

Useful URL:
http://www.linuxjournal.com/content/tech-tip-using-ghostscript-convert-and-combine-files

0

I had to modify the solution suggested by Surge (above) a little bit for my file:

Step 1: Convert the coloured.pdf to coloured.ps

gswin64c -dNOPAUSE -dBATCH -sDEVICE=ps2write -sOutputFile=coloured.ps coloured.pdf

Step 2: Convert the coloured.ps to blackandwhite.pdf

gswin64c ^
-dBATCH -dNOPAUSE -q ^
-sOutputFile=blackandwhite.pdf ^
-sDEVICE=pdfwrite ^
-c "/osetrgbcolor {setrgbcolor} bind def /setrgbcolor {pop pop pop 0 0 0 osetrgbcolor} def" ^
-f coloured.ps

I did not have any success with the setcolor operator as suggested by Surge. So I decided to play with other operators that can set colour in PostScript, like setgray, setrgbcolor, setcmykcolor, etc.

As I understand it, the code in quotes following the -c switch is PostScript. It first binds the original definition of setrgbcolor to a new custom operator called osetrgbcolor. It then defines a new setrgbcolor that pops the three operands expected by the original setrgbcolor and replaces them with 0 0 0, i.e. red=0, green=0, blue=0. These values are passed to the custom osetrgbcolor defined earlier, so every RGB colour becomes black.
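
One consequence of the override above is that white is also mapped to black, because the three operands are discarded regardless of their values. A possible variation is to compute a gray level from the RGB operands and threshold it. The following is a minimal sketch for a POSIX shell (not the Windows command prompt used above); `bw_prelude` and `snap` are hypothetical names, the weights 0.3, 0.59 and 0.11 are the conventional RGB-to-gray weights, and setcmykcolor and setcolor are deliberately left untouched. It has not been verified against any particular Ghostscript version.

```shell
#!/bin/sh
# Sketch only: bw_prelude is a hypothetical helper that prints PostScript
# for use with gs -c "...". The optional argument is the black/white threshold.
bw_prelude() {
  threshold=${1:-0.5}
  cat <<EOF
/osetgray {setgray} bind def
/snap {$threshold lt {0} {1} ifelse osetgray} bind def
/setgray {snap} def
/setrgbcolor {0.11 mul exch 0.59 mul add exch 0.3 mul add snap} def
EOF
}
```

A usage example would be `gs -o blackandwhite.pdf -sDEVICE=pdfwrite -c "$(bw_prelude 0.5)" -f coloured.ps`. With this prelude, pure white (1 1 1) yields a gray level of 1.0 and stays white, while dark colours fall below the threshold and become black.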

PS1: The above code was run in the Windows command prompt.

PS2: I was a total stranger to PostScript coding. I got a jumpstart from the YouTuber "John's Basement" and his video series Postscript Tutorial. I referred to Adobe's PostScript Language Reference to understand the operator setrgbcolor and the operands it accepts.

ednoy
0

ImageMagick can do it.

convert -posterize 2 input.pdf output.pdf

It comes out nice and crisp, and about a third the file size of the color original.

DarkDiamond
Jerryk
0

For a pure black-and-white PDF, first convert it to monochrome PostScript, then convert that PostScript back to PDF:

exec(" gs -sDEVICE=psmono  -dNOPAUSE -dBATCH -dSAFER  -sOutputFile=combined.ps  $pdf");

Then convert the PostScript file back to a (black-and-white) PDF:

exec(" gs -sDEVICE=pdfwrite   -dNOPAUSE -dBATCH -dSAFER  -sOutputFile=file_pdf.pdf  filename.ps");