10

Currently, when I convert a given URL to PDF, like this:

wkhtmltopdf http://superuser.com/ superuser.pdf

the output consists of multiple A4 pages, so images are sometimes cut in two at a page boundary.

My question is:

How do I convert an HTML page into a single-page PDF whose height is unlimited?

I would prefer the result to remain an editable/searchable document, not just a static image.


What I've tried already:

pelms
kenorb

3 Answers

3

The wkhtmltopdf 0.9.6 manual documents this parameter:

    --page-height      <unitreal>      Page height (default unit millimeter)

Therefore, an enormously long page can be defined with either of:

--page-height 10000cm
--page-height 100m

Both define a page height of 100 meters (I mention both in case your wkhtmltopdf does not support m as a unit).

I do not use wkhtmltopdf myself, so I do not know whether there is an upper limit to page-height, but you can find that out empirically.
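One way to probe this empirically is to render with a given page height and then count the pages in the result. The following is only a minimal sketch: the function names `render_tall` and `page_count` are hypothetical, the 210mm default width is an arbitrary choice (A4 width), passing `--page-width` alongside `--page-height` is an assumption that some versions may need both, and `pdfinfo` (from poppler-utils) is assumed to be installed.

```shell
#!/bin/sh
# Hypothetical helpers; names and default sizes are illustrative only.

# Render URL ($1) to PDF ($2) on a custom page of height $3 (e.g. 5000mm).
# The width ($4) defaults to 210mm (A4). Passing both dimensions is an
# assumption: some wkhtmltopdf versions may ignore --page-height otherwise.
render_tall() {
    wkhtmltopdf --page-width "${4:-210mm}" --page-height "$3" "$1" "$2"
}

# Read `pdfinfo` output on stdin and print only the page count.
page_count() {
    awk '/^Pages:/ { print $2 }'
}
```

It could be used as `render_tall http://superuser.com/ superuser.pdf 5000mm && pdfinfo superuser.pdf | page_count`. If the count stays above 1 however large the height, the option is probably being ignored or capped by that build.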

In addition, you can try adding --disable-smart-width (the "width" in the name is not a typo) and, for the moment, the less ambitious --page-height 100cm.

In this man page there is a comment that disable-smart-width is only available when using a patched QT.

There is also another comment:

On the wkhtmltopdf website you can download a static version of wkhtmltopdf at http://code.google.com/p/wkhtmltopdf/downloads/list. This static binary will work on most systems and comes with a build in patched QT.

The project has moved elsewhere, so you might hunt there for such a version, or ask in the forums.

kenorb
harrymc
  • Tested: `wkhtmltopdf http://superuser.com/ superuser.pdf --page-height 100m`, but it doesn't seem to work as expected, as I still see 3 separate pages. Tested with v0.12.2.1. – kenorb Oct 04 '15 at 12:40
  • Try to add `--disable-smart-width` (width is not an error) and try for the moment the less ambitious `--page-height 100cm`. – harrymc Oct 06 '15 at 05:46
  • I've already tried that, but it says: `Unknown long argument --disable-smart-width`. – kenorb Oct 06 '15 at 08:32
  • That option must then belong to another version of wkhtmltopdf. My last suggestion is to test the [beta version](http://wkhtmltopdf.org/downloads.html) and see whether some pertinent invocation parameter was added (and especially to disable anything with "smart"). If that fails, ask in [wkhtmltopdf Support](http://wkhtmltopdf.org/support.html). – harrymc Oct 06 '15 at 09:58
  • Tested with the latest dev version built from source (`0.12.3-dev-8f03630`); the option still doesn't work on OSX (it is reported as unknown). The only reference to this option is in [`imagearguments.cc`](https://github.com/wkhtmltopdf/wkhtmltopdf/blob/7f74d893635d275f9450bf5f12c454a9c27672bd/src/image/imagearguments.cc). Maybe it works only for images (not PDF)? – kenorb Oct 06 '15 at 16:39
  • @kenorb I find that `--page-height` only works after specifying `--page-width` (version 0.12.6) – JellicleCat Mar 24 '21 at 02:03
  • If someone reads this old question: there is a feature request as old as the question, and it is still not supported. See https://github.com/wkhtmltopdf/wkhtmltopdf/issues/4299 – Stefan Dec 30 '22 at 15:40
0

You should do it as below:

$ wkhtmltoimage http://superuser.com/ superuser.png
loaded the Generic plugin 
Loading page (1/2)
Rendering (2/2)                                                    
Warning: Received createRequest signal on a disposed ResourceObject's NetworkAccessManager. This might be an indication of an iframe taking too long to load.
Done                                                               
$ geo=$(file superuser.png | awk '{print $5"x"$7}' | sed -e 's/,//')
$ convert superuser.png -page $geo superuser.pdf

The convert command comes from the ImageMagick package. The disadvantage of this method is that the PDF output contains only a static image.
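The geometry extraction above depends on the field positions in the output of `file`, so it breaks when the file name contains spaces. A slightly more defensive variant is sketched below; the helper names `png_geometry` and `png_to_pdf` are hypothetical, and the pattern assumes the usual `PNG image data, W x H, ...` wording that `file` prints for PNG images.

```shell
#!/bin/sh
# Hypothetical helpers; the names are illustrative only.

# Read one line of `file` output on stdin and print the size as WxH.
# Matches on the "PNG image data, W x H," text instead of field numbers,
# so spaces in the file name do not shift the result.
png_geometry() {
    sed -n 's/.*PNG image data, \([0-9][0-9]*\) x \([0-9][0-9]*\),.*/\1x\2/p'
}

# Wrap PNG ($1) into a single-page PDF ($2) whose page matches the image.
png_to_pdf() {
    geo=$(file "$1" | png_geometry)
    [ -n "$geo" ] || { echo "cannot read size of $1" >&2; return 1; }
    convert "$1" -page "$geo" "$2"
}
```

For example, `png_to_pdf superuser.png superuser.pdf` would replace the last two commands. The result is still a raster image inside a PDF, so the text is not searchable.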

kenorb
Wayne Walker
0

Looking at the code again, it seems you still cannot select an infinitely long page.

So you can simply use the ImageMagick command mogrify with the option -append on the downloaded file (or +append if you want to append the pages horizontally).

wkhtmltopdf http://superuser.com/ superuser.pdf
mogrify -append superuser.pdf

From man mogrify:

-append
append an image sequence top to bottom (use +append for left to right)

If you want to create a new file instead, you can use convert from the same suite.

convert -density 200 superuser.pdf -append superuser.vertical.pdf
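
Since the resolution has to be given before the input file to take effect when the PDF is read, it can help to wrap the call so the argument order stays fixed while you experiment with density values. This is only a sketch; the function name `stitch_pdf` is hypothetical, and the output remains a rasterized image rather than searchable text.

```shell
#!/bin/sh
# Hypothetical helper; the name is illustrative only.

# Stack all pages of PDF $2 vertically into $3, rasterizing at $1 DPI.
# -density comes before the input file so that it applies while the PDF
# is being read, not afterwards.
stitch_pdf() {
    convert -density "$1" "$2" -append "$3"
}
```

For example: `stitch_pdf 300 superuser.pdf superuser.vertical.pdf`. Higher densities give a sharper image at the cost of a larger file and more memory during conversion.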
kenorb
Hastur
  • I've tried that, but the output PDF contains a static low-quality image. I've also tried adding `-units PixelsPerInch -density 300` or `-units PixelsPerInch -resample 300`, but the output PDF is still a low-quality image. – kenorb Oct 04 '15 at 13:01
  • The `convert` command is really sensitive to the position of its parameters. You can try `convert -density 300 superuser.pdf -append su.vertical.pdf`... or even higher density values and other parameters. (Let me know) – Hastur Oct 04 '15 at 13:29