81

I have a scanned course in which every two consecutive pages appear as a single PDF page. How can I automatically split all the pages in one pass? Usually this is done by cropping the odd and even pages separately and then merging them back together, but that could take a very long time.

How can I split pages on scanned PDF in a single pass?

Eduard Florinescu

6 Answers

55

There's an excellent, free and open source tool called Briss. It is very simple, user friendly and effective. It works on multiple operating systems through Java.

Load your PDF into the app. The app will group similar pages together and lay them on top of each other. Draw rectangles on top of your pages so that they cover what you want included. It will look like this:

enter image description here

Even if your PDF has multiple categories of layout within a single document, Briss will handle it. For example, let's say some parts are in portrait and others in landscape. Briss will group them into different categories and let you draw different rectangles on them, and then process it all in a single pass, into a single document. Briss is very good at deciding which pages should be grouped together. It typically takes me less than a minute of manual work to get Briss started. Thus, a document of hundreds or even thousands of pages can be done in a couple of minutes thanks to this brilliant program.

When it looks good, select Action, then Crop PDF.

Truly a very neat tool.

Note: I realize this answer reads like I'm a Briss developer or something, but I'm really not. I just love the tool.

Fiksdal
50

You could use MuPDF's mutool:

mutool poster -x 2 in.pdf out.pdf
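
To make the geometry of such a split concrete, here is a minimal Python sketch of how a two-page spread can be cut into a grid of equally sized crop boxes. It is not mutool's implementation; the function name `split_box` and its parameters (`cols`, `rows`, `right_to_left`) are hypothetical, and the page size in the usage note is an assumed A4 landscape spread measured in points.

```python
def split_box(x0, y0, x1, y1, cols=2, rows=1, right_to_left=False):
    """Cut the rectangle (x0, y0, x1, y1) into a cols x rows grid.

    Returns the crop boxes as (left, bottom, right, top) tuples in reading
    order: top row first, and left to right within a row unless
    right_to_left is set. PDF coordinates start at the bottom-left corner,
    so the top row has the largest y values.
    """
    if cols < 1 or rows < 1:
        raise ValueError("cols and rows must be at least 1")

    width = (x1 - x0) / cols
    height = (y1 - y0) / rows

    boxes = []
    for row in range(rows):
        top = y1 - row * height
        columns = range(cols)
        if right_to_left:
            # Emit the right-hand piece first (e.g. for Japanese reading order).
            columns = reversed(range(cols))
        for col in columns:
            left = x0 + col * width
            boxes.append((left, top - height, left + width, top))
    return boxes
```

For an assumed 842 x 595 pt spread, `split_box(0, 0, 842, 595)` yields the left half followed by the right half; passing `right_to_left=True` yields the same two boxes in the opposite order, which is the ordering problem raised in the comments below.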
DG'
  • 2
    DUDE! This is the best answer hands-down. It was instantaneous to split a document with 24 two-page spreads into 48 pages perfectly! – jfacemyer Aug 15 '20 at 12:43
  • 2
    Good answer, simple, fast and quality. One comment: my scan results had pages one at right, one at left. The command to split with a vertical cut has been mutool poster -y 2 in.pdf out.pdf – pasaba por aqui Nov 22 '20 at 11:49
  • I used the same thing but split pages horizontally with `mutool poster -y 2 in.pdf out.pdf`. Very useful tool. – Joe Sadoski Aug 29 '21 at 18:38
  • I've installed like 10+ different dependencies and the `make` output still ask me for more C dependencies that I cannot find... would make the project more popular if you publish a binary version in the download page, I just want to split a document, not a contributor of the project. – Mariano Ruiz Jul 15 '22 at 12:45
  • 1
    @MarianoRuiz - Might it be possible that you mistook me for maintainer of `mupdf`, who I am not? Anyway, there are precompiled binaries for many operating systems, just check your package manager, e.g. `sudo apt-get install mupdf-tools` on ubuntu or `brew install mupdf-tools` on macOS – DG' Jul 15 '22 at 20:31
  • 1
    Wow, that thing is fast. I was at first thinking that failed since it was about instantaneous on a 300MB PDF. Nope, it worked perfectly. – Michael Buckley Sep 06 '22 at 14:18
  • Excellent. Just, too bad the right page comes before the left one with this method (Japanese reading) – brahmin Sep 19 '22 at 21:42
  • @brahmin -- You could use the `shuffle` operation of [`pdftk`](https://gitlab.com/pdftk-java/pdftk) for that – DG' Mar 03 '23 at 07:43
35

After looking through some answers online (this is a frequently asked question), I discovered that this can be done easily using the Poster option in the Print dialog.

Steps (for Adobe Acrobat XI):

  1. Choose Print from the File menu or press Ctrl+P.
  2. Select Adobe PDF as the printer.
  3. Select the Poster tab.
  4. Change Overlap to 0 inches.
  5. Adjust the Tile scale to your needs: 100% (or 99%) if the resulting PDF page size is the same as the current PDF page size, 75% if the resulting page size is half the current one. Tinker with the "Tile scale" percentage if necessary to get the result you want. To check the resulting PDF page size, click Properties to the right of the "Adobe PDF" combo box and change the Adobe PDF Page Size combo box if necessary.
  6. Hit the Print button once the preview shows the page split the way you want; check the dotted line in the preview:

enter image description here

Here is a print screen for the described settings:

enter image description here

robertspierre
Eduard Florinescu
  • 1
    Note that this very good method will work only on Windows, as on Mac, it is no longer possible to "print" to PS/PDF using the print dialog (reason: Apple changed something in OSX which suppresses the previously used workflow in Acrobat). – Max Wyss Aug 09 '14 at 07:16
  • 3
    There is a workaround, but it is definitely not for the faint at heart, and "should not be done at home" (print to a non-connected PostScript printer, and then snitch the spool file, and feed that into Distiller). – Max Wyss Aug 09 '14 at 09:01
  • 1
    @MaxWyss Could you give more details about the workaround? – Jairo Bochi Aug 05 '15 at 21:36
  • @JairoBochi I don't have a mac to test, but this should help: http://www.adobe.com/support/downloads/product.jsp?product=44&platform=Macintosh https://helpx.adobe.com/acrobat/using/creating-pdfs-acrobat-distiller.html – Eduard Florinescu Aug 05 '15 at 21:53
  • 2
    @JairoBochi: As described, you create a generic PDF printer (in the Printer and Scanner System Preferences), select it, and "print" to that printer. In /var/spool/ you find the spool files, which you can then snitch. You need to be su to access those files. Note: for Windows, it is not necessary, because the AdobePS printer driver still works properly. – Max Wyss Aug 05 '15 at 22:13
  • I don't see the dotted line in my preview. How did you get those lines? – tvk Feb 17 '16 at 13:45
  • @FangJing Try tune the tile scale – Eduard Florinescu Feb 17 '16 at 13:47
  • 1
    I wasn't able to create a "generic PDF printer" on Mac OS as you described, but I was able to print to my regular printer and then pause it before anything actually printed. I found a large recently created file in /var/spool/cups and it was my document. Thanks! – bugloaf Jun 11 '16 at 05:22
27

Sejda.com can split scanned PDF documents in half, down the middle. Works on all desktop platforms.

Here's a short how to:

How to split scanned PDF documents in half with Sejda.com

If it's a booklet scan and the pages are not in their natural order anymore it can reorder them for you too.
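
For readers curious what such a reordering involves, here is a minimal Python sketch. It is not Sejda's code; the function names are hypothetical, and it assumes a saddle-stitched booklet whose sheets were scanned front then back, outermost sheet first, with each scan already split into a left and a right half.

```python
def booklet_scan_order(total_pages):
    """Page numbers in the order they come out of a split booklet scan.

    Assumes a saddle-stitched booklet: sheet 0 is the outermost sheet,
    each sheet is scanned front then back, and each scan is split into
    its left half followed by its right half.
    """
    if total_pages <= 0 or total_pages % 4 != 0:
        raise ValueError("a booklet needs a positive multiple of 4 pages")

    order = []
    for sheet in range(total_pages // 4):
        # Front of the sheet: last remaining page on the left, first on the right.
        order += [total_pages - 2 * sheet, 2 * sheet + 1]
        # Back of the sheet: next page on the left, next-to-last on the right.
        order += [2 * sheet + 2, total_pages - 2 * sheet - 1]
    return order


def restore_reading_order(halves):
    """Put split half-pages back into their natural reading order."""
    order = booklet_scan_order(len(halves))
    restored = [None] * len(halves)
    for half, page_number in zip(halves, order):
        restored[page_number - 1] = half
    return restored
```

For an 8-page booklet the split halves arrive as pages 8, 1, 2, 7, 6, 3, 4, 5, and `restore_reading_order` maps them back to 1 through 8. A scan made in a different order (for example all fronts first) would need a different mapping.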

I'm a developer on the project. Open source.

Edi
  • 1
    Behtareen! Best app I have ever used for PDF tasks... – Saad Rehman Shah Jun 25 '17 at 00:12
  • only one complaint - it does not allow cutting out some borders like Briss – akostadinov Apr 23 '19 at 15:39
  • 1
    You can cut out borders in a second pass using the crop tool: https://www.sejda.com/crop-pdf – Edi Apr 24 '19 at 09:09
  • Thank you! BTW, for those wondering, this works even without a booklet scan structure, where it's just one page that's A3 that you want to split into A4, with consecutive page numbers. – Sanoo Jun 15 '20 at 16:57
  • Whoa! Seriously impressive. It stacked all of the scans with low transparency, asked me where to 'chop' the left/right sides, and then split them into individual pages. Turned an annoying task into something trivial. I'm bookmarking this site for sure. – Dennis Wurster Apr 07 '21 at 17:06
  • There are limits to this tool to force you to pay. 100pages, 50MB, 3 tasks per day. etc – Matt Sephton Jul 28 '21 at 22:13
  • Not a good idea to upload your PDFs to some random site. And not necessary for this very simple task. – robertspierre Jul 24 '23 at 06:13
5

There are two problems with automating the splitting of scanned books in a single pass:

  • Automation is not always accurate
  • Making a scanned book comfortable to read takes more than just splitting pages

For everything related to scanned books, I highly recommend using ScanTailor Advanced. It has features such as:

  • Deskew (straighten) skewed pages,
  • Select content to reduce the page size,
  • Increase/decrease margins (for note-taking, maybe),
  • Whiten the result for a better reading experience.

To use it, you must export the PDF to images and then recombine the output images into a PDF. The processed images may be very small in file size (as little as 6% of the original) yet excellent in quality.

From its original GitHub repo:

Scan Tailor is Free Software (which is more than just freeware). It’s written in C++ with Qt and released under the General Public License version 3. We develop both Windows and GNU/Linux versions.

Other tips

To complete the task satisfactorily, I recommend using PDF-XChange Viewer for extracting images and adding OCR, and i2pdf for merging the outputs. In my experience, you can set the JPG quality to the lowest setting without a noticeable difference, but there is still a trade-off between the final output's size and its image quality. All of these programs are free. The whole process takes around 1 hour in the background, with occasional checks.

I also have a complete guide to processing scanned books that you may want to check out: The ultimate guide to process scanned books.


FYI: How to create hierarchical bookmarks on scanned PDF files?

Ooker
  • ScanTailor has been abandoned and is no longer maintained. – robertspierre Jul 24 '23 at 06:12
  • @robertspierre you are right. At first I thought there is still ScanTailor Advanced, but [it's also abandoned as well](https://github.com/4lex4/scantailor-advanced/issues/170). However in the thread there is a link to [a new fork](https://github.com/ScanTailor-Advanced/scantailor-advanced) that is active last week – Ooker Jul 24 '23 at 07:19
0

The free (as in freedom) pdfarranger can do that.

Just select the pages you want to split, right click and select "Split pages":

enter image description here

enter image description here

robertspierre