PDF Split Pages Are Corrupted: How to Fix and Re-split Properly

A PDF split that produced corrupted output pages typically points to one of two causes: the source PDF was already damaged before splitting, or the split tool produced invalid output. Both are fixable, though by different paths. This guide walks through the diagnostic sequence and the fix for each scenario.

Quick fix

If you still have the source PDF, re-split it using qpdf. qpdf copies each selected page together with the objects it depends on and writes a fresh cross-reference table, so a sound source yields clean output files.

To extract page 1 to its own file:

qpdf input.pdf --pages . 1 -- page1.pdf

To extract a range, for example pages 5 through 10:

qpdf input.pdf --pages . 5-10 -- pages5-10.pdf

To split the entire PDF into individual pages, one file per page:

qpdf --split-pages=1 input.pdf out_%d.pdf

The %d placeholder is replaced with the zero-padded page number in each output file, so the files sort in page order. See the complete guide to qpdf for the full page-selection syntax, including ranges, mixed selections, and reverse-counted pages.
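When there are more than a few ranges to pull out, the commands above can be generated rather than typed. The following is a minimal Python sketch, not part of qpdf: the function names extract_command and run_all are hypothetical, the output naming simply mirrors the examples above, and qpdf is assumed to be on the PATH.

```python
import subprocess
from typing import Iterable, List, Tuple


def extract_command(src: str, first: int, last: int) -> List[str]:
    """Build the qpdf argv that extracts pages first..last (1-based, inclusive)."""
    if first < 1 or last < first:
        raise ValueError(f"invalid page range: {first}-{last}")
    if first == last:
        page_range = str(first)
        dest = f"page{first}.pdf"  # naming mirrors the single-page example
    else:
        page_range = f"{first}-{last}"
        dest = f"pages{first}-{last}.pdf"  # naming mirrors the range example
    # "." tells qpdf to take the pages from the input file itself.
    return ["qpdf", src, "--pages", ".", page_range, "--", dest]


def run_all(src: str, ranges: Iterable[Tuple[int, int]]) -> None:
    """Run one extraction per range.

    check=True raises CalledProcessError on any non-zero exit status,
    which includes qpdf's status 3 (warnings only).
    """
    for first, last in ranges:
        subprocess.run(extract_command(src, first, last), check=True)
```

Calling run_all("input.pdf", [(1, 1), (5, 10)]) would issue the same two extraction commands shown above and write page1.pdf and pages5-10.pdf.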

If the pages are still corrupted after splitting with qpdf, the source itself is damaged; the next section addresses that.

If that didn’t work

If qpdf’s split output is also corrupted, the source PDF is the problem. Repair it first, then re-split:

qpdf --linearize source.pdf source-repaired.pdf
qpdf source-repaired.pdf --pages . 1-5 -- pages1-5.pdf

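The first command makes qpdf read the source, recover what it can (including reconstructing a damaged cross-reference table), and write a new file with a fresh cross-reference table. The repair comes from that rewrite; --linearize only additionally arranges the output for page-at-a-time loading. The second command splits the repaired version. If the splits were inheriting structural damage from the source, the repaired version typically splits cleanly.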
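To confirm whether the source is the damaged part, qpdf's --check option can be scripted. It inspects the file's structure and reports through its exit status: 0 for no problems, 3 for warnings only, 2 for errors. The sketch below wraps that in Python. The helper names are hypothetical, and qpdf is assumed to be on the PATH.

```python
import subprocess

# Exit statuses qpdf documents for its command-line tool.
QPDF_STATUS = {
    0: "clean",     # no errors or warnings
    2: "errors",    # errors were found
    3: "warnings",  # warnings only; qpdf could still process the file
}


def interpret_status(code: int) -> str:
    """Map a qpdf exit status to a short verdict."""
    return QPDF_STATUS.get(code, "unknown")


def check_pdf(path: str) -> str:
    """Run `qpdf --check` on path and return the verdict string."""
    result = subprocess.run(
        ["qpdf", "--check", path],
        capture_output=True,
        text=True,
    )
    return interpret_status(result.returncode)


def needs_repair(verdict: str) -> bool:
    """Treat anything other than a clean check as a reason to repair first."""
    return verdict != "clean"
```

A "warnings" verdict often means qpdf had to recover from damage while reading, which is the situation where rewriting the file before splitting is most likely to help. A clean check is not a guarantee: content streams can still be broken in a file that is structurally sound.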

If you no longer have the source and only the corrupted split files remain, attempt to repair each one individually:

qpdf --linearize broken-page1.pdf repaired-page1.pdf
qpdf --linearize broken-page2.pdf repaired-page2.pdf

How well this works depends on what went wrong in the original split. If only the xref tables are damaged, qpdf usually rebuilds them successfully. If the page content streams themselves are broken, qpdf will produce a structurally valid file whose pages still render incorrectly.

Advanced recovery

When qpdf can’t recover a split file, Ghostscript re-renders it from whatever can be parsed:

gs -sDEVICE=pdfwrite -dPDFSETTINGS=/prepress -dNOPAUSE -dQUIET -dBATCH -sOutputFile=output.pdf broken-page.pdf

For single-page extracted documents this often works well, since there are no cross-references between pages to worry about. For multi-page split outputs, Ghostscript treats each file as its own document, so the result is the same as for any other Ghostscript repair: visible content is preserved, but form fields, annotations, and signatures are lost.
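The two recovery steps can be chained so that Ghostscript is tried only when qpdf fails. This is a sketch under assumptions: the function names are hypothetical, both qpdf and gs are assumed to be on the PATH (on Windows the Ghostscript executable usually has a different name), and the flags are exactly those used in the commands above.

```python
import subprocess
from typing import List


def qpdf_repair_cmd(src: str, dest: str) -> List[str]:
    """The qpdf rewrite used earlier in this guide."""
    return ["qpdf", "--linearize", src, dest]


def gs_rerender_cmd(src: str, dest: str) -> List[str]:
    """The Ghostscript re-render shown above."""
    return [
        "gs",
        "-sDEVICE=pdfwrite",
        "-dPDFSETTINGS=/prepress",
        "-dNOPAUSE",
        "-dQUIET",
        "-dBATCH",
        f"-sOutputFile={dest}",
        src,
    ]


def repair_with_fallback(src: str, dest: str) -> str:
    """Try qpdf first, then Ghostscript. Returns the tool that succeeded, or 'failed'.

    Raises FileNotFoundError if a tool is not installed.
    """
    qpdf = subprocess.run(qpdf_repair_cmd(src, dest))
    # Status 3 means warnings only: qpdf still wrote the output file.
    if qpdf.returncode in (0, 3):
        return "qpdf"
    gs = subprocess.run(gs_rerender_cmd(src, dest))
    return "ghostscript" if gs.returncode == 0 else "failed"
```

qpdf goes first because it is lossless. A result of "qpdf" only says the structure was rewritten, so it is still worth opening the output and checking that the pages render.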

Why this happens

Splitting a PDF, whether extracting a few pages or pulling apart a whole document, looks simple but requires real structural work from the splitting tool. Each split output needs a complete object graph, fresh cross-reference tables, and properly rebuilt resource dictionaries. Tools that take shortcuts produce invalid files.

The most common causes:

Online split tools producing malformed xref tables. Many free online services append objects to a stripped-down template file rather than properly building a new PDF structure. The result is files that look like PDFs but have inconsistent or missing xref entries. Acrobat opens them anyway through its silent recovery; stricter readers refuse them.

The source PDF was already damaged. A source file with subtly broken structure — mismatched xref entries, malformed trailer, partial corruption — may still display normally in a tolerant reader, but splitting it amplifies the damage. Each split output inherits the original problems and adds new ones.

Resource dictionaries not properly carried forward. If pages reference shared fonts, images, or other resources via the source PDF’s resource dictionary, the splitter has to copy the relevant resources into each output’s own dictionary. Tools that miss this produce split files that open with missing fonts, broken images, or rendering errors.

Encrypted source PDFs that the tool can’t fully decrypt. Some online tools accept encrypted PDFs but mishandle the encryption when writing the split output. The result is files that won’t open even with the correct password — the encryption was scrambled rather than transferred.

Truncated output during the split operation. Browser-based split tools can fail partway through, especially for large source files. The split outputs that did get downloaded may be incomplete.
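Two of the causes above, truncated downloads and missing cross-reference data, leave marks that can be spotted without any PDF library. The sketch below is a rough heuristic only, with a hypothetical function name: it looks for the %PDF- header near the start of the file and for the startxref and %%EOF markers near the end. The 1024-byte window is an assumption chosen for illustration. Passing the check does not mean the file is valid.

```python
from typing import List


def quick_structure_check(path: str, window: int = 1024) -> List[str]:
    """Return a list of missing markers; an empty list means all three were found."""
    with open(path, "rb") as fh:
        head = fh.read(window)
        fh.seek(0, 2)                     # jump to the end to learn the size
        size = fh.tell()
        fh.seek(max(0, size - window))    # back up by at most `window` bytes
        tail = fh.read()

    problems = []
    if b"%PDF-" not in head:
        problems.append("missing %PDF- header")
    if b"%%EOF" not in tail:
        problems.append("missing %%EOF marker (possibly truncated)")
    if b"startxref" not in tail:
        problems.append("missing startxref pointer")
    return problems
```

A file that fails only the last two checks was most likely cut off during download, in which case splitting again is a better first step than repair.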

Preventing this in future

The same pattern as PDF merging applies: for important files, use local tools, not online services.

The reliable options:

qpdf — free, scriptable, lossless. The first choice for any PDF splitting or extraction job.

pikepdf — Python wrapper around qpdf. The choice for batch workflows or automation.

Adobe Acrobat — paid, GUI-based. Tools > Organize Pages > Extract does proper extraction with full structural fidelity.

PDF-XChange Editor — free for basic features, paid for advanced. Reliable Windows-based GUI option.
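For the pikepdf option, a one-file-per-page split can be sketched as follows. It assumes pikepdf is installed (pip install pikepdf). The function name and the out_ file naming are hypothetical, chosen to resemble the qpdf --split-pages example earlier in this guide.

```python
from pathlib import Path
from typing import List


def split_to_single_pages(src: str, out_dir: str) -> List[str]:
    """Write each page of src to its own PDF in out_dir; return the paths written."""
    import pikepdf  # third-party dependency, imported here so the module loads without it

    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    written = []
    with pikepdf.Pdf.open(src) as pdf:
        # Zero-pad to the width of the last page number so files sort in page order.
        width = len(str(len(pdf.pages)))
        for number, page in enumerate(pdf.pages, start=1):
            single = pikepdf.Pdf.new()
            # Appending a page from another Pdf copies it into the new document.
            single.pages.append(page)
            target = out / f"out_{number:0{width}d}.pdf"
            single.save(target)
            written.append(str(target))
    return written
```

For example, split_to_single_pages("input.pdf", "pages") would write one file per page into a pages directory. Because pikepdf is built on qpdf, a source that qpdf cannot read will fail here too, so the repair steps earlier in this guide still apply.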

For one-off splits where the file isn’t critical, online tools are usable. The cost of the occasional corrupted output is low enough to absorb. For anything important, take the ten minutes to install qpdf or learn Acrobat’s extract feature; it pays back across every PDF workflow afterwards.

PDF merging produces corrupted output through the same root causes; see merged PDF file is corrupted for the merge-specific fix path. If your source PDF itself is showing signs of damage, repair it first using the strategies in the PDF repair pillar. And if you regularly split output from one specific source, such as reports generated by an upstream system, that system's generation pipeline may be the underlying issue and is worth raising with whoever maintains it.

Last verified: April 2026