
Merged PDF File Is Corrupted: How to Fix It and Prevent It from Recurring

A merged PDF that opens corrupted is almost always a tool problem, not a file problem. The component PDFs are usually fine; the merge tool wrote them together in a way that produced an invalid file. If you still have the source files, the fastest fix is to re-merge using a tool that produces clean output. If only the broken merged file remains, repair is possible but the result may be incomplete.

Quick fix

If the source PDFs are still available, re-merge them using qpdf. qpdf operates directly on PDF structure, doesn’t re-render content, and produces clean output:

qpdf --empty --pages file1.pdf file2.pdf file3.pdf -- merged.pdf

This concatenates the files in the order specified. To merge specific page ranges from each:

qpdf --empty --pages file1.pdf 1-5 file2.pdf 1-3 file3.pdf 1-10 -- merged.pdf

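This form keeps page content and page-level annotations, and recent qpdf versions also carry over form fields. Document-level data such as bookmarks and metadata is not merged: qpdf takes it from the primary input, and --empty has none. To keep the first file’s bookmarks and metadata, use that file as the primary input instead of --empty (the dot refers back to it):

qpdf file1.pdf --pages . file2.pdf file3.pdf -- merged.pdf

See the complete guide to qpdf for the full page-selection syntax.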
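For scripted use, the same command line can be assembled programmatically. The following is a minimal sketch, assuming qpdf is on the PATH; `build_merge_command` and `merge` are illustrative names, not part of qpdf.

```python
import subprocess
from typing import List, Optional, Sequence, Tuple

# Each source is (path, page_range); page_range=None means "all pages".
Source = Tuple[str, Optional[str]]


def build_merge_command(sources: Sequence[Source], output: str) -> List[str]:
    """Build the argv for `qpdf --empty --pages ... -- output`."""
    if not sources:
        raise ValueError("need at least one source PDF")
    cmd = ["qpdf", "--empty", "--pages"]
    for path, page_range in sources:
        cmd.append(path)
        if page_range is not None:
            cmd.append(page_range)  # e.g. "1-5", in qpdf's page-range syntax
    cmd += ["--", output]
    return cmd


def merge(sources: Sequence[Source], output: str) -> None:
    """Run the merge. check=True raises on any non-zero exit,
    including qpdf's exit code 3 (finished, but with warnings)."""
    subprocess.run(build_merge_command(sources, output), check=True)
```

Calling `merge([("file1.pdf", "1-5"), ("file2.pdf", None)], "merged.pdf")` would take pages 1-5 of the first file and every page of the second.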

If you don’t have command-line access, Adobe Acrobat’s Combine Files feature (Tools > Combine Files) is reliable for the same job through a GUI. Most online merge tools are not — they’re the cause of the original problem in many cases.

If that didn’t work

If you no longer have the source files and only the corrupted merged output remains, attempt structural repair with qpdf:

qpdf --linearize broken-merged.pdf repaired.pdf

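qpdf attempts this repair whenever it reads a damaged file: it scans for objects and rebuilds the cross-reference table and trailer from the file’s existing object content. The --linearize flag is not what triggers the repair; it only makes qpdf write out a fully restructured, web-optimized file. Whether the result is usable depends on what the original merge tool actually wrote. Sometimes the structure is the only thing wrong, and qpdf will reconstruct a clean file. Sometimes the merge tool also wrote malformed object content, in which case qpdf’s output will still have problems.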

To see what qpdf finds before attempting the repair, run it with --check first:

qpdf --check broken-merged.pdf

If the report shows xref errors but the objects themselves are intact, repair is likely to succeed. If it shows object-level problems, expect the recovery to be partial.
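The check-then-repair sequence can be scripted using qpdf's exit status, which the qpdf manual documents as 0 for a clean run, 3 for warnings only, and 2 for errors. This is a rough sketch, assuming qpdf is on the PATH; the function names are illustrative, and a "warnings" result is only a hint that recovery went well, not proof that every page survived.

```python
import subprocess


def classify_exit(returncode: int) -> str:
    """Map a qpdf exit code to a label (0 = clean, 3 = warnings only, 2 = errors)."""
    if returncode == 0:
        return "clean"
    if returncode == 3:
        return "warnings"  # typical when qpdf had to rebuild the xref table
    return "errors"


def check(path: str) -> str:
    """Run `qpdf --check` and report how the file fared."""
    result = subprocess.run(["qpdf", "--check", path], capture_output=True, text=True)
    return classify_exit(result.returncode)


def repair(src: str, dst: str) -> bool:
    """Rewrite src to dst; True if qpdf finished without hard errors."""
    result = subprocess.run(["qpdf", src, dst], capture_output=True, text=True)
    return classify_exit(result.returncode) != "errors"
```

Open the repaired file and page through it even when `repair` returns True, since qpdf can only keep the objects it was able to find.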

Advanced recovery

When qpdf can’t recover the file, Ghostscript’s re-rendering approach sometimes salvages the visible content:

gs -sDEVICE=pdfwrite -dPDFSETTINGS=/prepress -dNOPAUSE -dQUIET -dBATCH -sOutputFile=output.pdf broken-merged.pdf

For merged PDFs this loses form fields, annotations, bookmarks, and any tagged structure from the original components. Digital signatures do not survive either: if the merged file was a signed contract package, the cleaned output has to be signed again, which has legal implications worth thinking through before proceeding.
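Before re-rendering, it can help to check whether the file appears to be signed at all. The sketch below is a heuristic only: it assumes a signature dictionary is written uncompressed with a `/ByteRange` key, which is the normal case but not something this check can guarantee. `looks_signed` and `safe_to_rerender` are hypothetical helper names.

```python
from pathlib import Path


def looks_signed(data: bytes) -> bool:
    """Heuristic: signature dictionaries normally contain a plain /ByteRange key."""
    return b"/ByteRange" in data


def safe_to_rerender(path: str) -> bool:
    """Return False when the file appears to carry a digital signature."""
    return not looks_signed(Path(path).read_bytes())
```

A False result from `safe_to_rerender` is a prompt to stop and decide how re-signing will be handled, not a verdict on the signature's validity.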

Why this happens

Merging PDFs is more complex than concatenating them. A PDF is not a sequential stream — every page references shared resources (fonts, images, content streams) via numbered objects, and those references are tracked in a cross-reference table at the end of the file. A correct merge has to renumber objects from each source file, fix all the internal references, build a unified resource dictionary, and write a single consistent xref table for the combined output.
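The renumbering step can be illustrated with a toy model. This is a sketch under a strong simplifying assumption: each "file" is reduced to a mapping from object number to the object numbers it references, and every reference points inside its own file. Real PDF objects carry much more, and none of these names come from a PDF library.

```python
from typing import Dict, List

# Toy model (an assumption for illustration): object number -> referenced object numbers.
PdfObjects = Dict[int, List[int]]


def merge_with_renumbering(files: List[PdfObjects]) -> PdfObjects:
    """Give every object a fresh number and rewrite references to match."""
    merged: PdfObjects = {}
    next_free = 1
    for objects in files:
        # Map each old number in this file to an unused number in the output.
        mapping = {}
        for old in sorted(objects):
            mapping[old] = next_free
            next_free += 1
        # Copy each object under its new number, translating its references.
        for old, refs in objects.items():
            merged[mapping[old]] = [mapping[r] for r in refs]
    return merged


def naive_concat(files: List[PdfObjects]) -> PdfObjects:
    """What a careless tool effectively does: keep the original numbers."""
    merged: PdfObjects = {}
    for objects in files:
        merged.update(objects)  # later files silently replace earlier objects
    return merged


file_a = {1: [2], 2: []}
file_b = {1: [2], 2: []}
print(merge_with_renumbering([file_a, file_b]))  # {1: [2], 2: [], 3: [4], 4: []}
print(naive_concat([file_a, file_b]))            # {1: [2], 2: []}
```

The naive version ends up with two objects instead of four, which is the toy equivalent of pages from one source pointing at fonts or images that belong to another.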

Most failures come from tools that don’t do this carefully:

Free online merge tools are the highest-volume source of corrupted merges. Many use shortcuts — appending source files directly without proper object renumbering, writing trailing xref tables that don’t match the actual object positions, or stripping critical structural data to keep the merge fast. The output usually opens in Acrobat (which auto-repairs damage silently) but fails in stricter readers.

Outdated or buggy merge tools sometimes mishandle specific source-file features. PDFs with embedded multimedia, JavaScript actions, or unusual encryption can trigger merge bugs that produce invalid output.

Conflicting incremental updates. A PDF can be modified by appending new objects and a new xref table to the end of the file, leaving the original content intact. If the merge tool combines several PDFs that each have multiple incremental update layers, the resulting file may have multiple shadow xref tables that conflict. The reader doesn’t know which one to follow and may give up.

Truncated output. Large merges can fail partway through, particularly in browser-based tools that depend on a working network connection throughout the operation. The result is a file that may look plausible in size but is missing the final xref table and trailer that make it parseable.
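Two of these failure modes, a missing trailer and stacked update sections, leave traces that can be spotted in the raw bytes. The sketch below is a cheap pre-check, not a substitute for qpdf --check. It assumes nothing but whitespace follows the final %%EOF, which is stricter than many readers, and `inspect_tail` is an illustrative name.

```python
import re


def inspect_tail(data: bytes) -> dict:
    """Collect rough structural hints from the raw bytes of a PDF."""
    tail = data[-1024:]
    # A complete file ends with: startxref, a byte offset, then %%EOF.
    match = re.search(rb"startxref\s+(\d+)\s+%%EOF\s*$", tail)
    offset = int(match.group(1)) if match else None

    # The offset should land on a classic "xref" table or on an
    # xref stream object, which starts like "12 0 obj".
    points_at_xref = False
    if offset is not None and offset < len(data):
        target = data[offset:offset + 32]
        points_at_xref = (
            target.startswith(b"xref")
            or re.match(rb"\d+\s+\d+\s+obj", target) is not None
        )

    return {
        "has_trailer": match is not None,
        "xref_offset_valid": points_at_xref,
        # One %%EOF per saved revision; linearized files legitimately contain two.
        "eof_markers": data.count(b"%%EOF"),
    }
```

A file with `has_trailer` False was probably cut off mid-write, while a high `eof_markers` count suggests several incremental update layers that are worth flattening by rewriting the file with qpdf before merging.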

Preventing this in future

For occasional one-off merges of unimportant files, online tools are convenient and the failure rate is acceptable. For anything important — contracts, reports for distribution, archived records — use local tools.

The reliable options:

qpdf — free, scriptable, lossless. The default choice if you’re comfortable with the command line.

Adobe Acrobat — paid, GUI-based, lossless. The default choice if you have Acrobat already.

pikepdf (Python wrapper around qpdf) — free, scriptable, programmatic. Useful for batch workflows. See the complete guide to pikepdf for details.

LibreOffice Draw — free, GUI-based. Less elegant for merging than the dedicated tools but works as a fallback.

If you’re regularly merging PDFs in a workflow, set up qpdf or pikepdf as a script and stop using online services entirely. Setup takes about an afternoon, and it removes the most common source of corrupted merges.
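A batch merge script along these lines can be quite short. The following is a minimal sketch using pikepdf, assuming it is installed (pip install pikepdf); `merge_pdfs` is an illustrative name, and the source files are kept open until the output is saved as a conservative choice.

```python
from contextlib import ExitStack
from pathlib import Path
from typing import Iterable


def merge_pdfs(sources: Iterable[Path], output: Path) -> int:
    """Append every page of each source to a new PDF; return the page count written."""
    import pikepdf  # third-party; imported here so the module loads without it

    with ExitStack() as stack:
        merged = stack.enter_context(pikepdf.Pdf.new())
        for source in sources:
            # Keep each source open until the merged file has been saved.
            src = stack.enter_context(pikepdf.Pdf.open(source))
            merged.pages.extend(src.pages)
        merged.save(output)
        return len(merged.pages)
```

For example, `merge_pdfs(sorted(Path("inbox").glob("*.pdf")), Path("merged.pdf"))` would merge a folder in filename order; the `inbox` directory is a placeholder. Comparing the returned count against the expected total is a quick sanity check.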

If your PDF was split rather than merged and the split pages are corrupted, the guide “PDF split pages are corrupted” covers that scenario. The root causes overlap heavily, since it’s typically the same online tools producing both kinds of broken output. For broken PDFs from other origins, the PDF repair pillar lists symptom-specific guides for each.

Last verified: April 2026