Complete Guide to Ghostscript for PDF Recovery
Ghostscript is the tool you reach for after qpdf and pikepdf have failed. It does not repair PDFs in the structural sense those tools do — it re-interprets the file as a stream of drawing instructions and writes a brand new PDF from scratch. This works on some severely damaged files because it does not depend on the original structure being parseable. The trade-off is significant and is worth understanding before you run it: Ghostscript routinely strips form fields, annotations, digital signatures, bookmarks, tagged accessibility structure, and JavaScript actions. What you get back looks like the original document but is, structurally, a different file with much less in it.
If your PDF is a printable document and you only need the visual content, Ghostscript usually works. If your PDF is a fillable form, a signed contract, or a tagged accessible document, Ghostscript will give you back something you didn’t want.
This guide covers the licensing situation you should understand before using Ghostscript commercially, installation across platforms, the repair and optimization recipes that actually work, what is lost during re-rendering, and how to read the output.
When to use Ghostscript
Ghostscript is the right tool for:
PDF recovery when structural repair has failed. qpdf reports the file as unrecoverable, pikepdf throws PdfError, but the file still opens partially in some readers — Ghostscript can sometimes interpret enough of the content stream to write a working file.
File size reduction beyond what structural tools achieve. Ghostscript’s PDFSETTINGS presets re-encode images and subset fonts aggressively. Reductions of 5-10x are common on image-heavy PDFs.
Converting between PDF versions. Producing a PDF/A archival file, downgrading a PDF 2.0 file to 1.4 for compatibility with older readers, or normalizing a mixed corpus to a single version.
Re-rasterizing a problematic PDF. When the original has rendering quirks — missing fonts, transparency that breaks specific viewers, unusual color profiles — re-rendering through Ghostscript produces a more conservative output that displays consistently.
Stripping interactivity from a sensitive document. As an unintended side effect, Ghostscript removes JavaScript and form actions. This is sometimes desirable for security review.
Ghostscript is not the right tool for:
First-line PDF repair. Try qpdf first. It preserves everything Ghostscript loses and works on the more common failure modes.
PDFs you need to keep editable. Form fields, annotations, comments, and bookmarks are commonly lost.
Signed PDFs you need to keep verified. Digital signatures are always invalidated by Ghostscript re-rendering. There is no flag that preserves them — re-rendering by definition produces a different byte sequence and breaks the signature.
Accessible PDFs. Tagged structure for screen readers does not survive re-rendering.
Production use without checking the license. Ghostscript’s open-source license is AGPL v3, which has real implications for any product or service that distributes Ghostscript or makes it available to users over a network. See the licensing section below.
A note on licensing
Ghostscript is dual-licensed by Artifex Software: the open-source distribution is AGPL v3, and a commercial license is available for separate purchase. This matters more than it first appears.
The AGPL is a strong copyleft license. If you distribute Ghostscript with your software, modify Ghostscript, or — critically — make Ghostscript available to users over a network as part of a service, the AGPL requires you to release the source code of your product under the same license. Many SaaS products, server-side PDF processing services, and embedded uses of Ghostscript fall under this requirement.
For personal use, internal use within an organization, and use in clearly AGPL-compatible open-source projects, the AGPL is fine. For anything else, talk to Artifex about a commercial license, or use one of the alternatives at the end of this guide. This guide assumes you have either confirmed the AGPL is acceptable for your use or have obtained a commercial license.
Installation
macOS
Homebrew installs Ghostscript and all its dependencies:
brew install ghostscript
The command-line binary is gs. Verify with:
gs --version
Linux
Ghostscript is in every major distribution’s package repository:
# Debian, Ubuntu, Mint
sudo apt install ghostscript
# Fedora
sudo dnf install ghostscript
# Arch
sudo pacman -S ghostscript
Distribution packages may lag the upstream release. If you need a specific recent version, the Artifex GitHub releases page provides source tarballs you can build yourself.
Windows
Download the official installer from ghostscript.com and choose the AGPL release for personal use or the commercial release if you have a license. Both 32-bit and 64-bit installers are provided; the 64-bit installer is the right choice for any modern Windows system.
After installation, the executable is named gswin64c.exe (the c stands for “console” — the GUI variant gswin64.exe is rarely needed for repair work). The installer optionally adds Ghostscript to your PATH; if it doesn’t, you’ll need to either add it manually or invoke the full path.
Verify with:
gswin64c --version
In the recipes below, examples use gs for brevity. Substitute gswin64c on Windows.
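If the same script has to run on both Unix-like systems and Windows (for example under Git Bash or WSL), the binary name can be detected rather than hard-coded. The sketch below is one way to do that; the helper name pick_binary and the variable GS_BIN are illustrative, not part of Ghostscript.

```shell
# Print the first candidate name that resolves to a command on PATH.
# Returns non-zero if none of the candidates is found.
pick_binary() {
    for candidate in "$@"; do
        if command -v "$candidate" >/dev/null 2>&1; then
            printf '%s\n' "$candidate"
            return 0
        fi
    done
    return 1
}

# Prefer the Unix name, then the Windows console binaries.
GS_BIN=$(pick_binary gs gswin64c gswin32c) || GS_BIN=""
if [ -n "$GS_BIN" ]; then
    echo "Using Ghostscript binary: $GS_BIN"
else
    echo "Ghostscript not found on PATH" >&2
fi
```

Later commands can then be written as "$GS_BIN" -o output.pdf ... instead of assuming gs.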
Common recipes
All examples assume input.pdf is the file you want to recover or transform.
Recover a damaged PDF by re-rendering
The basic recovery pattern. Ghostscript reads the input and writes a fresh PDF using its own pdfwrite device:
gs -o output.pdf -sDEVICE=pdfwrite input.pdf
The -o flag is shorthand for -sOutputFile= plus the standard non-interactive flags -dNOPAUSE -dBATCH. For explicit control, the full form is:
gs -dNOPAUSE -dBATCH -dSAFER -sDEVICE=pdfwrite -sOutputFile=output.pdf input.pdf
-dSAFER is a security flag that disables PostScript file-system access during processing. It is the right default for any file you didn’t write yourself. As of Ghostscript 9.50, -dSAFER is the default behavior — but including it explicitly does no harm and makes the command’s intent clear.
If Ghostscript can read enough of the input to interpret it, the output is a clean PDF containing the visual content of the original. Assume that form fields, annotations, signatures, and bookmarks are gone from the output unless you have checked it and found otherwise.
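Because this guide recommends trying qpdf before Ghostscript, the two steps can be combined in a small wrapper. This is a minimal sketch, assuming both qpdf and gs are on PATH; the function name recover_pdf is hypothetical. It relies on qpdf's documented exit codes, where 0 means success and 3 means the file was written with warnings.

```shell
# Try structural repair with qpdf first; fall back to Ghostscript re-rendering.
# Prints which tool produced the output. Returns non-zero if both fail.
recover_pdf() {
    input=$1
    output=$2

    qpdf_status=0
    qpdf "$input" "$output" >/dev/null 2>&1 || qpdf_status=$?
    # qpdf exits 0 on success and 3 when it succeeded with warnings.
    if [ "$qpdf_status" -eq 0 ] || [ "$qpdf_status" -eq 3 ]; then
        echo "qpdf"
        return 0
    fi

    # Last resort: re-render. Interactive content should be assumed lost.
    if gs -q -dSAFER -o "$output" -sDEVICE=pdfwrite "$input" >/dev/null 2>&1; then
        echo "ghostscript"
        return 0
    fi

    rm -f "$output"   # do not leave a partial file behind
    return 1
}
```

Calling recover_pdf input.pdf output.pdf prints qpdf or ghostscript, so a batch job can log which files went through the lossy path.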
Reduce file size with PDFSETTINGS presets
Ghostscript's pdfwrite device accepts a -dPDFSETTINGS preset that adjusts image resolution, color compression, and font subsetting. Besides /default, there are four named quality presets:
# Heavy compression, screen-quality images
gs -o smaller.pdf -sDEVICE=pdfwrite -dPDFSETTINGS=/screen input.pdf
# Moderate compression, ebook-quality
gs -o smaller.pdf -sDEVICE=pdfwrite -dPDFSETTINGS=/ebook input.pdf
# Light compression, print-quality
gs -o smaller.pdf -sDEVICE=pdfwrite -dPDFSETTINGS=/printer input.pdf
# Minimal compression, prepress-quality
gs -o smaller.pdf -sDEVICE=pdfwrite -dPDFSETTINGS=/prepress input.pdf
/screen produces the smallest files by downsampling images to 72 DPI. /ebook at 150 DPI is a sensible default for general-purpose compression. /printer and /prepress both target 300 DPI, with /prepress preserving the most quality.
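When it is unclear which preset gives an acceptable size, one option is to run all four and compare the results. The sketch below does that; the function name compare_presets and the output naming scheme are assumptions made for illustration.

```shell
# Re-render one input with each quality preset and report the resulting sizes.
# Output files are written next to the input as <name>-<preset>.pdf.
compare_presets() {
    input=$1
    for preset in screen ebook printer prepress; do
        out="${input%.pdf}-$preset.pdf"
        # Skip a preset if Ghostscript fails on it rather than aborting the loop.
        gs -q -o "$out" -sDEVICE=pdfwrite -dPDFSETTINGS=/"$preset" "$input" || continue
        size=$(wc -c < "$out" | tr -d ' ')
        printf '%s %s bytes\n' "$preset" "$size"
    done
}
```

Running compare_presets input.pdf prints one line per preset. File size alone does not show image quality, so open the smallest candidate and look at it before settling on a preset.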
Downgrade to an older PDF version
For compatibility with older readers or specific compliance requirements:
gs -o old-version.pdf -sDEVICE=pdfwrite -dCompatibilityLevel=1.4 input.pdf
Valid values run from 1.3 through 2.0. PDF 1.4 (Acrobat 5 era) is the most widely compatible modern target. PDF 1.7 corresponds to the ISO 32000-1 standard. PDF 2.0 (ISO 32000-2) is the latest revision.
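A quick way to confirm the downgrade took effect is to read the version from the file header, which is the first line of any PDF. The helper below is a sketch with a hypothetical name, pdf_header_version. It only inspects the header, so it will not notice a version override stored in the document catalog.

```shell
# Print the version from the "%PDF-x.y" header on the first line of a file.
# Prints nothing if the first line does not look like a PDF header.
pdf_header_version() {
    head -n 1 "$1" | LC_ALL=C sed -n 's/^%PDF-\([0-9]\.[0-9]\).*/\1/p'
}
```

For example, comparing pdf_header_version input.pdf with pdf_header_version old-version.pdf shows whether the header changed as requested.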
Convert to PDF/A for archival
PDF/A is a subset of PDF designed for long-term archival. Ghostscript can produce PDF/A-1, PDF/A-2, and PDF/A-3 conformant output:
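gs -dPDFA=1 -dBATCH -dNOPAUSE -dSAFER -sColorConversionStrategy=RGB \
-sDEVICE=pdfwrite -dPDFACompatibilityPolicy=1 \
-sOutputFile=archival.pdf input.pdf
-dPDFA selects the conformance level (1, 2, or 3). -dPDFACompatibilityPolicy=1 tells Ghostscript to drop or adjust content that cannot be represented in PDF/A, emit a warning, and continue processing. Set it to 2 to abort on non-conforming content. For output that passes a PDF/A validator, Ghostscript's documentation also calls for a customized PDFA_def.ps file, which embeds an ICC output profile, passed on the command line before the input file. Validate the result with a dedicated PDF/A checker rather than assuming conformance.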
Re-render only a range of pages
The -dFirstPage and -dLastPage options restrict which pages are processed:
gs -o pages-5-10.pdf -sDEVICE=pdfwrite -dFirstPage=5 -dLastPage=10 input.pdf
This is useful when most of a damaged PDF re-renders cleanly but a specific range produces errors — you can isolate and exclude the problem pages.
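To find out which pages are the problem ones, each page can be re-rendered on its own with -dPDFSTOPONERROR so that any error produces a non-zero exit code. The following is a sketch, assuming you already know the page count (for example from a viewer); the function name find_bad_pages is hypothetical.

```shell
# Re-render each page on its own and print the numbers of the pages that fail.
# Arguments: input file, total page count (supplied by the caller).
find_bad_pages() {
    input=$1
    total=$2
    scratch=$(mktemp)
    page=1
    while [ "$page" -le "$total" ]; do
        # -dPDFSTOPONERROR makes Ghostscript abort instead of recovering silently.
        if ! gs -q -dSAFER -dPDFSTOPONERROR -o "$scratch" -sDEVICE=pdfwrite \
                -dFirstPage="$page" -dLastPage="$page" "$input" >/dev/null 2>&1; then
            echo "$page"
        fi
        page=$((page + 1))
    done
    rm -f "$scratch"
}
```

This starts Ghostscript once per page, which is slow on long documents but makes it easy to pick page ranges that skip the failures.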
Concatenate multiple PDFs into one
Ghostscript can merge PDFs as a side effect of re-rendering them all into a single output:
gs -o merged.pdf -sDEVICE=pdfwrite a.pdf b.pdf c.pdf
This works but loses the same metadata that re-rendering always loses. For form-preserving merge, use qpdf or pikepdf.
Convert PDF pages to images
The pngalpha device renders to PNG with transparency support. The %03d placeholder in the output name expands to the zero-padded page number, producing one file per page:
gs -o page_%03d.png -sDEVICE=pngalpha -r150 input.pdf
-r150 sets the resolution to 150 DPI. Common alternatives: -r72 for screen preview, -r300 for print quality. The output device controls format: jpeg, tiff24nc, pnggray for grayscale PNG, and others.
What Ghostscript loses during re-rendering
This is the section to read before deciding whether Ghostscript is the right tool. The list of things that do not survive re-rendering:
Interactive form fields (AcroForms and XFA). A blank form with empty fields will come back as a flat image-like PDF with nothing fillable.
Annotations and comments. Highlights, sticky notes, text comments, and review markup are removed.
Digital signatures. The cryptographic data is stripped. Even if a “signature” appears visible in the output, it is a flattened image with no validation.
Bookmarks (the navigation tree). The outline that lets readers jump between sections of long documents is usually lost. Some Ghostscript builds preserve a subset; do not rely on it.
Tagged structure for accessibility. PDF tags that screen readers depend on are removed. The output is visually identical but inaccessible.
JavaScript and embedded actions. Any scripted behavior in the original is gone. (This is occasionally what you want.)
File attachments. Embedded files inside the PDF are not preserved.
Layers (Optional Content Groups). Layered PDFs are flattened to a single layer.
XMP metadata. Some metadata is preserved, but custom XMP namespaces are typically not.
If any of the above matters for your use case, do not use Ghostscript as a repair tool without a fallback plan for restoring the lost content.
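Before re-rendering, it can help to check roughly which of these features the input contains. The sketch below searches the raw bytes of the file for well-known PDF dictionary keys. It is a heuristic only: keys stored inside compressed object streams will be missed, and a substring such as /Sig also matches related keys like /SigFlags. The function name list_at_risk_features is hypothetical.

```shell
# Heuristic pre-flight check: print the names of PDF features found in the
# raw file bytes that re-rendering is likely to drop.
list_at_risk_features() {
    for key in AcroForm XFA Annots Sig Outlines StructTreeRoot JavaScript EmbeddedFiles OCProperties; do
        # -a treats the binary file as text; LC_ALL=C avoids encoding errors.
        if LC_ALL=C grep -a -q -- "/$key" "$1"; then
            echo "$key"
        fi
    done
}
```

An empty result does not prove the file is safe to re-render, but a non-empty one is a clear signal to keep the original and plan for restoring what is lost.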
Reading Ghostscript’s output
Ghostscript is verbose by default. Understanding the messages helps decide whether the output file is trustworthy.
A clean re-render produces output like:
GPL Ghostscript 10.08.0 (2025-08-26)
Copyright (C) 2025 Artifex Software, Inc. All rights reserved.
Processing pages 1 through 12.
Page 1
Page 2
...
Page 12
No warnings, exit code 0. The output file is generally trustworthy.
When Ghostscript encounters problems but recovers:
**** Error: An error occurred while reading an XREF table.
**** The file has been damaged. This may have been caused
**** by a problem while converting or transmitting the file.
**** Ghostscript will attempt to recover the data.
Ghostscript continues and produces output. The output is usually usable, but spot-check it against the original (or whatever you remember of it) before trusting it. Recovery is best-effort, and silent omissions are possible.
When Ghostscript cannot proceed:
**** Unrecoverable error: undefined in <something>
Exit code is non-zero. The output file, if any was written, is incomplete and should not be used.
The -dPDFSTOPONERROR flag changes the behavior to abort on the first error rather than attempting recovery — useful when you want to be told immediately that a file is broken rather than getting partial output.
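The three outcomes described above can be told apart in a script by combining the exit code with a search of the log for the **** marker that prefixes the messages shown in this section. This is a rough sketch: the function name classify_gs_run is hypothetical, and message wording can differ between Ghostscript versions, so treat the marker check as an assumption to verify against your own build.

```shell
# Run a re-render and print "clean", "recovered", or "failed".
classify_gs_run() {
    input=$1
    output=$2
    status=0
    # Capture stdout and stderr together so no message is missed.
    log=$(gs -dSAFER -o "$output" -sDEVICE=pdfwrite "$input" 2>&1) || status=$?
    if [ "$status" -ne 0 ]; then
        echo "failed"
    elif printf '%s\n' "$log" | grep -q -F '****'; then
        echo "recovered"
    else
        echo "clean"
    fi
}
```

A batch job can use the result to route files: accept clean output, queue recovered output for manual spot-checks, and discard anything from a failed run.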
Limitations and known issues
Re-rendering is not repair. Ghostscript does not fix the original file. It produces a new one based on its interpretation of the original. The original is unchanged.
Loss of fidelity is structural, not accidental. The losses listed above are inherent to the re-rendering approach. No flag changes them.
Font rendering can shift. If the original used embedded fonts that Ghostscript cannot fully interpret, the output may use substitute fonts. Letter spacing and line breaks can change subtly.
Color management can shift. Ghostscript’s color handling is sophisticated but does not always exactly match Acrobat’s. Critical color work (print prepress, brand colors) should be inspected after re-rendering.
AGPL licensing affects redistribution. Re-stating from above because it matters: the open-source license is not permissive. Server-side and SaaS use is restricted. Get a commercial license or use an alternative if AGPL terms don’t fit your situation.
Memory usage scales with document complexity. Heavily-illustrated or large PDFs can exhaust memory. Consider page-range processing for very large files.
Not actively developed for repair-specific use cases. Ghostscript’s primary design goal is PostScript and PDF interpretation for printing and conversion, not repair. Repair behavior is a side effect of re-rendering, not a designed feature.
Alternatives
qpdf is the right first choice for PDF structural repair. It preserves form fields, annotations, and bookmarks, and it keeps signature fields in place (although rewriting the file still invalidates the signatures themselves). It is faster, has a permissive license, and handles the more common failure modes. Use Ghostscript only after qpdf has failed. See the complete guide to qpdf.
pikepdf is qpdf’s Python wrapper. Same backend, same preservation behavior, programmatic interface. See the complete guide to pikepdf.
mutool clean (part of MuPDF) offers structural cleanup with mutool clean -ggg input.pdf output.pdf. Lighter-weight than Ghostscript, with similar limits to qpdf for genuinely damaged files. AGPL-licensed like Ghostscript itself.
Apache PDFBox has its own repair behavior built into the parser, accessible programmatically through the Java API. License is permissive (Apache 2.0). See the complete guide to Apache PDFBox.
Adobe Acrobat Pro has built-in recovery that runs automatically when you open a damaged PDF. For one-off recovery on a desktop, this is often the simplest path. Acrobat preserves more of the file’s structure than Ghostscript does.
Commercial GUI tools like Stellar Repair for PDF wrap similar techniques behind a desktop interface. Useful for non-technical users dealing with one urgent file. Not worth the recurring cost for repeat use when the underlying free tools achieve the same outcomes.
Last verified: April 2026