PDF Bookmarks, Annotations, or Form Data Missing After Repair

You opened the recovered PDF and the bookmarks are gone, the form fields are flat, the annotations have vanished, or the digital signature is showing as invalid. This is one of the most frustrating outcomes of PDF recovery, and the honest answer is that some of these features can be partially reconstructed and others cannot. What follows is the practical recovery path, ordered from most likely to work to least, and a clear explanation of why digital signatures specifically are unrecoverable by any tool.

Quick fix

The single highest-success action is to retry the repair on the original damaged file using a structural repair tool that preserves these features, rather than the destructive tool that lost them.

Locate the original damaged PDF — the file as it was before any repair was attempted. If you overwrote it, check your backup, your email’s Sent folder, or the application that produced it.
Run qpdf --object-streams=preserve damaged.pdf recovered.pdf against the original. This rebuilds the cross-reference table without re-rendering page content, preserving all non-visual objects.
If that succeeds, the new file should contain your bookmarks, annotations, and form data intact. Open it and verify before discarding the previous recovered version.
If qpdf reports operation succeeded with warnings (exit code 3), the file is still likely usable. If it exits with an error (exit code 2) and writes no usable output, structural repair has reached its limit and the per-feature reconstruction options below apply.
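
The retry-and-check step can also be scripted. What follows is a minimal Python sketch, assuming qpdf is installed and on your PATH. The function names and outcome labels are illustrative, not part of qpdf. It relies on qpdf's documented exit codes: 0 for success, 3 for success with warnings, and 2 for errors.

```python
import subprocess

# qpdf's documented exit codes: 0 = success, 2 = errors, 3 = warnings only.
# The labels on the right are our own, chosen for this sketch.
QPDF_OUTCOMES = {
    0: "clean",
    3: "usable-with-warnings",
    2: "failed",
}


def classify_qpdf_exit(code: int) -> str:
    """Map a qpdf exit code to a coarse outcome label."""
    return QPDF_OUTCOMES.get(code, "unknown")


def structural_repair(damaged: str, recovered: str) -> str:
    """Run qpdf against the original and report how the attempt went.

    qpdf writes to a separate output file, so `damaged` is never modified.
    """
    result = subprocess.run(
        ["qpdf", "--object-streams=preserve", damaged, recovered],
        capture_output=True,
        text=True,
    )
    if result.stderr:
        print(result.stderr.strip())  # show qpdf's diagnostics, if any
    return classify_qpdf_exit(result.returncode)


# Example call (file names are placeholders):
# outcome = structural_repair("damaged.pdf", "recovered.pdf")
```

An outcome of "usable-with-warnings" still calls for opening the file and checking bookmarks, annotations, and form data by eye before discarding anything.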

If you no longer have the original damaged file — only the post-recovery version with features missing — there is no way back to those features through the recovered file alone. The next sections describe what is possible if the original is still available.

If that didn’t work

When structural repair on the original fails or partially succeeds, individual features can sometimes be extracted from the original and grafted into the recovered file. Each feature has a different recovery path.

Form data

Use pdftk-java (the maintained fork of pdftk) to extract form field data from the original as an FDF file, then apply it to the recovered version:

pdftk damaged-original.pdf generate_fdf output data.fdf
pdftk recovered.pdf fill_form data.fdf output recovered-with-data.pdf

This works as long as the damaged original is readable enough for pdftk to parse the form structure, which is often the case even when the file won’t render correctly in Acrobat. The form field names must match between the damaged and recovered files; if the recovered file went through a destructive re-render, field names may have been replaced with auto-generated equivalents and the FDF data won’t match.
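
Before applying the FDF, it can help to confirm that the field names really do match. The sketch below is one way to do that in Python, assuming pdftk is on your PATH and using its dump_data_fields operation, which prints one "FieldName: ..." line per field. The helper names are hypothetical.

```python
import subprocess

PREFIX = "FieldName: "


def parse_field_names(dump: str) -> set[str]:
    """Collect the FieldName entries from `pdftk <file> dump_data_fields` output."""
    names = set()
    for line in dump.splitlines():
        # "FieldNameAlt: ..." lines do not match because of the colon and space.
        if line.startswith(PREFIX):
            names.add(line[len(PREFIX):].strip())
    return names


def dump_fields(pdf_path: str) -> str:
    """Return pdftk's field report for a PDF (requires pdftk on PATH)."""
    return subprocess.run(
        ["pdftk", pdf_path, "dump_data_fields"],
        capture_output=True,
        text=True,
        check=True,
    ).stdout


def missing_fields(original_dump: str, recovered_dump: str) -> set[str]:
    """Field names present in the original but absent from the recovered file."""
    return parse_field_names(original_dump) - parse_field_names(recovered_dump)


# Example call (file names are placeholders):
# lost = missing_fields(dump_fields("damaged-original.pdf"), dump_fields("recovered.pdf"))
```

An empty result suggests fill_form has matching targets for every field. A non-empty one lists the fields whose data will be dropped silently.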

For documents using the older XFA form format (common in some government and enterprise PDFs), this approach often fails — XFA is an XML-based form layer that doesn’t survive re-rendering at all and is not handled cleanly by pdftk. There is no reliable recovery path for XFA forms once they have been destroyed by re-rendering.

Bookmarks (outlines)

If the original is partially readable, pikepdf can copy the outline tree from one file to another. The general approach uses the open_outline() context manager to access and modify the outline structure. Consult the current pikepdf documentation for the exact API, as the outline interface has evolved across versions.

The important caveat: bookmarks reference specific pages and coordinates within the destination file. If the recovered file has a different page count, different page sizes, or pages re-rendered to different coordinate spaces (which Ghostscript output commonly produces), bookmark targets will land at the wrong locations. Manual repair after transplant is usually required.
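
As a rough illustration of the transplant and of the page-count caveat, here is a Python sketch built on pikepdf's open_outline() and OutlineItem. It is written under several assumptions: a reasonably recent pikepdf, an original that pikepdf can still open, and bookmarks that use explicit page destinations (named destinations and actions are skipped). The three function names are hypothetical, and the pikepdf calls should be checked against the documentation for your installed version.

```python
def usable_entries(entries, page_count):
    """Keep only bookmarks whose target page exists in the recovered file.

    `entries` is a list of (title, page_index, children) tuples. An entry with
    an unresolved or out-of-range page_index is dropped with its children.
    """
    kept = []
    for title, page_index, children in entries:
        if page_index is not None and 0 <= page_index < page_count:
            kept.append((title, page_index, usable_entries(children, page_count)))
    return kept


def read_outline(path):
    """Read the outline of `path` as nested (title, page_index, children) tuples."""
    import pikepdf  # third-party; imported lazily so usable_entries needs no dependency

    with pikepdf.open(path) as pdf:
        # Map each page object's (object number, generation) to its page index.
        index_of = {page.obj.objgen: i for i, page in enumerate(pdf.pages)}

        def convert(item):
            dest = item.destination
            page_index = None
            # Only explicit destinations such as [page /Fit] are resolved here.
            if isinstance(dest, pikepdf.Array) and len(dest) > 0:
                page_index = index_of.get(getattr(dest[0], "objgen", None))
            return (str(item.title), page_index, [convert(c) for c in item.children])

        with pdf.open_outline() as outline:
            return [convert(item) for item in outline.root]


def write_outline(recovered_path, out_path, entries):
    """Append `entries` to the outline of the recovered file and save to `out_path`."""
    import pikepdf

    def build(entry):
        title, page_index, children = entry
        item = pikepdf.OutlineItem(title, page_index)  # integer = zero-based page number
        item.children.extend(build(child) for child in children)
        return item

    with pikepdf.open(recovered_path) as pdf:
        kept = usable_entries(entries, len(pdf.pages))
        with pdf.open_outline() as outline:
            outline.root.extend(build(entry) for entry in kept)
        pdf.save(out_path)


# Example call (file names are placeholders):
# write_outline("recovered.pdf", "recovered-with-bookmarks.pdf", read_outline("damaged-original.pdf"))
```

The sketch carries over page numbers only and discards the original coordinates, so each bookmark lands at the top of its page. That sidesteps the coordinate-space problem at the cost of precision.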

Annotations

This is the hardest case. Annotations include sticky notes, highlights, comments, and form-like markup, and each is anchored to specific page coordinates. When the recovered file was produced by re-rendering, those coordinates no longer correspond to the same visual locations, so even if you successfully transplant the annotation objects, they will appear in the wrong places on the page.

The pdfannots tool can extract annotation content as plain text from a partially-readable original. This gives you the textual content of comments and highlights, which you can then re-create manually in the recovered file using Acrobat or another PDF editor. Tedious, but for documents with critical review comments, sometimes the only option.

If the recovered file came from structural repair (qpdf, pikepdf) rather than re-rendering, annotation transplant is more reliable because the coordinate spaces typically match.

Digital signatures

Cannot be recovered. This is the most common point of confusion in PDF recovery, and the answer is unambiguous: a digital signature cryptographically covers a specific byte sequence, namely the byte ranges listed in the signature’s /ByteRange entry. Rewriting those bytes invalidates the signature, even when a structural repair produces a file that looks identical on screen. There is no software workaround. Tools that claim to “repair” signatures are removing the signature requirement, not restoring cryptographic validity.
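
A toy model makes the point concrete. The Python sketch below hashes the byte ranges around a placeholder signature, in the same [offset, length, offset, length] layout a PDF signature's /ByteRange array uses. It is an illustration only: the sample bytes are invented, and a real validator also checks the signer's certificate and the signature container, which this does not attempt.

```python
import hashlib


def digest_of_ranges(data: bytes, byte_range: list[int]) -> str:
    """SHA-256 over ranges given as [offset1, length1, offset2, length2, ...]."""
    h = hashlib.sha256()
    for offset, length in zip(byte_range[0::2], byte_range[1::2]):
        h.update(data[offset:offset + length])
    return h.hexdigest()


# Stand-in for a signed file: content, a signature placeholder, more content.
original = b"%PDF-1.7 body-A <SIGNATURE> body-B %%EOF"
gap_start = original.index(b"<SIGNATURE>")
gap_end = gap_start + len(b"<SIGNATURE>")
byte_range = [0, gap_start, gap_end, len(original) - gap_end]
signed_digest = digest_of_ranges(original, byte_range)

# A "repair" that changes a single whitespace byte inside the covered range.
repaired = original.replace(b" body-B", b"\nbody-B")
repaired_digest = digest_of_ranges(repaired, byte_range)

print(signed_digest == repaired_digest)  # prints False
```

The repaired bytes describe the same content, yet the digest no longer matches, and nothing short of the signer's private key can produce a signature over the new digest.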

If signature validity is essential for legal, audit, or compliance reasons, the only path forward is to go back to the original signer: either obtain a fresh copy of the signed document, or have them re-sign the repaired one.

Advanced recovery

For valuable documents where the original cannot be opened by any structural tool, a hybrid approach sometimes salvages more than either tool alone.

The technique combines two outputs: use Ghostscript’s re-render as the visual base (because it produces a file that opens cleanly), then graft surviving feature objects from the original into it. This requires reading the damaged original at the binary level, identifying intact annotation, outline, and form objects by their dictionary keys (for example /Type /Annot, /Type /Outlines, and the /AcroForm entry in the document catalog), and using pikepdf to insert them into the rendered output. Coordinate misalignment will remain a problem and may require manual adjustment.
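
The identification step can be approximated with a byte-level scan. The Python sketch below is a starting point under loud assumptions: it only sees uncompressed indirect objects, so anything stored inside a compressed object stream is invisible to it, and a stream that happens to contain the bytes "endobj" will confuse it. The function name and category labels are hypothetical.

```python
import re

# Matches uncompressed indirect objects of the form "12 0 obj ... endobj".
OBJ_RE = re.compile(rb"(\d+)\s+(\d+)\s+obj\b(.*?)\bendobj", re.DOTALL)

# Dictionary keys that identify each feature. "acroform" flags the object
# that carries the /AcroForm entry, which is normally the document catalog.
MARKERS = {
    "annotation": re.compile(rb"/Type\s*/Annot\b"),
    "outline": re.compile(rb"/Type\s*/Outlines\b"),
    "acroform": re.compile(rb"/AcroForm\b"),
}


def find_feature_objects(raw: bytes) -> dict[str, list[tuple[int, int]]]:
    """Return (object number, generation) pairs for each feature category."""
    found = {name: [] for name in MARKERS}
    for match in OBJ_RE.finditer(raw):
        obj_id = (int(match.group(1)), int(match.group(2)))
        body = match.group(3)
        for name, marker in MARKERS.items():
            if marker.search(body):
                found[name].append(obj_id)
    return found


# Example call (file name is a placeholder):
# with open("damaged-original.pdf", "rb") as f:
#     print(find_feature_objects(f.read()))
```

The object numbers it reports tell you which parts of the damaged file are worth extracting by hand. The grafting itself is still manual work in pikepdf.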

This is hours of work for a single document. It is worth it for an irreplaceable contract or signed agreement. It is not worth it for a routine file. Before going down this path, exhaust the alternatives: contact the document’s source for a fresh copy, check version history on cloud storage (OneDrive, Google Drive, Dropbox all retain previous versions for some period), or check whether the document exists in another format upstream.

Why this happens

PDF repair tools fall into two fundamentally different categories, and the difference between them determines which features survive.

Structural repair tools (qpdf, pikepdf, Adobe Acrobat’s built-in repair) read the file’s existing objects, fix the cross-reference table that maps to them, and write a new file containing the same objects with a corrected structure. Bookmarks, annotations, form fields, and signature widgets are objects in this sense, so they survive intact. The visual appearance of a digital signature is preserved, but its cryptographic validity is lost with any change to the byte sequence, including benign repairs.

Re-rendering tools — Ghostscript’s pdfwrite device, online “repair” services that produce visual output, OCR-based recovery, some commercial GUI repair tools — read what they can render and produce a new file by re-creating the visible content from scratch. Non-visual objects do not survive because they are not part of the rendered output. Form fields become flat printed text. Annotations vanish. Bookmarks reference a structure that no longer exists in the new file. Signatures are gone because there is no original byte sequence to verify against.

Re-rendering is sometimes necessary. When structural repair fails because content streams are damaged rather than just the index, re-rendering may be the only way to salvage the visible content. But it is a tradeoff with a real cost, and most users discover that cost only after the fact.

The asymmetry between structural and destructive recovery is the most important thing to understand about PDF repair tooling. The first choice you make about which tool to run determines what features will exist when the dust settles.

Preventing this in future

Three habits significantly reduce the risk of feature loss during PDF recovery.

Always try structural repair first. Run qpdf --object-streams=preserve or qpdf --linearize before reaching for Ghostscript or any tool that re-renders. If structural repair succeeds, all your non-visual features are preserved. If it fails, you have lost nothing — the original file is unchanged and the destructive options are still available.

Keep the original. Before any repair attempt, copy the damaged file to a separate location with a clear name like original-damaged.pdf. Some tools overwrite the input file by default, and a failed repair on the only copy is a much worse situation than a successful repair on a backup. The disk space cost of keeping a copy is trivial compared to the time cost of trying to undo a destructive operation.
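
If repairs are scripted, the copy step can be made automatic. This is a small Python sketch using only the standard library. The function name is hypothetical, and marking the backup read-only is an extra precaution of ours, not something any repair tool requires.

```python
import shutil
from pathlib import Path


def preserve_original(damaged: str, backup_name: str = "original-damaged.pdf") -> Path:
    """Copy the damaged file next to itself before any repair attempt.

    Refuses to overwrite an existing backup, so a second run cannot replace
    the untouched original with an already-repaired file.
    """
    source = Path(damaged)
    backup = source.with_name(backup_name)
    if backup.exists():
        raise FileExistsError(f"{backup} already exists; not overwriting it")
    shutil.copy2(source, backup)  # copy2 also preserves timestamps
    backup.chmod(0o444)           # read-only, as a guard against in-place tools
    return backup
```

Calling preserve_original("report.pdf") before the first repair command leaves original-damaged.pdf beside the file, and every later attempt can start from that copy.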

For valuable signed documents, accept that repair invalidates signatures and plan accordingly. A signed PDF that has been repaired is no longer cryptographically valid even if the visual appearance is intact. If a signature must remain legally valid, no amount of repair work substitutes for a fresh signed copy. Build that into your workflow before the document needs repair, not after.

For the full context on PDF repair tool choice, see the complete PDF repair guide, which covers when each category of tool is appropriate. The complete guide to qpdf walks through the structural repair recipes that preserve these features. If you have not yet attempted any repair and are choosing your first tool now, the guide to PDFs that won’t open covers the structural-first sequence in full. For the specific case of a digital signature reporting as invalid after any modification, the guide to PDF signed-document-altered errors explains what verification failures mean and what your remaining options are.

Last verified: April 2026