Excel Repair: A Complete Guide to Recovering Damaged Spreadsheets
A corrupted Excel file is a stressful find, especially if it holds the only copy of a quarterly model, a client’s data, or a report due today. The good news: most Excel corruption is recoverable, often without leaving Excel itself. The harder truth: some workbooks are genuinely beyond repair, and recognizing those early saves hours of effort.
This guide covers how Excel files are built, what goes wrong, how to fix the common cases in order of effort, which tools handle which problems, and when a file is gone for good.
Common problems
Most Excel problems fall into a handful of categories. If yours matches one of these, skip straight to the specific guide — each links to the fastest path to a working file.
The file won’t open at all. Excel may show “The file is corrupt and cannot be opened”, “Excel cannot open the file because the file format or file extension is not valid”, or simply hang on the splash screen. Start with the guide to Excel files that won’t open — Excel’s built-in Open and Repair handles a large fraction of cases.
Excel opens but shows empty sheets. The workbook structure is intact, but cell content is missing. In modern XLSX files this is almost always damage to a single XML part — usually the shared strings table. See the guide to Excel opening empty.
Excel crashes when opening the file. This is a different problem from “won’t open”: Excel actively crashes rather than refusing the file. Typical causes are add-ins, hardware acceleration, or file-specific corruption that triggers parser faults. See the guide to Excel crashes on open.
A specific error message appears. Errors like “Excel found unreadable content”, “The file is corrupt and cannot be opened”, or “Cannot open the file because the file format or file extension is not valid” each point to different underlying problems. The error-specific guides for Excel give targeted fixes for each.
The file recovered but the content is broken. Formulas return #REF!, pivot tables won’t refresh, charts have lost their data sources, or cells display #####. These are post-recovery symptoms, not the original corruption. See Excel formulas broken after recovery, pivot tables broken after recovery, and charts missing or broken after recovery.
The file won’t open after OneDrive sync. Sync conflicts, partial uploads, and simultaneous edits across devices are a major source of Excel corruption in modern use. See the guide to Excel files corrupted by OneDrive sync.
The file is password-protected and won’t open. There are two scenarios with very different prospects. If you have the password and the file rejects it, the encryption metadata stored in the file is likely damaged. If the password is lost, recoverability depends entirely on the encryption algorithm. See password-protected Excel files.
An Excel file from email won’t open. Often not corruption at all — Protected View or Mark of the Web is blocking the file before Excel even tries to parse it. See Excel files from email won’t open.
A CSV imported into Excel shows garbled characters. Encoding mismatch, not corruption. See Excel CSV encoding garbled.
Understanding Excel files
To repair an Excel file effectively, it helps to know which of the two formats you’re dealing with. They are structurally unrelated, and the techniques that fix one rarely apply to the other.
XLSX (and XLSM, XLSB) is the modern format introduced with Office 2007. An XLSX file is a ZIP archive containing XML parts. If you rename a .xlsx file to .zip and open it in any archive tool, you can see the structure directly: a [Content_Types].xml file at the root, an xl/ directory containing workbook.xml (the workbook structure), xl/worksheets/sheet1.xml, sheet2.xml, and so on (one per sheet), xl/sharedStrings.xml (a deduplicated string table referenced by every cell containing text), xl/styles.xml (cell formatting), and xl/_rels/ (relationships between parts). This format is documented as Office Open XML in the ECMA-376 standard. XLSM is the same format with macros enabled; XLSB is a binary variant of the same logical structure.
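The structure described above can be inspected without renaming anything, because Python's standard zipfile module reads an .xlsx directly. The following is a minimal sketch: the function names and the EXPECTED_PARTS list are illustrative choices, not part of any Excel tooling, and the list covers only the parts named in the text.

```python
import zipfile

# Parts the text above names as present in a typical workbook.
# Assumption: a workbook with no text cells may legitimately lack
# xl/sharedStrings.xml, so a "missing" result is a hint, not proof of damage.
EXPECTED_PARTS = (
    "[Content_Types].xml",
    "xl/workbook.xml",
    "xl/styles.xml",
    "xl/sharedStrings.xml",
)


def list_parts(path):
    """Return {part name: uncompressed size} for every entry in the archive."""
    with zipfile.ZipFile(path) as zf:
        return {info.filename: info.file_size for info in zf.infolist()}


def missing_parts(path, expected=EXPECTED_PARTS):
    """Return the expected parts that are absent from the archive."""
    present = list_parts(path)
    return [name for name in expected if name not in present]
```

A part with a size of zero, or a sheet listed in the archive that is far smaller than its neighbours, is often a useful first clue about where the damage sits.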
XLS is the legacy format used by Excel 97 through 2003. An XLS file is an OLE2 Compound File Binary: a single binary container that holds multiple internal “streams” of data, with the workbook content itself encoded in BIFF (Binary Interchange File Format). The container format is the same one used by older .doc Word files and .ppt PowerPoint files, which is why a single container-level corruption pattern can affect all three formats.
This distinction matters because XLSX corruption is usually localized to one or two XML parts inside the ZIP, while XLS corruption is usually a defect in the binary container itself. For XLSX, you can often repair the file by extracting the ZIP, fixing the damaged XML, and repackaging — a process that requires care but no special tools. For XLS, the file is opaque without specialized parsers, and repair almost always means a third-party tool that understands the BIFF and OLE2 internals.
A few other structural details are worth knowing. The sharedStrings.xml table in XLSX plays a role loosely comparable to the cross-reference table in a PDF: a text cell normally stores an index into this table rather than the string itself. If sharedStrings.xml is damaged, those text cells read as empty or garbled, even though the sheet structure is intact. This is the most common cause of “Excel opens but shows empty sheets.”
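To make the indexing concrete, here is a small sketch that resolves shared-string cells by hand using only the standard library. It assumes the usual SpreadsheetML namespace and a sheet part named xl/worksheets/sheet1.xml; the function names are hypothetical, and inline strings or other cell types are deliberately ignored.

```python
import xml.etree.ElementTree as ET
import zipfile

NS = {"m": "http://schemas.openxmlformats.org/spreadsheetml/2006/main"}


def load_shared_strings(zf):
    """Return the shared string table as a list, or [] if the part is absent."""
    try:
        root = ET.fromstring(zf.read("xl/sharedStrings.xml"))
    except KeyError:
        return []
    table = []
    for si in root.findall("m:si", NS):
        # Plain strings use <t>; rich text splits the string across <r><t> runs.
        runs = si.findall("m:t", NS) + si.findall("m:r/m:t", NS)
        table.append("".join(t.text or "" for t in runs))
    return table


def text_cells(path, sheet_part="xl/worksheets/sheet1.xml"):
    """Yield (cell reference, text) for each shared-string cell in one sheet."""
    with zipfile.ZipFile(path) as zf:
        strings = load_shared_strings(zf)
        sheet = ET.fromstring(zf.read(sheet_part))
    # Cells with t="s" hold an index into the table, not the text itself.
    for cell in sheet.iterfind(".//m:c[@t='s']", NS):
        value = cell.find("m:v", NS)
        if value is None or value.text is None:
            continue
        index = int(value.text)
        # An index past the end of the table yields None here, which is
        # roughly what a damaged or truncated table looks like to a reader.
        text = strings[index] if 0 <= index < len(strings) else None
        yield cell.get("r"), text
```

If most cells come back as None while the sheet XML itself parses cleanly, the string table rather than the sheet is the likely point of damage.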
Excel also writes AutoRecover snapshots during editing, separately from the file itself. These live in the AutoRecover folder (%AppData%\Microsoft\Excel\ on Windows, ~/Library/Containers/com.microsoft.Excel/Data/Library/Application Support/Microsoft/Office/UnsavedFiles/ on macOS) and are written at a configurable interval (every 10 minutes by default). When Excel reopens after a crash, the Document Recovery panel surfaces these snapshots, which are often the fastest path to a working file when the original is damaged.
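When the Document Recovery panel does not appear, listing the folder by modification time shows quickly whether a recent snapshot exists. This is a minimal sketch: the function names are hypothetical, the Windows path is simply the one given above expanded from the APPDATA environment variable, and the file naming inside the folder is not assumed.

```python
import os
from pathlib import Path


def windows_autorecover_dir():
    """Expand the Windows location given in the text; None if APPDATA is unset."""
    appdata = os.environ.get("APPDATA")
    return Path(appdata) / "Microsoft" / "Excel" if appdata else None


def recent_files(folder, limit=10):
    """Return up to `limit` files under folder as (path, mtime), newest first."""
    root = Path(folder)
    if not root.is_dir():
        return []
    files = [(p, p.stat().st_mtime) for p in root.rglob("*") if p.is_file()]
    files.sort(key=lambda item: item[1], reverse=True)
    return files[:limit]
```

Copy any candidate snapshot somewhere else before opening it, so that a later Excel session cannot replace it.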
The practical upshot: an Excel file that looks broken usually has most of its data intact behind one or two specific structural problems. Knowing which file format you’re dealing with — and which part is damaged — determines which tool will help.
Why Excel files become corrupted
The actual causes, ranked by how often they’re the culprit:
OneDrive and SharePoint sync conflicts. The single largest source of Excel corruption in modern use. The file is open on one device, edited on another, syncs collide, and the result is a workbook with conflicting metadata, duplicate sheets, or a broken relationship between parts. OneDrive often saves a conflict copy named like filename-DESKTOP-XYZ.xlsx, which is sometimes the cleaner version. Always check the OneDrive folder for conflict copies before attempting any repair.
Improper close. Power loss, force-quit, system crash, or Excel crashing mid-save leaves the file in an inconsistent state. The save was partially written; some XML parts reflect the new content while others still reference the old. AutoRecover usually catches this, but not always — especially if the crash occurred during the save itself.
Email and cloud transfer corruption. Email gateways, antivirus scanners that strip macros, and cloud storage that re-encodes files in transit can all alter bytes in an XLSX. Even a single byte change can invalidate the ZIP container’s CRC checksums, making the file fail to extract before Excel ever sees it.
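A container-level check like the one described above can be run before involving Excel at all. The sketch below uses zipfile.ZipFile.testzip() from the Python standard library, which reads every member and reports the first one whose checksum or header is bad; the function name and the result labels are illustrative.

```python
import zipfile
import zlib


def check_container(path):
    """Classify a file as 'ok', 'bad-crc:<part>', 'bad-compressed-data', or 'not-a-zip'."""
    try:
        with zipfile.ZipFile(path) as zf:
            # testzip() reads every member and returns the first bad name,
            # or None when all CRCs and headers check out.
            bad = zf.testzip()
    except zipfile.BadZipFile:
        return "not-a-zip"
    except zlib.error:
        # A damaged deflate stream can fail before the CRC is ever compared.
        return "bad-compressed-data"
    return "ok" if bad is None else "bad-crc:" + bad
```

A "bad-crc" result names the part to look at first; "not-a-zip" usually means the damage is in the archive structure itself rather than in one part.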
Network share issues. Files saved to network shares — especially over Wi-Fi or VPN — are vulnerable to write interruption. Two users opening the same file via SMB without proper file locking is another reliable source of corruption, particularly for the legacy shared workbook feature.
Add-in conflicts during save. Third-party add-ins that hook the save process can write malformed data, especially if the add-in version is incompatible with the Excel version. The symptom appears whenever the add-in is active and goes away when it’s disabled.
Macro corruption. A damaged VBA project can prevent the entire workbook from opening, even though the worksheet data is intact. The fix often involves opening the file in a tool that ignores VBA — LibreOffice Calc will open the workbook and let you save without macros — and then re-creating the macros if needed.
Memory exhaustion in 32-bit Excel. Large workbooks open in 32-bit Excel can hit a 2 GB memory ceiling and crash mid-save, leaving a partial file. Migration to 64-bit Excel is usually the lasting fix.
Storage media errors. Failing drives, corrupt USB sticks, and bad sectors return wrong bytes when the file is read or written. Often irreversible without a backup.
Conversion artifacts. Files converted from XLS to XLSX (or back) by tools other than Excel itself sometimes produce technically-valid but oddly-structured files that one application opens and another rejects.
The repair tool landscape
No single tool handles every Excel problem. The tools below are the ones you’ll actually reach for, with honest notes on each.
Excel’s Open and Repair. Built into Excel and almost always the first thing to try. In the File > Open dialog, select the damaged file with a single click — don’t double-click — then click the arrow next to the Open button and choose “Open and Repair.” Excel offers two paths: Repair (attempt to recover the file as-is) and Extract Data (recover values and formulas only, discarding formatting and structure). Try Repair first; if it fails, Extract Data is a useful fallback for getting the numbers out even if the workbook itself is unsalvageable. Frequently overlooked because the menu option is buried; effective on a wide range of common corruption. See the complete guide to Microsoft’s Open and Repair feature.
Excel’s AutoRecover and Document Recovery panel. When Excel reopens after an unexpected close, the Document Recovery panel surfaces snapshots saved during the previous session. Check the AutoRecover folder directly if the panel doesn’t appear: %AppData%\Microsoft\Excel\ on Windows, or use File > Info > Manage Workbook > Recover Unsaved Workbooks within Excel itself. These snapshots are often only a few minutes old and may contain the work you thought was lost.
LibreOffice Calc. Free and open-source, with notably more permissive parsing than Excel. Calc routinely opens damaged XLSX files that Excel refuses. The pattern: open the file in Calc, save it back as XLSX (or save through ODS first if Calc reports problems), then reopen in Excel. Calc’s headless mode (soffice --headless --convert-to xlsx --outdir repaired damaged.xlsx) supports batch repair across many files; writing to a separate output directory keeps the converted copy from overwriting the damaged original. Strong tolerance for malformed XML inside the ZIP container; less effective on severely damaged XLS files. See the complete guide to LibreOffice for Word and Excel repair.
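For a folder of damaged files, the headless conversion can be driven from a short script. This is a sketch under stated assumptions: soffice is on the PATH, the function names are hypothetical, and the 120-second timeout is an arbitrary choice. The flags used are --headless, --convert-to and --outdir.

```python
import shutil
import subprocess
from pathlib import Path


def build_convert_command(source, out_dir, soffice="soffice"):
    """Build the argument list for one headless conversion to XLSX."""
    return [soffice, "--headless", "--convert-to", "xlsx",
            "--outdir", str(out_dir), str(source)]


def batch_convert(folder, out_dir, soffice="soffice"):
    """Convert every .xlsx in folder; return {file name: output was written}."""
    if shutil.which(soffice) is None:
        raise FileNotFoundError(f"{soffice} not found on PATH")
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    results = {}
    for source in sorted(Path(folder).glob("*.xlsx")):
        proc = subprocess.run(build_convert_command(source, out_dir, soffice),
                              capture_output=True, text=True, timeout=120)
        # The exit status alone is not treated as proof of success here;
        # the converted file must actually exist in the output directory.
        results[source.name] = (proc.returncode == 0
                                and (out_dir / source.name).exists())
    return results
```

Files that come back False are the ones worth examining by hand or passing to another tool.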
Manual XLSX surgery. For XLSX files specifically, you can sometimes repair the file by hand. Rename the .xlsx to .zip, extract the contents, identify the damaged XML part (usually xl/sharedStrings.xml, xl/workbook.xml, or one of the xl/worksheets/sheetN.xml files), repair or replace it, and rezip. The damaged part is typically obvious — XML that doesn’t parse, or a file that’s truncated. This is laborious and error-prone but works for cases that automated tools can’t handle. Requires familiarity with the OOXML structure and a careful hand on the rezip step (the ZIP’s central directory must be regenerated correctly).
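The identify-and-rezip steps can be sketched with the standard library alone. The code below only locates parts that fail to parse and writes a new archive with replacement bytes swapped in; producing the repaired XML is still manual work. The function names are hypothetical, and parts whose compressed data is itself unreadable will raise an error rather than be reported.

```python
import xml.etree.ElementTree as ET
import zipfile


def find_broken_xml(path):
    """Return {part name: parse error} for XML parts that fail to parse."""
    broken = {}
    with zipfile.ZipFile(path) as zf:
        for name in zf.namelist():
            if not name.endswith((".xml", ".rels")):
                continue
            try:
                ET.fromstring(zf.read(name))
            except ET.ParseError as exc:
                broken[name] = str(exc)
    return broken


def repack(path, out_path, replacements):
    """Copy the archive to out_path, swapping in replacement part bytes.

    replacements maps part names to repaired content. Writing through
    zipfile produces a fresh central directory, and the original entry
    order is preserved.
    """
    with zipfile.ZipFile(path) as src, \
         zipfile.ZipFile(out_path, "w", zipfile.ZIP_DEFLATED) as dst:
        for info in src.infolist():
            data = replacements.get(info.filename)
            if data is None:
                data = src.read(info.filename)
            dst.writestr(info.filename, data)
```

Always write to a new file, as here, so the damaged original stays available if the repaired copy turns out worse.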
Stellar Repair for Excel. A commercial GUI tool aimed at users who want a point-and-click solution. Handles XLS and XLSX. The underlying techniques overlap significantly with what LibreOffice and manual XLSX surgery achieve for free; the value is the GUI and the support contract. Worth considering for an urgent one-off file; the licence cost adds up for repeated use. See the complete guide to Stellar Repair for Excel.
DataNumen Excel Repair. A long-established commercial tool with particularly strong handling of legacy XLS (BIFF) files. If you’re dealing with old XLS files that no other tool can read, DataNumen is more likely to recover something than the alternatives. Less differentiated for modern XLSX. See the complete guide to DataNumen Excel Repair.
Wondershare Recoverit and Repairit. Multi-format commercial tools that include Excel repair alongside other file types. Useful if you have a mix of damaged file types and want a single tool; not the strongest dedicated option for Excel specifically.
Open XML SDK (developer-level). Microsoft’s official library for reading and writing OOXML files programmatically. Not a “tool” in the user sense, but the right answer for organizations that need to repair many damaged files in an automated pipeline. Combined with Microsoft’s Open XML SDK Productivity Tool, it can diagnose and partially repair OOXML structural problems.
Python: openpyxl, pylightxl. For programmatic access to XLSX. openpyxl is the most actively maintained and is appropriate when you need to extract data from a workbook that won’t fully open but is structurally readable. For legacy XLS, xlrd is read-only and supports only XLS (it dropped XLSX support in version 2.0); pair it with xlwt if you need to write XLS.
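As an illustration of the extraction case, the sketch below uses openpyxl in read-only mode to dump each worksheet's values to CSV. It assumes openpyxl is installed and that the workbook is intact enough for load_workbook to succeed; the function name and the output naming scheme are hypothetical.

```python
import csv

from openpyxl import load_workbook


def dump_sheets_to_csv(xlsx_path, out_prefix):
    """Write each worksheet's cell values to <out_prefix>_<n>.csv."""
    # read_only streams rows instead of loading the whole sheet;
    # data_only returns cached formula results rather than formula text.
    wb = load_workbook(xlsx_path, read_only=True, data_only=True)
    written = []
    try:
        for n, ws in enumerate(wb.worksheets, start=1):
            out_path = f"{out_prefix}_{n}.csv"
            with open(out_path, "w", newline="", encoding="utf-8") as fh:
                writer = csv.writer(fh)
                for row in ws.iter_rows(values_only=True):
                    writer.writerow(["" if v is None else v for v in row])
            written.append(out_path)
    finally:
        wb.close()
    return written
```

Values only, with no formatting or formulas, is a deliberate trade-off here: it mirrors the intent of Excel's own Extract Data path and gets the numbers out first.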
When an Excel file can’t be repaired
Some workbooks are genuinely beyond recovery. Recognizing these early saves time.
The file is severely truncated. If significant trailing content is missing (for example, a download that stopped at 2 MB of an 8 MB file), much of the workbook content is gone. For XLSX, the ZIP container’s central directory sits at the end of the file; if it’s missing, standard ZIP readers refuse to open the archive at all. Forensic tools can sometimes pull individual parts out of a truncated ZIP by scanning for the per-entry headers that precede each part, but full recovery is not possible.
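The kind of partial recovery a forensic tool performs on a truncated ZIP can be sketched as a scan for local file headers, which sit in front of each entry and do not depend on the central directory. This is a simplified illustration, not a replacement for such a tool: it handles only stored and deflated entries, ignores ZIP64, reads the whole file into memory, and can be fooled if the signature bytes happen to occur inside data. All names are hypothetical.

```python
import struct
import zlib

LOCAL_HEADER = struct.Struct("<4sHHHHHIIIHH")  # 30-byte ZIP local file header
SIGNATURE = b"PK\x03\x04"


def salvage_entries(blob):
    """Yield (name, data) for each entry that can be read completely from blob."""
    pos = blob.find(SIGNATURE)
    while pos != -1 and pos + LOCAL_HEADER.size <= len(blob):
        (_sig, _ver, flags, method, _time, _date, crc, csize, _usize,
         name_len, extra_len) = LOCAL_HEADER.unpack_from(blob, pos)
        name_start = pos + LOCAL_HEADER.size
        data_start = name_start + name_len + extra_len
        name = blob[name_start:name_start + name_len].decode("utf-8", "replace")
        sizes_known = not flags & 0x08  # bit 3 set: CRC and sizes follow the data
        data = None
        if method == 8:  # deflate: the stream marks its own end
            inflater = zlib.decompressobj(-15)
            try:
                out = inflater.decompress(blob[data_start:])
            except zlib.error:
                out = None
            if out is not None and inflater.eof:
                data = out
        elif method == 0 and sizes_known and data_start + csize <= len(blob):
            data = blob[data_start:data_start + csize]  # stored, no compression
        # When the header carries a CRC, use it to reject false positives.
        if data is not None and sizes_known and zlib.crc32(data) != crc:
            data = None
        if data is not None:
            yield name, data
        pos = blob.find(SIGNATURE, pos + 4)
```

Entries cut off by the truncation are skipped rather than returned half-complete, so whatever comes back can be trusted as far as the checksum allows.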
The file is encrypted with a strong password that’s been lost. Modern Excel uses AES-128 or AES-256 encryption, with the key derived from the password through a large number of hash iterations. Brute-force recovery of a password of meaningful length is computationally infeasible. Older XLS files (Excel 97-2003) used much weaker protection that recovery tools can defeat, but modern XLSX files cannot be opened without the password. Be sceptical of any tool that claims to “recover” a strong AES password without a wordlist or a known partial.
Ransomware-encrypted files. A workbook re-encrypted by ransomware looks like corruption but is actually a different file format entirely. Repair tools will not help; recovery requires the decryption key, a backup, or shadow copies the malware didn’t reach.
The OLE2 container is fundamentally damaged (XLS). XLS files corrupted at the container level rather than within a specific stream are particularly hard to recover. DataNumen and Stellar both attempt this with mixed success; complete reconstruction is rarely possible.
The file was never a valid Excel file. Occasionally a file is saved with an .xlsx extension but is actually an HTML page (a common pattern when downloading “as Excel” from web applications), an empty file, or a CSV. Checking the first few bytes settles this: XLSX starts with PK (the ZIP signature), and XLS starts with the OLE2 magic bytes D0 CF 11 E0 A1 B1 1A E1. One caveat: a password-protected XLSX is wrapped in an OLE2 container, so it also starts with the OLE2 bytes. A few seconds with a hex viewer rules this out.
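The same check takes a few lines of Python when no hex viewer is at hand. This is a minimal sketch using the signatures given above; the function name and the returned labels are illustrative, and the HTML test is a simple heuristic rather than a full content sniffer.

```python
ZIP_MAGIC = b"PK"
OLE2_MAGIC = bytes.fromhex("D0CF11E0A1B11AE1")


def sniff(path):
    """Guess the real container type from the first bytes of the file."""
    with open(path, "rb") as fh:
        head = fh.read(512)
    if not head:
        return "empty"
    if head.startswith(ZIP_MAGIC):
        return "zip"    # XLSX, XLSM or XLSB candidate
    if head.startswith(OLE2_MAGIC):
        return "ole2"   # XLS candidate (a password-protected XLSX also lands here)
    # Skip a UTF-8 byte-order mark and whitespace before looking for markup.
    text = head.lstrip(b"\xef\xbb\xbf \t\r\n").lower()
    if text.startswith((b"<html", b"<!doctype html")):
        return "html"
    return "unknown"    # possibly CSV or other plain text
```

A result of "html" or "unknown" for a file named .xlsx means repair tools are the wrong approach; the file needs to be re-exported or opened as what it actually is.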
For genuinely unrecoverable files, the practical options are: restore from backup, request a fresh copy from the original source, recover from OneDrive or SharePoint version history (which retains older versions even after corruption), or — for files that were open recently — recover from Excel’s AutoRecover folder before it overwrites the snapshot.
Related categories
Excel problems often connect to other formats. If you’re dealing with an Excel file that was generated from a Word mail merge or a Word table import, the Word repair guide covers fixing the source. For Excel files that won’t extract from a damaged ZIP archive — or for the reverse case, where renaming an XLSX to .zip won’t extract — the archive repair guide covers the ZIP-level diagnosis. PDF exports from damaged Excel files inherit the damage; the PDF repair guide covers what can be done at the PDF stage. For background on how file corruption arises across all formats, see understanding file corruption: a general guide.
Last verified: April 2026