Bearpoint

Document Spoliation Detection

Forensic methods for catching destroyed, altered, or sanitized evidence in public records and discovery productions, and for building the spoliation record a court will credit.

Bearpoint Foundation provides comprehensive research so a lawyer can litigate. We do not provide legal services and nothing on this page is legal advice. Sanctions standards under FRCP 37(e) and state equivalents continue to evolve. Always verify the current spoliation case law in the relevant jurisdiction before relying on any inference standard.

What Spoliation Is (Legally)

Spoliation is the destruction, alteration, or failure to preserve evidence that a party knew or reasonably should have known was relevant to actual or reasonably anticipated litigation. Remedies are graduated: monetary sanctions and fee-shifting at the low end, an adverse-inference instruction in the middle (the jury may, or in some jurisdictions must, presume the missing evidence would have been harmful to the spoliator), and default judgment, preclusion of claims or evidence, or dismissal at the extreme.

The modern federal framework is Federal Rule of Civil Procedure 37(e), amended in 2015 to govern lost ESI. The rule applies when the information should have been preserved, a party failed to take reasonable steps to preserve it, and it cannot be restored or replaced through additional discovery. From there it splits: on a finding of prejudice, the court may order measures no greater than necessary to cure it (37(e)(1)); only on a finding of intent to deprive may it presume the information was unfavorable, give an adverse-inference instruction, or dismiss the action or enter default judgment (37(e)(2)). Leading federal decisions, both predating the amendment: Zubulake v. UBS Warburg LLC, 220 F.R.D. 212 (S.D.N.Y. 2003); Silvestri v. Gen. Motors Corp., 271 F.3d 583 (4th Cir. 2001).

State analogs vary. Washington recognizes spoliation as a discretionary trigger for sanctions or adverse inference under Henderson v. Tyrrell, 80 Wn. App. 592 (1996), refined in Cook v. Tarbert Logging, Inc., 190 Wn. App. 448 (2015). Confirm the venue's rule before drafting a motion.

The Print-To-Scan Pattern

The most common spoliation pattern in public records productions is the print-to-scan workflow. A native digital document (Word, native PDF, email) is printed to paper, re-scanned, and the resulting image-only PDF is produced. Everything evidentiarily valuable is stripped in the round trip: metadata, redaction layers, tracked changes, author fields, revision history, timestamps, embedded fonts, hyperlinks, and email headers.

The indicators are consistent. The PDF Producer field will identify a scanner brand (Fujitsu ScanSnap, Xerox WorkCentre, Canon imageRUNNER, HP ScanJet). No fonts are embedded. An OCR layer may be present, but original encoded text is absent. Pages exhibit a consistent skew angle and shadow pattern across allegedly unrelated documents, betraying a single scanner session.

Metadata Layer Analysis

Every native document carries metadata. PDFs expose XMP and document-info dictionaries. Office files (DOCX, XLSX, PPTX) carry core.xml and app.xml with author, last-modified-by, creation date, total editing time, and revision number. JPEGs carry EXIF with camera model, date, and often GPS. Email .eml and .msg files preserve the full RFC 5322 header chain.
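As a minimal sketch of what the Office metadata looks like in practice, the Python below opens a DOCX, XLSX, or PPTX as the zip package it is and reads docProps/core.xml. The function name read_core_properties and the output labels are illustrative choices, not part of any standard tool. Total editing time lives in docProps/app.xml and is not read here.

```python
import zipfile
import xml.etree.ElementTree as ET

# XML namespaces used inside docProps/core.xml of an Office Open XML package.
NS = {
    "cp": "http://schemas.openxmlformats.org/package/2006/metadata/core-properties",
    "dc": "http://purl.org/dc/elements/1.1/",
    "dcterms": "http://purl.org/dc/terms/",
}

# Output label (our own naming) -> element inside core.xml.
FIELDS = {
    "author": "dc:creator",
    "last_modified_by": "cp:lastModifiedBy",
    "revision": "cp:revision",
    "created": "dcterms:created",
    "modified": "dcterms:modified",
}


def read_core_properties(source):
    """Return core metadata from a DOCX/XLSX/PPTX path or file-like object.

    Fields that are absent come back as None, so stripped metadata is
    visible in the result instead of silently skipped.
    """
    with zipfile.ZipFile(source) as package:
        try:
            raw = package.read("docProps/core.xml")
        except KeyError:
            # The package has no core properties part at all.
            return {label: None for label in FIELDS}
    root = ET.fromstring(raw)
    result = {}
    for label, path in FIELDS.items():
        node = root.find(path, NS)
        result[label] = node.text if node is not None else None
    return result
```

Calling read_core_properties("memo.docx") (a hypothetical file name) returns a dictionary; a file whose every field is None has had its core properties removed or never had them, which is worth noting either way.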

The standard tool is ExifTool. First pass: exiftool -a -G -s filename.pdf prints all tags, duplicates included, each labeled with its group name. Batch a folder: exiftool -r -csv -G -s ./production/ > metadata.csv. Sort the CSV by creation date. Identical creation timestamps across allegedly different files point to single-session production. Identical last-modified-by names across files attributed to different authors point to one person handling all of them. Missing metadata where it should exist points to stripping, and the producing party should be made to explain it.
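Sorting by eye works for a small production. For a large one, a short Python sketch like the following can group the rows of metadata.csv by any column and list the files where that column is empty. It assumes the CSV has a SourceFile column and that tag columns are named in Group:Tag form (for example PDF:CreateDate); check the header row of your own export, since the exact column names depend on the files and options used. The function names are illustrative.

```python
import csv
from collections import defaultdict


def load_rows(csv_path):
    """Read an ExifTool CSV export into a list of dictionaries."""
    with open(csv_path, newline="", encoding="utf-8") as handle:
        return list(csv.DictReader(handle))


def shared_values(csv_path, column, minimum=2):
    """Map each value of `column` to the files sharing it.

    Only values carried by at least `minimum` files are returned.
    """
    groups = defaultdict(list)
    for row in load_rows(csv_path):
        value = (row.get(column) or "").strip()
        if value:
            groups[value].append(row["SourceFile"])
    return {
        value: sorted(files)
        for value, files in groups.items()
        if len(files) >= minimum
    }


def missing_values(csv_path, column):
    """List the files whose `column` is empty or absent."""
    return sorted(
        row["SourceFile"]
        for row in load_rows(csv_path)
        if not (row.get(column) or "").strip()
    )
```

For example, shared_values("metadata.csv", "PDF:CreateDate") lists every creation timestamp that appears on more than one file, and missing_values("metadata.csv", "PDF:Author") lists the files with no author recorded.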

The Byte-Identical Duplicate Pattern

Agencies under public records pressure sometimes pad productions with byte-identical copies of the same file under different names to inflate apparent response volume. SHA-256 hash every file and group by hash:
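A minimal sketch of that step in Python, using only the standard library. The function names and the idea of walking a single root folder are assumptions made for illustration; any tool that computes SHA-256 will produce the same digests.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so large files do not need to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def duplicate_groups(root):
    """Return {sha256: [paths]} for every hash shared by two or more files."""
    by_hash = defaultdict(list)
    for path in sorted(Path(root).rglob("*")):
        if path.is_file():
            by_hash[sha256_of(path)].append(str(path))
    return {
        digest: paths
        for digest, paths in by_hash.items()
        if len(paths) > 1
    }
```

Run duplicate_groups("./production") against a working copy of the production, never the originals, and carry each returned hash and its list of paths into the manifest.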

Any duplicate pair in a production warrants investigation. Innocent duplicates cluster predictably (a single attachment cited in multiple emails). Random scatter of identical hashes across folders labeled with different subject matters is not innocent. Document the hash, filenames, and paths in a manifest.

The Native-Format Capability Test

Agencies routinely claim they cannot produce records in native format. The claim is easy to disprove: find one native file the same agency produced elsewhere. A board agenda posted as a native PDF to the public website. A budget document on a transparency portal. A native Excel attached to a prior PRA response. One example proves the capability existed; the choice to scan was selective, and the selectivity is itself evidence of intent. Capture the example with a screenshot, the URL, and a hashed download. It becomes Exhibit A in any motion under FRCP 37(e)(2).

Email Header Forensics

Produced emails should arrive as native .eml or .msg files with the full RFC 5322 header chain intact: Received path, Message-ID, Date, From, To, Cc, In-Reply-To, References, DKIM-Signature, SPF and DMARC results, X-Mailer, Return-Path, and organization-specific X-headers. When a production strips all of that and gives you only From, To, Subject, Date, and body as printed text on a scanned page, you are looking at a sanitized representation of an email, not an email.
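When a native .eml does arrive, a quick completeness check is possible with Python's standard email package. The sketch below counts the transport and authentication headers a delivered message would normally carry and reports which are absent. The EXPECTED_HEADERS list is an illustrative selection, not an exhaustive standard; SPF and DMARC results are looked for in the Authentication-Results header, which is where receiving servers usually record them. Optional headers such as Cc, In-Reply-To, and References are left out because their absence is often legitimate.

```python
from email import policy
from email.parser import BytesParser

# Headers a message that actually crossed a mail server would normally
# carry. This selection is an assumption made for illustration.
EXPECTED_HEADERS = [
    "Received",
    "Message-ID",
    "Date",
    "From",
    "To",
    "Return-Path",
    "DKIM-Signature",
    "Authentication-Results",
]


def header_report(raw_bytes):
    """Count expected headers in a raw .eml and list the ones missing."""
    message = BytesParser(policy=policy.default).parsebytes(
        raw_bytes, headersonly=True
    )
    counts = {
        name: len(message.get_all(name, []))
        for name in EXPECTED_HEADERS
    }
    return {
        "counts": counts,
        "missing": [name for name, count in counts.items() if count == 0],
        # Each Received header is one hop in the delivery path.
        "received_hops": counts["Received"],
    }
```

Read the file in binary mode and pass the bytes in, for example header_report(open("message.eml", "rb").read()) with a hypothetical file name. A message with zero Received hops and no Message-ID was typed or reconstructed, not exported.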

Demand .eml or .msg. The claim that a mail system cannot export natives is false for every commercial platform deployed in the last fifteen years. Make the agency say otherwise in writing, then attach the writing to the sanctions motion.

The Timeline Construction Technique

Extract creation and modification timestamps from every produced file and plot them on a single timeline. Use the dates embedded in the document metadata (for a PDF, CreationDate and ModDate), not file-system timestamps, which can change whenever a file is downloaded or copied. Clustering reveals production sessions. When ninety files all show creation timestamps within a fifteen-minute window the day before delivery, those files were almost certainly generated in a mass scan event after the request was filed. Records that originated years earlier but all carry creation timestamps from the day before production are the signature of the print-to-scan workflow.
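The clustering can be sketched in a few lines of Python. The code below sorts (timestamp, file name) pairs and groups them into sessions whose total span fits inside a window, fifteen minutes by default to match the example above. The function names are illustrative. The parser reads the "YYYY:MM:DD HH:MM:SS" form that ExifTool prints by default and, as a simplifying assumption, ignores any time-zone suffix; a production that mixes zones would need those handled properly.

```python
from datetime import datetime, timedelta


def parse_exif_timestamp(text):
    """Parse the leading 'YYYY:MM:DD HH:MM:SS' of an ExifTool-style date.

    Any trailing time-zone offset is ignored (a simplification).
    """
    return datetime.strptime(text.strip()[:19], "%Y:%m:%d %H:%M:%S")


def cluster_sessions(stamped_files, window=timedelta(minutes=15), minimum=2):
    """Group (timestamp, name) pairs into sessions.

    A session is a run of files, in time order, whose timestamps all fall
    within `window` of the first file in the run. Runs shorter than
    `minimum` files are dropped.
    """
    sessions = []
    current = []
    for stamp, name in sorted(stamped_files):
        if current and stamp - current[0][0] > window:
            if len(current) >= minimum:
                sessions.append(current)
            current = []
        current.append((stamp, name))
    if len(current) >= minimum:
        sessions.append(current)
    return sessions
```

Each returned session is a list of (timestamp, name) pairs; the first and last timestamps give the start and end of the suspected scan event, and the names are the files to list in the exhibit.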

PDF Forensic Techniques

PDF analysis is a small toolchain. From Poppler: pdfinfo file.pdf prints the document-info dictionary (Producer, Creator, Author, CreationDate, ModDate, PDF version). pdfimages -list file.pdf enumerates every embedded image stream. pdftotext file.pdf - dumps the text layer to stdout. The signatures break down cleanly:
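As a rough illustration of how the three outputs can be combined, the Python sketch below parses pdfinfo text into a dictionary, counts rows in a pdfimages -list table, and reports which scan indicators are present. The indicators are the ones described in the print-to-scan section above (scanner brand in Producer or Creator, a page image on every page, no text layer); the thresholds, the scanner name list, and the function names are assumptions made for this example, and a hit is a lead to investigate, not a conclusion.

```python
import re

# Scanner brand fragments taken from the examples earlier on this page.
# Extend the pattern for other devices; it is not a complete list.
SCANNER_HINTS = re.compile(
    r"scansnap|workcentre|imagerunner|scanjet", re.IGNORECASE
)


def parse_pdfinfo(output):
    """Turn the 'Key:   value' lines printed by pdfinfo into a dict."""
    info = {}
    for line in output.splitlines():
        key, sep, value = line.partition(":")
        if sep:
            info[key.strip()] = value.strip()
    return info


def count_images(pdfimages_list_output):
    """Count data rows in `pdfimages -list` output.

    Data rows start with a page number; header and rule lines do not.
    """
    count = 0
    for line in pdfimages_list_output.splitlines():
        fields = line.split()
        if fields and fields[0].isdigit():
            count += 1
    return count


def scan_indicators(info, image_count, text):
    """Return the list of print-to-scan indicators that are present."""
    pages = int(info.get("Pages") or 0)
    producer = f'{info.get("Producer", "")} {info.get("Creator", "")}'
    findings = []
    if SCANNER_HINTS.search(producer):
        findings.append("producer names a scanner")
    if pages and image_count >= pages:
        findings.append("at least one image per page")
    if not text.strip():
        findings.append("no text layer")
    return findings
```

Feed it the captured stdout of the three commands for one file. All three indicators together describe an image-only scan; the first two with a text layer present describe a scan that was run through OCR.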

For deeper inspection, qpdf --qdf --object-streams=disable file.pdf out.pdf rewrites the PDF in a human-readable form, exposing redaction overlays that were rendered but not flattened and hidden text under black bars.

Chain of Custody From Receipt Forward

The forensic work is only as good as the chain of custody behind it. Hash every file on receipt and log the SHA-256 with a timestamp. Store originals in a read-only mirror; analyze copies. The custody log should record file name, path on receipt, SHA-256, file size in bytes, receipt date and time, source, and method of transfer (portal, email attachment, certified mail with disc). This is the record that authenticates the evidence under Federal Rule of Evidence 901 and analogous state rules. Without it, every forensic finding is impeachable at trial.
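A custody log with exactly those fields can be kept as an append-only CSV. The Python sketch below is one way to do it under stated assumptions: the column names, the CSV format, and the function names are illustrative choices, and the receipt time must be supplied by the person logging the file, as a time-zone-aware datetime. If it is omitted, the code records the current time, which is the logging time and not necessarily the receipt time.

```python
import csv
import hashlib
import os
from datetime import datetime, timezone

# Column names are our own; they mirror the fields listed above.
FIELDS = [
    "file_name", "path_on_receipt", "sha256", "size_bytes",
    "received_utc", "source", "transfer_method",
]


def custody_entry(path, source, transfer_method, received=None):
    """Build one custody-log row for a file as received."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    # Falls back to "now" when no receipt time is given (see note above).
    when = received or datetime.now(timezone.utc)
    return {
        "file_name": os.path.basename(path),
        "path_on_receipt": os.path.abspath(path),
        "sha256": digest.hexdigest(),
        "size_bytes": os.path.getsize(path),
        "received_utc": when.astimezone(timezone.utc).isoformat(),
        "source": source,
        "transfer_method": transfer_method,
    }


def append_to_log(log_path, entry):
    """Append a row to the CSV log, writing the header on first use."""
    is_new = not os.path.exists(log_path) or os.path.getsize(log_path) == 0
    with open(log_path, "a", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=FIELDS)
        if is_new:
            writer.writeheader()
        writer.writerow(entry)
```

Log each file once, at receipt, before any analysis touches it. Re-hashing a working copy later and comparing against the logged SHA-256 is what shows the file analyzed is the file received.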

Building the Spoliation Record for Court

Technical findings are necessary but not sufficient. Combine them with admission requests under FRCP 36 (or the state equivalent). Sample admissions: "Admit that production of records was performed by scanning printed copies of native electronic files." "Admit that the agency retains the native files corresponding to the produced scanned images." "Admit that the agency possesses the technical capability to export emails in native .eml or .msg format." A refusal to admit becomes a discovery dispute and a basis for cost-shifting under Rule 37(c)(2). An admission becomes the spoliation predicate.

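Adverse-inference instruction language varies by jurisdiction. Federal courts in many circuits draw on Residential Funding Corp. v. DeGeorge Fin. Corp., 306 F.3d 99 (2d Cir. 2002), for the framing that the spoliator's culpable state of mind and the relevance of the destroyed evidence are jury questions once the predicate is established. For lost ESI, however, the 2015 Advisory Committee note to Rule 37(e) expressly rejects Residential Funding to the extent it allowed an adverse inference on a finding of negligence or gross negligence, so confirm whether the evidence at issue is ESI before relying on it. The instruction the jury hears is drafted from the venue's pattern jury instructions, not from case law directly.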

Independent Enforcement Vectors

Beyond the underlying civil case, spoliation of public records triggers independent enforcement paths. State auditors have jurisdiction over destruction violating statutory retention schedules. State archives offices oversee retention compliance. Criminal referrals are available for tampering with records: federally under 18 U.S.C. § 1519, reaching anyone who knowingly alters, destroys, conceals, or falsifies any record with intent to impede any matter within federal jurisdiction. Washington's analog is RCW 9A.72.150, tampering with physical evidence.

Common Mistakes

When to Bring in Bearpoint

The Foundation provides forensic-grade research and chain-of-custody architecture. We do not litigate. We hand counsel a hashed, manifest-indexed evidence set and a documented spoliation predicate. The lawyer's hours go to motion practice, not triage. Email info@bearpointfdn.org.

Need Help Auditing a Production?

Bearpoint Foundation works with attorneys, advocates, and pro se litigants on forensic analysis of public records and discovery productions. No legal representation. Evidence architecture and methodology only.