Security & Privacy

What's Hidden in Your PDF and How to Remove It

PDF files are rarely as private as they appear. Beneath the visible content lies a layer of metadata that can reveal your name, your employer, the software you used, how long you spent editing, and sometimes much more.

June 5, 2026 · 6 min read

What is PDF metadata?

Metadata is data about data. In the context of a PDF file, it is a collection of properties stored alongside the visible content that describes the document itself: who created it, when it was created and last modified, what software was used, what the file is called internally, and more.

The PDF specification defines two places where metadata is stored. The first is the Document Information Dictionary, a simple key-value structure that holds properties such as Title, Author, Subject, Keywords, Creator (the application that authored the original document), Producer (the software that converted it to PDF), CreationDate, and ModDate. The second is XMP (Extensible Metadata Platform), a more structured XML-based format embedded in the file as a metadata stream, which can hold significantly more information.
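To make the first of these locations concrete, here is a minimal Python sketch that scans raw PDF bytes for Document Information Dictionary entries. It is an illustration, not how any particular tool works. It assumes the entries are stored as uncompressed, unencrypted literal strings; the function name `scan_info_fields` and the sample values are hypothetical.

```python
import re

# Standard Document Information Dictionary keys from the PDF specification.
INFO_KEYS = ("Title", "Author", "Subject", "Keywords",
             "Creator", "Producer", "CreationDate", "ModDate")


def scan_info_fields(pdf_bytes: bytes) -> dict:
    """Return Info-style entries stored as plain literal strings.

    Assumption: values look like `/Author (Jane Doe)`. Hex strings, nested
    parentheses, object streams and encrypted files are not handled, and the
    first match for each key wins.
    """
    found = {}
    for key in INFO_KEYS:
        # A literal string: "(" ... ")" with backslash escapes allowed inside.
        pattern = rb"/" + key.encode("ascii") + rb"\s*\(((?:\\.|[^\\()])*)\)"
        match = re.search(pattern, pdf_bytes)
        if match:
            found[key] = match.group(1).decode("latin-1")
    return found


# Hypothetical sample object, not taken from a real file.
sample = (b"1 0 obj\n<< /Author (Jane Doe) "
          b"/Creator (Hypothetical Writer 1.0) "
          b"/Producer (Hypothetical PDF Lib) >>\nendobj\n")
print(scan_info_fields(sample))
```

A byte scan like this can produce false positives: `/Title`, for instance, is also used by bookmark entries. A more careful reader would follow the `/Info` reference in the file trailer instead of searching the whole file.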

Most PDF viewers do not show this metadata by default. To see what is stored, you would typically open the Document Properties dialog (File > Properties in Adobe Acrobat Reader) or use the PDF Info tool on iKeepPDF.

What personal information is commonly embedded in PDFs

Author name: When you create a PDF from Microsoft Word or similar software, the application usually fills in the Author field automatically with the user name configured in the application, which often defaults to the full name on your operating system or Office account. It is often the first field a forensic investigator examines.

Organisation name: Word documents carry a Company property, usually populated from the Office installation or licence details, and it can be carried over into an exported PDF. A PDF exported from a work computer may therefore reveal your employer even if you intended to share the document anonymously.

Revision history: Some applications store how many times a document was revised. A count showing that a document went through 47 drafts before being shared, for example, might be commercially or legally significant.

Creation and modification timestamps: These precise timestamps can be important in legal contexts — they reveal when a document was first created and when it was last edited, which sometimes matters more than the document date printed in the visible content.
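In the Document Information Dictionary these timestamps are stored as strings in the PDF date format, `D:YYYYMMDDHHmmSSOHH'mm'`, where everything after the year is optional and `O` is `+`, `-`, or `Z`. The sketch below, using only the Python standard library, shows how such a string could be turned into a comparable `datetime`. The function name `parse_pdf_date` and the sample values are illustrative assumptions.

```python
import re
from datetime import datetime, timedelta, timezone

# D:YYYYMMDDHHmmSSOHH'mm' with every field after the year optional.
_PDF_DATE = re.compile(
    r"^D:(\d{4})(\d{2})?(\d{2})?(\d{2})?(\d{2})?(\d{2})?"
    r"(?:(Z)(?:00'?(?:00'?)?)?|([+-])(\d{2})'?(?:(\d{2})'?)?)?$"
)


def parse_pdf_date(value: str) -> datetime:
    """Parse a PDF date string; missing fields fall back to their defaults."""
    match = _PDF_DATE.match(value.strip())
    if not match:
        raise ValueError(f"not a PDF date string: {value!r}")
    year, month, day, hour, minute, second, zulu, sign, off_h, off_m = match.groups()
    if zulu:
        tz = timezone.utc
    elif sign:
        offset = timedelta(hours=int(off_h), minutes=int(off_m or 0))
        tz = timezone(-offset if sign == "-" else offset)
    else:
        tz = None  # no offset recorded: the local time zone is unknown
    return datetime(int(year), int(month or 1), int(day or 1),
                    int(hour or 0), int(minute or 0), int(second or 0),
                    tzinfo=tz)


# Hypothetical values for CreationDate and ModDate.
created = parse_pdf_date("D:20260605143000+02'00'")
modified = parse_pdf_date("D:20260605150500+02'00'")
print(modified - created)  # 0:35:00
```

Once parsed, the two values can be compared directly, which is how a gap between a document's claimed date and its recorded creation time would be spotted.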

Software version and operating system: The Creator field often contains the exact software version that generated the file, and in some cases the operating system. This can be used to fingerprint your working environment.

Embedded images with GPS data: If your PDF contains photos taken on a smartphone, those images may carry EXIF metadata, including GPS coordinates that show exactly where the photo was taken. Depending on the tool that built the PDF, this metadata can survive when the image is embedded.
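JPEG photos are commonly stored inside a PDF as image streams using the DCTDecode filter, which means the stream bytes are an ordinary JPEG file. Assuming you have already extracted such a stream, the following sketch walks the JPEG segment headers and reports whether an EXIF block (an APP1 segment beginning with `Exif`) is present. It only detects EXIF; reading the GPS tags themselves would need a TIFF/EXIF parser. The names `jpeg_has_exif` and `_segment` are hypothetical.

```python
import struct


def jpeg_has_exif(data: bytes) -> bool:
    """Return True if the JPEG header contains an APP1 segment holding EXIF data."""
    if not data.startswith(b"\xff\xd8"):  # SOI marker: not a JPEG at all
        return False
    pos = 2
    while pos + 4 <= len(data):
        if data[pos] != 0xFF:
            return False  # not positioned on a marker: malformed input
        marker = data[pos + 1]
        if marker == 0xFF:  # fill byte before a marker
            pos += 1
            continue
        if marker in (0xDA, 0xD9):  # start of scan or end of image
            return False
        # Segment length is big-endian and includes its own two bytes.
        (length,) = struct.unpack(">H", data[pos + 2:pos + 4])
        payload = data[pos + 4:pos + 2 + length]
        if marker == 0xE1 and payload.startswith(b"Exif\x00\x00"):
            return True
        pos += 2 + length
    return False


def _segment(marker: int, payload: bytes) -> bytes:
    """Build one JPEG segment for the synthetic samples below."""
    return bytes([0xFF, marker]) + struct.pack(">H", len(payload) + 2) + payload


# Synthetic headers, not real photos.
with_exif = (b"\xff\xd8" + _segment(0xE0, b"JFIF\x00")
             + _segment(0xE1, b"Exif\x00\x00" + b"\x00" * 8) + b"\xff\xd9")
without_exif = b"\xff\xd8" + _segment(0xE0, b"JFIF\x00") + b"\xff\xd9"
print(jpeg_has_exif(with_exif), jpeg_has_exif(without_exif))
```

A document-level metadata cleaner does not necessarily touch these image streams, so checking them separately is a reasonable precaution.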

Tip: Before sharing any PDF externally, especially for legal, journalistic, or sensitive professional purposes, check its metadata with the PDF Info tool and clean it with the Remove Metadata tool.

Real-world cases where metadata caused problems

One of the most widely cited metadata leaks dates from the early 2000s and involved a leaked document from a major political party. Investigators analysed the Word document's metadata and found that it had been edited by multiple people at a government contractor, which contradicted the official story about the document's origin. The revision metadata became crucial evidence.

In legal proceedings, document metadata is routinely examined to verify authenticity. A contract that appears to have been created years ago but whose metadata shows a creation date after the alleged signing date raises serious questions. Courts have dismissed documents as forgeries based on metadata inconsistencies.

Whistleblowers and journalists face particular risk. A document shared anonymously that contains the author's full name in the metadata is not anonymous at all. Several sources have been identified and endangered because they did not clean metadata before sharing documents with reporters.


How to check and remove metadata from a PDF

You can inspect a PDF's metadata at any time using iKeepPDF's PDF Info tool. It shows everything stored in the Document Information Dictionary and the XMP packet: author, title, creation date, modification date, creator application, and producer library.
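If you are curious what such an inspection involves for the XMP side, the sketch below locates an XMP packet in raw PDF bytes and pulls out a few common properties with Python's standard XML parser. It is not a description of how the PDF Info tool is implemented. It assumes the metadata stream is stored uncompressed and unencrypted, which is common but not guaranteed; the function names and the sample packet are hypothetical.

```python
import re
import xml.etree.ElementTree as ET

# Namespace URIs used by common XMP properties.
NS = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "dc": "http://purl.org/dc/elements/1.1/",
    "xmp": "http://ns.adobe.com/xap/1.0/",
    "pdf": "http://ns.adobe.com/pdf/1.3/",
}
# Simple properties to report, mapped to their namespace prefix.
SIMPLE = {"CreatorTool": "xmp", "CreateDate": "xmp",
          "ModifyDate": "xmp", "Producer": "pdf"}


def extract_xmp(pdf_bytes: bytes):
    """Return the first <x:xmpmeta> element found as text, or None."""
    match = re.search(rb"<x:xmpmeta.*?</x:xmpmeta>", pdf_bytes, re.DOTALL)
    return match.group(0).decode("utf-8") if match else None


def summarise_xmp(xmp_xml: str) -> dict:
    """Collect a few identifying properties from an XMP packet."""
    root = ET.fromstring(xmp_xml)
    summary = {}
    for desc in root.iter("{%s}Description" % NS["rdf"]):
        for name, prefix in SIMPLE.items():
            qname = "{%s}%s" % (NS[prefix], name)
            value = desc.get(qname)  # property written as an attribute
            if value is None:
                child = desc.find(qname)  # property written as an element
                value = child.text if child is not None else None
            if value:
                summary[name] = value
        for creator in desc.iter("{%s}creator" % NS["dc"]):
            names = [li.text for li in creator.iter("{%s}li" % NS["rdf"]) if li.text]
            if names:
                summary["creator"] = names
    return summary


# Hypothetical packet, not taken from a real file.
sample = b"""stream
<?xpacket begin="" id="W5M0MpCehiHzreSzNTczkc9d"?>
<x:xmpmeta xmlns:x="adobe:ns:meta/">
 <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
  <rdf:Description rdf:about=""
    xmlns:dc="http://purl.org/dc/elements/1.1/"
    xmlns:xmp="http://ns.adobe.com/xap/1.0/"
    xmp:CreatorTool="Hypothetical Writer 1.0">
   <dc:creator><rdf:Seq><rdf:li>Jane Doe</rdf:li></rdf:Seq></dc:creator>
   <xmp:CreateDate>2026-06-05T14:30:00+02:00</xmp:CreateDate>
  </rdf:Description>
 </rdf:RDF>
</x:xmpmeta>
<?xpacket end="w"?>
endstream
"""
packet = extract_xmp(sample)
print(summarise_xmp(packet) if packet else "no XMP packet found")
```

Because XMP can repeat information that also lives in the Document Information Dictionary, both locations need to be checked before you conclude that a file is clean.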

To remove the metadata, use the Remove Metadata tool. It strips all fields from both the Document Information Dictionary and the XMP stream, and neutralises creation and modification timestamps. The visible content of the PDF is completely unchanged — the tool only affects the hidden metadata layer.

The entire process runs locally in your browser. Your PDF is never uploaded to a server, which is particularly important when you are cleaning sensitive documents — you would not want to upload a confidential file to a third-party service just to remove identifying information.

What metadata removal does not clean

Removing metadata from a PDF does not remove personally identifying information embedded in the visible content itself — your name printed at the top of a letter, a signature image, or a watermark. Those are part of the document's visible layer and require a different approach (such as the Redact tool) to remove.

It also does not remove printer steganography — an invisible pattern of tiny yellow dots that some colour laser printers embed in every page to identify the printer and timestamp the print. This physical fingerprinting exists in the printed copy, not the PDF file itself.

For truly sensitive situations, consider printing to a PDF from a clean, unregistered PDF viewer after removing metadata. This creates a new PDF where the creator application is the PDF printer rather than your primary authoring software.

Frequently asked questions

Does metadata removal affect the document's appearance?

No. Metadata is stored separately from the visible content. Removing it does not change any text, images, layout, fonts, or any other element visible when you open the file. The document looks and works identically after cleaning.

Can I add custom metadata to a PDF?

Yes, although not with the PDF Info tool, which only displays the existing fields. To add your own metadata, for example to tag documents with project codes, keywords, or version numbers for an internal system, you would need a dedicated PDF editor that can write custom properties.

Does removing metadata make a PDF completely anonymous?

It removes the structured metadata fields, but it cannot make a PDF completely unattributable. The content itself (writing style, unusual formatting choices, specific phrasing) can still link a document to its author through non-technical means. Metadata removal is a necessary but not sufficient step for full anonymity.

Are scanned PDFs less likely to contain personal metadata?

Scanned PDFs typically contain less metadata than software-generated PDFs because no document-authoring application is involved. However, modern scanner software and scanning apps can still write author information, creation timestamps, and device model data, so it is worth checking scanned documents too.

Does emailing a PDF strip its metadata?

No. Email clients and servers normally deliver PDF attachments byte for byte. The file arrives at the recipient with all its metadata intact, exactly as you sent it, so you must clean the metadata before attaching the file to an email.