- Some good tips by angelalbertini:
  - Generic advice to make PDF manipulations by hand (fill forms, modify contents...):
    - run `qpdf --qdf` to normalize the PDF [optionally with `--stream-data=uncompress`].
    - do your changes.
    - clean up the PDF with `mutool clean`.
  - remember that some fields like author can be stored as UTF-16BE Unicode rather than ASCII (typically preceded by the FE FF byte order mark): so 'ange' = 00 61 00 6E 00 67 00 65
  - inline kerning is often used (by LaTeX, etc.) in the page contents: the text is written as an array that alternates short strings (in parentheses) and numbers.
    - so 'ange' can be stored like: `[(an) 3.0 (ge)]`, and a plain search for 'ange' will not match it.
- PDF Cheat Sheets: Cheat sheets for the Portable Document Format
- PDFObject: An open-source standards-friendly JavaScript utility for embedding PDF files into HTML documents.
- PDF on Forensics Wiki
- PDF Tools by Didier Stevens
- How to Embed JavaScript into PDF
- PJScan: a command-line utility that uses a learning algorithm to detect PDF files with JavaScript-related malware - SourceForge repo.
- RUPS: short for Reading and Updating PDF Syntax, a tool built on top of iText® that lets you look inside a PDF document and browse its PDF objects and content streams.
- CLI PDF viewer for Linux - Stack Overflow
- PeePDF: Powerful Python tool to analyze PDF documents
- pdfstreamdumper: research tool for the analysis of malicious PDF documents. Make sure to run the installer first so that all of the third-party DLLs are installed correctly.
- pdfextract: A tool and library that can extract various areas of text from a PDF, especially a scholarly article PDF.
- 4Discovery Tools
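
The hand-editing tips at the top of this list can be sketched in Python. This is a rough illustration only, not part of any of the tools listed here. The helper names (`normalize`, `clean`, `utf16be_hex`, `tj_text`) are made up for this example, the two wrappers assume `qpdf` and `mutool` are on your `PATH`, and the array parser is deliberately simplified: it ignores nested parentheses, octal escapes and hex strings.

```python
import re
import subprocess


def normalize(src, dst):
    """Tip 1: rewrite the PDF in qpdf's editable QDF form, streams uncompressed."""
    subprocess.run(
        ["qpdf", "--qdf", "--stream-data=uncompress", src, dst], check=True
    )


def clean(src, dst):
    """Tip 3: let mutool rebuild the file after it has been edited by hand."""
    subprocess.run(["mutool", "clean", src, dst], check=True)


def utf16be_hex(text, bom=True):
    """Return the bytes to look for when a field is stored as UTF-16BE."""
    data = text.encode("utf-16-be")
    if bom:
        # PDF text strings in UTF-16BE normally start with this byte order mark.
        data = b"\xfe\xff" + data
    return " ".join(f"{byte:02X}" for byte in data)


# Matches one literal string "( ... )", allowing backslash escapes inside it.
# Assumption: no unescaped nested parentheses, which real PDFs may contain.
_LITERAL_STRING = re.compile(r"\(((?:\\.|[^\\()])*)\)")


def tj_text(array_source):
    """Join the string pieces of a kerned text array, dropping the numbers."""
    pieces = _LITERAL_STRING.findall(array_source)
    # Only the escapes \( \) and \\ are undone here; others are left untouched.
    return "".join(re.sub(r"\\([()\\])", r"\1", piece) for piece in pieces)


print(utf16be_hex("ange", bom=False))  # 00 61 00 6E 00 67 00 65
print(tj_text("[(an) 3.0 (ge)]"))      # ange
```

When searching a normalized file for a word, it can help to check both forms: the raw UTF-16BE byte pattern for metadata fields, and the joined pieces of each kerned array for page text.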