# pdf-parser.py
> Parse PDF structure, locate objects, extract content, and search for strings

**Category:** [[categories/analyze-documents-pdf|Analyze Documents > PDF]] | **Tier:** Rich (FOR610) | **Author:** Didier Stevens
**Docs:** [https://docs.remnux.org/discover-the-tools/analyze+documents/pdf](https://docs.remnux.org/discover-the-tools/analyze+documents/pdf)

## Usage
```bash
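# Display statistics for the document (object counts by type)
pdf-parser.py document.pdf -a
# Search indirect objects (streams excluded) for the string /URI
pdf-parser.py document.pdf -s /URI
# Search dictionaries for the key /URI and show its value
pdf-parser.py document.pdf -k /URI
# Select object 6 and dump its stream to a file (add -f to decode filters first)
pdf-parser.py document.pdf -o 6 -d object6.jpg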
```
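The commands above can be chained into a quick first-pass triage. The following is a minimal sketch, not part of pdf-parser.py itself: the function name `triage_pdf` and the choice of keywords to search for are assumptions made for illustration, and it assumes `pdf-parser.py` is on the `PATH` (as it is on REMnux).

```bash
#!/usr/bin/env bash
# Hypothetical helper (not shipped with pdf-parser.py): print statistics,
# then list the objects that mention a few keywords of interest.
triage_pdf() {
    local pdf="$1"

    # Fail early if the input file is missing.
    if [ ! -f "$pdf" ]; then
        echo "no such file: $pdf" >&2
        return 1
    fi

    # Fail with 127 ("command not found") if the tool is not installed.
    if ! command -v pdf-parser.py >/dev/null 2>&1; then
        echo "pdf-parser.py not found in PATH" >&2
        return 127
    fi

    # Overview first: object counts by type.
    pdf-parser.py "$pdf" -a

    # Then search for each keyword; the keyword list is an assumption
    # and should be adjusted to the sample being analyzed.
    local key
    for key in /URI /JavaScript; do
        echo "== objects containing $key =="
        pdf-parser.py "$pdf" -s "$key"
    done
}
```

A call such as `triage_pdf document.pdf` prints the statistics followed by the matching objects; any object IDs that look interesting can then be dumped individually with `-o` and `-d` as shown in the usage block.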

## Recipes
- [[recipes/pdf-object-extraction|Extract Embedded Object from PDF]]
- [[recipes/pdf-javascript-extraction|Extract JavaScript from PDF]]

## Workflows
- [[workflows/document-analysis-workflow|Malicious Document Analysis]] — Step 2: Structure Analysis
- [[workflows/shellcode-analysis-workflow|Shellcode Analysis]] — Step 2: Extraction

## Related Tools
- [[tools/origamindee|origamindee]] — Parse, modify, generate PDF files.
- [[tools/pdfid|pdfid.py]] — Scan PDF files for suspicious keywords like /JavaScript, /Op
- [[tools/pdfresurrect|pdfresurrect]] — Extract and analyze previous versions from PDF files
- [[tools/pdftk|pdftk]] — Manipulate PDF files — merge, split, flatten, encrypt, and e
- [[tools/pdftool|pdftool.py]] — Analyze PDF incremental updates

## FOR610
**Labs:** 3.1
**Sections:** 1, 3

#pdf #static-analysis #object-extraction #didier-stevens