tobias e62a14dafc Add markdown wiki with 473 pages and zk browser
Generate interlinked wiki from master inventory: 397 tool pages,
15 workflow pages, 27 recipe pages, 33 category pages, plus index.
All pages use [[wiki-links]] for cross-navigation between tools,
workflows, recipes, and categories (1782 links total).

Install zk for interactive browsing with fzf search, tag filtering,
and backlink discovery. Add 'fhelp wiki' command and Makefile target.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-28 19:50:36 +01:00


Malicious Document Analysis

Analyze suspicious documents (PDF, Office, RTF, OneNote) for embedded malware, macros, and exploits. The analysis steps follow Zeltser's 6-step methodology; a final seventh step records the IOCs.

FOR610 Labs: 3.1, 3.3, 3.4, 3.5

Steps

Step 1: Format Identification

Tools: tools/file, tools/trid

Identify the true format: OLE2 (legacy Office), OOXML (modern Office), RTF, PDF, or OneNote. Don't trust the file extension; check the magic bytes instead.

file document.doc
trid document.doc
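
To illustrate what "use magic bytes" means in practice, here is a minimal Python sketch that classifies a file by its leading bytes. It is not how file or trid work internally; the function names and the 1024-byte window are assumptions made for this example. A ZIP signature only indicates a container that may be OOXML, so confirm by listing its contents.

```python
# Leading-byte signatures for the formats named in this step.
OLE2_MAGIC = bytes.fromhex("d0cf11e0a1b11ae1")
ONENOTE_MAGIC = bytes.fromhex("e4525c7b8cd8a74daeb15378d02996d3")


def identify_format(data: bytes) -> str:
    """Guess the document format from the first bytes of the file."""
    head = data[:1024]
    if head.startswith(OLE2_MAGIC):
        return "OLE2"
    if head.startswith(b"PK\x03\x04"):
        # OOXML files are ZIP archives; other ZIP-based formats match too.
        return "ZIP container (possibly OOXML)"
    if head.startswith(b"{\\rt"):
        # Malicious RTF often truncates the "{\rtf1" header to "{\rt".
        return "RTF"
    if head.startswith(ONENOTE_MAGIC):
        return "OneNote"
    if b"%PDF-" in head:
        # Readers tolerate junk before the PDF header, so search the window.
        return "PDF"
    return "unknown"


def identify_file(path: str) -> str:
    """Hypothetical helper: read the head of a file and classify it."""
    with open(path, "rb") as handle:
        return identify_format(handle.read(1024))
```

Calling identify_file("document.doc") on a renamed RTF file would return "RTF" regardless of the extension.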

Step 2: Structure Analysis

Tools: tools/oledump-py, tools/rtfdump-py, tools/pdfid-py, tools/pdf-parser-py, tools/onedump-py

Parse document internals. For Office: oledump.py to list streams (M = macro). For PDF: pdfid.py for risky keywords (/JavaScript, /OpenAction). For RTF: rtfdump.py for hex-heavy groups.

oledump.py document.docm
rtfdump.py document.rtf
pdfid.py document.pdf
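
The keyword triage that pdfid.py performs can be approximated with a few lines of Python. This is a simplified sketch, not pdfid's implementation: the keyword list is an illustrative subset, and unlike pdfid it does not decode hex-escaped names such as /J#61vaScript.

```python
import re

# Illustrative subset of PDF names worth flagging (assumption for this sketch).
RISKY_NAMES = [b"/JavaScript", b"/JS", b"/OpenAction", b"/AA", b"/Launch", b"/EmbeddedFile"]


def count_risky_names(pdf_bytes: bytes) -> dict:
    """Count occurrences of each risky name in the raw PDF bytes."""
    counts = {}
    for name in RISKY_NAMES:
        # The lookahead stops /JS from matching longer names that start with it.
        pattern = re.escape(name) + rb"(?![A-Za-z0-9])"
        counts[name.decode()] = len(re.findall(pattern, pdf_bytes))
    return counts
```

A non-zero count for /OpenAction together with /JavaScript suggests code that runs when the document opens and is worth following up with pdf-parser.py.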

Step 3: Password Handling (if encrypted)

Tools: tools/msoffcrypto-tool

If the document is password-protected, decrypt it with msoffcrypto-tool -p, passing the password followed by the input and output paths. Common passwords: infected, malware, password, 123456.

msoffcrypto-tool -p infected <encrypted.docx> <decrypted.docx>
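
Trying the common passwords one by one can be scripted. The sketch below wraps the command shown above in a loop; it assumes msoffcrypto-tool exits with a non-zero status when the password is wrong, which you should verify on your installation. The function names and the injectable runner parameter are hypothetical, chosen so the loop can be exercised without the tool installed.

```python
import subprocess
from typing import Callable, List, Optional, Sequence

COMMON_PASSWORDS = ["infected", "malware", "password", "123456"]


def decrypt_command(password: str, encrypted: str, decrypted: str) -> List[str]:
    """Build the argv for one decryption attempt."""
    return ["msoffcrypto-tool", "-p", password, encrypted, decrypted]


def run_quietly(argv: Sequence[str]) -> bool:
    """Run a command and report success (assumes exit code 0 means decrypted)."""
    return subprocess.run(list(argv), capture_output=True).returncode == 0


def find_password(
    encrypted: str,
    decrypted: str,
    passwords: Sequence[str] = COMMON_PASSWORDS,
    runner: Callable[[Sequence[str]], bool] = run_quietly,
) -> Optional[str]:
    """Return the first password whose decryption attempt succeeds, else None."""
    for password in passwords:
        if runner(decrypt_command(password, encrypted, decrypted)):
            return password
    return None
```

For example, find_password("encrypted.docx", "decrypted.docx") returns the working password or None if none of the candidates apply.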

Step 4: Macro/Script Extraction

Tools: tools/oledump-py, tools/olevba, tools/pcode2code, tools/xlmmacrodeobfuscator

Extract VBA: oledump.py -s with a stream number (or a for all streams) plus -v to decompress the VBA source. For p-code: pcode2code. For Excel 4.0 macros: XLMMacroDeobfuscator. Check olevba output for auto-execute triggers (AutoOpen, Document_Open).

oledump.py -s a -v document.docm
olevba document.docm
pcode2code <document.docm>
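
Once the VBA source has been extracted to text, a quick check for auto-execute triggers can be sketched as follows. This is a simplified illustration rather than olevba's detection logic; the trigger list is a partial, assumed set and the function name is hypothetical.

```python
import re

# Partial list of procedure names that Office runs automatically (assumption).
AUTOEXEC_NAMES = [
    "AutoOpen", "AutoExec", "AutoClose", "Auto_Open", "Auto_Close",
    "Document_Open", "Document_Close", "Workbook_Open",
]


def find_autoexec(vba_source: str) -> list:
    """Return the auto-execute procedure names defined in the VBA source."""
    found = []
    for name in AUTOEXEC_NAMES:
        pattern = r"^\s*(?:Public\s+|Private\s+)?Sub\s+" + re.escape(name) + r"\s*\("
        # VBA is case-insensitive, so match names regardless of case.
        if re.search(pattern, vba_source, re.IGNORECASE | re.MULTILINE):
            found.append(name)
    return found
```

Any name returned here marks the entry point to start reading from when tracing what the macro does.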

Step 5: Payload Decoding

Tools: tools/base64dump-py, tools/translate-py, tools/gunzip, tools/numbers-to-string-py, tools/cyberchef

Decode embedded payloads. Common chains: Base64 → gunzip → XOR. Use CyberChef for visual multi-step decoding. translate.py for byte-level transforms (byte ^ key).

base64dump.py file.txt
translate.py "byte ^ 35" < input.bin > output.bin
gunzip -c compressed.gz > output.bin
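
The Base64 → gunzip → XOR chain mentioned above can also be reproduced with the Python standard library, which helps when a sample needs the same decoding applied repeatedly. This is a minimal sketch assuming a single-byte XOR key (the 35 from the translate.py example); real samples may use other orders or multi-byte keys, and the function names are made up for this example.

```python
import base64
import gzip


def xor_bytes(data: bytes, key: int) -> bytes:
    """Apply a single-byte XOR, equivalent to translate.py "byte ^ key"."""
    return bytes(b ^ key for b in data)


def decode_chain(encoded: bytes, key: int) -> bytes:
    """Undo Base64, then gzip, then single-byte XOR, in that order."""
    compressed = base64.b64decode(encoded)
    xored = gzip.decompress(compressed)
    return xor_bytes(xored, key)
```

If gzip.decompress raises an error, the layers are probably in a different order; inspect the intermediate bytes before guessing further.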

Step 6: Embedded Object Analysis

Tools: tools/scdbgc, tools/xorsearch, tools/yara, tools/1768-py

If shellcode found: emulate with scdbgc. Scan for known patterns (YARA). Check for Cobalt Strike beacons (1768.py). Route PE payloads to Static Analysis Workflow.

scdbgc /f shellcode.bin /s -1
XORSearch -W -d 3 file.bin
yara-rules specimen.bin
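
The idea behind searching for XOR-encoded content can be shown with a brute-force sketch: encode a known string with every single-byte key and look for it in the data. This covers only plain XOR and is not XORSearch's implementation, which also handles other encodings; the function name is hypothetical.

```python
def xor_search(data: bytes, needle: bytes) -> list:
    """Return (key, offset) pairs where needle appears XOR-encoded in data."""
    hits = []
    for key in range(256):
        encoded_needle = bytes(b ^ key for b in needle)
        offset = data.find(encoded_needle)
        while offset != -1:
            hits.append((key, offset))
            offset = data.find(encoded_needle, offset + 1)
    return hits
```

Searching for b"http" or b"This program" is a common way to spot an obfuscated URL or embedded PE file; a hit with key 0 means the string is present unencoded.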

Step 7: Document IOCs

Record: embedded URLs, downloaded payload hashes, C2 addresses, macro behavior (which APIs it calls), exploit type (CVE if applicable).
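
Collecting URLs, IP addresses, and payload hashes from decoded output can be partly automated. The sketch below is one possible approach using simple regular expressions, which will miss defanged or obfuscated indicators; the function names are hypothetical.

```python
import hashlib
import re

URL_RE = re.compile(r"https?://[^\s\"'<>)]+", re.IGNORECASE)
OCTET = r"(?:25[0-5]|2[0-4]\d|1?\d?\d)"
IPV4_RE = re.compile(r"\b(?:" + OCTET + r"\.){3}" + OCTET + r"\b")


def extract_iocs(text: str) -> dict:
    """Pull unique URLs and IPv4 addresses out of decoded macro or script text."""
    return {
        "urls": sorted(set(URL_RE.findall(text))),
        "ipv4": sorted(set(IPV4_RE.findall(text))),
    }


def sha256_of(payload: bytes) -> str:
    """Hash an extracted or downloaded payload for the IOC record."""
    return hashlib.sha256(payload).hexdigest()
```

Review the results by hand before recording them, since version strings and similar dotted numbers can match the IPv4 pattern.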

#documents #office #pdf #rtf #macro #onenote #workflow