Add FOR610 tool/workflow knowledge base and data pipeline
Build comprehensive malware analysis knowledge base from 3 sources:
- SANS FOR610 course: 120 tools, 47 labs, 15 workflows, 27 recipes
- REMnux salt-states: 340 packages parsed from GitHub
- REMnux docs: 280+ tools scraped from docs.remnux.org

Master inventory merges all sources into 447 tools with help tiers
(rich/standard/basic).

Pipeline generates: tools.db (397 entries), 397 cheatsheets with
multi-tool recipes, 15 workflow guides, 224 TLDR pages, and coverage
reports.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@@ -0,0 +1,84 @@
============================================================
Malicious Document Analysis
============================================================

Analyze suspicious documents (PDF, Office, RTF, OneNote) for embedded malware, macros, and exploits. Follows Zeltser's 6-step methodology.

Related FOR610 Labs: 3.1, 3.3, 3.4, 3.5

────────────────────────────────────────────────────────────

Step 1: Format Identification
Tools: file, trid
Identify the true format: OLE2 (legacy Office), OOXML
(modern Office), RTF, PDF, or OneNote. Don't trust the
file extension; check the magic bytes.

$ file document.doc
$ trid document.doc

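The magic-byte check in Step 1 can be sketched in Python as follows. This is a minimal illustration, not a replacement for file or trid: identify_format and the signature table are hypothetical names, the table is a small subset, and it only looks at offset 0 (a PDF header can sit further into the file, and a ZIP signature alone does not prove the file is OOXML).

```python
# Illustrative subset of signatures; real tools know many more formats.
MAGIC_SIGNATURES = [
    (b"\xd0\xcf\x11\xe0\xa1\xb1\x1a\xe1", "OLE2 (legacy Office)"),
    (b"PK\x03\x04", "ZIP container (possibly OOXML)"),
    (b"{\\rt", "RTF"),
    (b"%PDF-", "PDF"),
    (b"\xe4\x52\x5c\x7b\x8c\xd8\xa7\x4d\xae\xb1\x53\x78\xd0\x29\x96\xd3", "OneNote"),
]


def identify_format(header: bytes) -> str:
    """Return a format label based on the leading bytes of a file."""
    for magic, label in MAGIC_SIGNATURES:
        if header.startswith(magic):
            return label
    return "unknown"


def identify_file(path: str) -> str:
    """Read the first 16 bytes of a file and classify them."""
    with open(path, "rb") as handle:
        return identify_format(handle.read(16))
```

Usage: identify_file("document.doc") returns a label regardless of what the extension claims.
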
Step 2: Structure Analysis
Tools: oledump-py, rtfdump-py, pdfid-py, pdf-parser-py, onedump-py
Parse document internals. For Office: oledump.py to
list streams (M = macro). For PDF: pdfid.py for risky
keywords (/JavaScript, /OpenAction). For RTF:
rtfdump.py for hex-heavy groups.

$ oledump.py document.docm
$ rtfdump.py document.rtf
$ pdfid.py document.pdf
$ pdf-parser.py document.pdf -a

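The idea behind the PDF keyword triage in Step 2 can be sketched as a plain byte search. This is a simplified stand-in, not how pdfid.py is implemented: count_pdf_keywords is a hypothetical name, the keyword list beyond /JavaScript and /OpenAction is an assumption, and the sketch does not handle hex-escaped names such as /J#61vaScript.

```python
import re

# /JavaScript and /OpenAction come from the step above; the rest are assumed additions.
RISKY_KEYWORDS = ["/JavaScript", "/JS", "/OpenAction", "/AA", "/Launch", "/EmbeddedFile"]


def count_pdf_keywords(data: bytes) -> dict[str, int]:
    """Count occurrences of each risky PDF name in raw file bytes."""
    counts = {}
    for keyword in RISKY_KEYWORDS:
        # The lookahead stops /AA from matching inside a longer name such as /AAbc.
        pattern = re.escape(keyword.encode()) + rb"(?![A-Za-z0-9])"
        counts[keyword] = len(re.findall(pattern, data))
    return counts
```

Any nonzero count is a reason to inspect the referenced objects with pdf-parser.py.
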
Step 3: Password Handling (if encrypted)
Tools: msoffcrypto-tool
If the document is password-protected, decrypt it with
msoffcrypto-tool -p <password> <input> <output>.
Common passwords to try: infected, malware, password,
123456.

$ msoffcrypto-tool -p infected encrypted.docx decrypted.docx

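Trying the common passwords from Step 3 in order can be sketched like this. The helper first_working_password and its try_password callback are hypothetical: the callback stands for whatever decryption attempt is in use and is assumed to return True on success.

```python
from typing import Callable, Iterable, Optional

# Candidate list taken from the step above.
COMMON_PASSWORDS = ["infected", "malware", "password", "123456"]


def first_working_password(
    try_password: Callable[[str], bool],
    candidates: Iterable[str] = COMMON_PASSWORDS,
) -> Optional[str]:
    """Return the first candidate accepted by try_password, or None."""
    for candidate in candidates:
        if try_password(candidate):
            return candidate
    return None
```

If no candidate works, the password usually has to come from the delivery context, for example the email body that carried the document.
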
Step 4: Macro/Script Extraction
Tools: oledump-py, olevba, pcode2code, XLMMacroDeobfuscator
Extract VBA: oledump.py -s <stream> -v. For p-code:
pcode2code. For Excel 4.0 macros:
XLMMacroDeobfuscator. Check the olevba output for
auto-execute triggers (AutoOpen, Document_Open).

$ oledump.py -s A3 -v document.docm
$ olevba document.docm
$ pcode2code document.docm
$ xlmdeobfuscator --file spreadsheet.xlsm

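Once VBA source has been extracted, the auto-execute check from Step 4 can be approximated with a regex scan. This is a rough sketch, not olevba's detection logic: find_autoexec_triggers is a hypothetical name, and the trigger list beyond AutoOpen and Document_Open is an assumed, incomplete set.

```python
import re

# AutoOpen and Document_Open come from the step above; the rest are assumed additions.
AUTOEXEC_NAMES = [
    "AutoOpen", "Document_Open", "AutoExec", "AutoClose",
    "Document_Close", "Auto_Open", "Workbook_Open",
]


def find_autoexec_triggers(vba_source: str) -> list[str]:
    """Return the auto-execute procedure names declared in VBA source text."""
    found = []
    for name in AUTOEXEC_NAMES:
        # Match 'Sub <name>' at the start of a line, with an optional scope keyword.
        pattern = rf"^\s*(?:Public\s+|Private\s+)?Sub\s+{re.escape(name)}\b"
        if re.search(pattern, vba_source, re.IGNORECASE | re.MULTILINE):
            found.append(name)
    return found
```

Usage: feed it the text printed by oledump.py -s <stream> -v and start reading the macro at whichever procedures it reports.
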
Step 5: Payload Decoding
Tools: base64dump-py, translate-py, gunzip, numbers-to-string-py, cyberchef
Decode embedded payloads. Common chains: Base64 →
gunzip → XOR. Use CyberChef for visual multi-step
decoding. Use translate.py for byte-level transforms
(byte ^ key).

$ base64dump.py file.txt
$ translate.py "byte ^ 35" < input.bin > output.bin
$ gunzip -c compressed.gz > output.bin
$ oledump.py doc.docm -s A3 -v | numbers-to-string.py -j
$ cyberchef

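The Base64 → gunzip → XOR chain from Step 5 can be sketched with the Python standard library alone. The function names are hypothetical, and a single-byte key is assumed, mirroring the byte ^ 35 example above; real samples may reorder the stages or use multi-byte keys.

```python
import base64
import gzip


def xor_bytes(data: bytes, key: int) -> bytes:
    """XOR every byte with a single-byte key (same transform as 'byte ^ key')."""
    return bytes(b ^ key for b in data)


def decode_chain(encoded: bytes, key: int) -> bytes:
    """Undo a Base64 -> gzip -> XOR encoding, in that order."""
    compressed = base64.b64decode(encoded)
    xored = gzip.decompress(compressed)
    return xor_bytes(xored, key)
```

Keeping each stage as its own function makes it easy to rearrange the chain when a sample applies the layers in a different order.
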
Step 6: Embedded Object Analysis
Tools: scdbgc, xorsearch, yara, 1768-py
If shellcode is found, emulate it with scdbgc. Scan for
known patterns (YARA). Check for Cobalt Strike beacons
(1768.py). Route PE payloads to the Static Analysis
Workflow.

$ scdbgc /f shellcode.bin /s -1
$ XORSearch -W -d 3 file.bin
$ yara-rules specimen.bin
$ 1768.py shellcode.bin

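The simplest form of the search that XORSearch performs, looking for a known string under every single-byte XOR key, can be sketched as below. This only illustrates the idea: xor_search is a hypothetical name, and the sketch covers neither the other encodings XORSearch tries nor the wildcard shellcode rules enabled by -W.

```python
def xor_search(data: bytes, needle: bytes) -> list[tuple[int, int]]:
    """Return (key, offset) pairs where needle appears XOR-encoded with a one-byte key."""
    hits = []
    for key in range(256):
        encoded = bytes(b ^ key for b in needle)
        offset = data.find(encoded)
        while offset != -1:
            hits.append((key, offset))
            offset = data.find(encoded, offset + 1)
    return hits
```

A hit with key 0 means the string is present in the clear; any other key is a candidate for decoding the surrounding bytes.
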
Step 7: Document IOCs
Record: embedded URLs, downloaded payload hashes, C2
addresses, macro behavior (which APIs are called),
exploit type (CVE if applicable).

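Part of the record keeping in Step 7 can be automated. The sketch below hashes a decoded payload and pulls out URL-like strings; collect_iocs and the URL pattern are hypothetical and deliberately loose, so results still need manual review.

```python
import hashlib
import re

# Loose pattern: stops at whitespace, quotes, angle brackets, or a NUL byte.
URL_PATTERN = re.compile(rb"https?://[^\s\x22\x27<>\x00]+")


def collect_iocs(payload: bytes) -> dict:
    """Return the SHA-256, size, and sorted unique URLs found in a payload."""
    urls = sorted({m.decode("ascii", "replace") for m in URL_PATTERN.findall(payload)})
    return {
        "sha256": hashlib.sha256(payload).hexdigest(),
        "size": len(payload),
        "urls": urls,
    }
```

C2 addresses, macro behavior, and exploit type still have to be recorded by hand from the earlier steps.
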
────────────────────────────────────────────────────────────
Tip: 'fhelp cheat <tool>' for full examples
     'Ctrl+G' for interactive cheatsheet browser