Add FOR610 tool/workflow knowledge base and data pipeline
Build comprehensive malware analysis knowledge base from 3 sources:

- SANS FOR610 course: 120 tools, 47 labs, 15 workflows, 27 recipes
- REMnux salt-states: 340 packages parsed from GitHub
- REMnux docs: 280+ tools scraped from docs.remnux.org

Master inventory merges all sources into 447 tools with help tiers (rich/standard/basic).

Pipeline generates: tools.db (397 entries), 397 cheatsheets with multi-tool recipes, 15 workflow guides, 224 TLDR pages, and coverage reports.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@@ -0,0 +1,62 @@
# FOR610 Knowledge Base

Structured data extracted from the SANS FOR610 (Reverse-Engineering Malware) course materials.

## Files

| File | Description |
|------|-------------|
| `categories.yaml` | Tool category taxonomy (18 categories) |
| `tools.yaml` | Master tool catalog (~110 tools with metadata) |
| `labs.yaml` | All 47 labs with ordered tool sequences |
| `workflows.yaml` | 8 high-level analysis workflow patterns |

## Schema

### tools.yaml

Each tool entry contains:

- `id`: unique kebab-case identifier (used for cross-references)
- `name`: display name as typed on the command line
- `aliases`: alternative names
- `description`: one-line description
- `category`: ID of a category defined in `categories.yaml`
- `platform`: `linux` | `windows` | `both` | `online`
- `in_remnux`: boolean; whether the tool is available in the REMnux container
- `labs`: list of lab IDs that use this tool
- `typical_usage`: 1-3 command examples
- `for610_sections`: course sections that cover this tool
- `tags`: free-form search tags
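
The field list above can be checked mechanically. The following is a minimal sketch, not the pipeline's actual validator: it assumes each entry has already been parsed from YAML into a Python `dict`, that every listed field is required, and that list-valued fields parse to Python lists. The names `validate_tool`, `TOOL_FIELDS`, and `example_tool`, and every value in the sample entry, are hypothetical placeholders.

```python
import re

# Allowed values for `platform`, taken from the field list above.
PLATFORMS = {"linux", "windows", "both", "online"}

# Field name -> expected Python type after YAML parsing.
# Assumption: every field is required and lists parse to `list`.
TOOL_FIELDS = {
    "id": str,
    "name": str,
    "aliases": list,
    "description": str,
    "category": str,
    "platform": str,
    "in_remnux": bool,
    "labs": list,
    "typical_usage": list,
    "for610_sections": list,
    "tags": list,
}

# Lowercase alphanumeric words joined by single hyphens.
KEBAB_CASE = re.compile(r"^[a-z0-9]+(-[a-z0-9]+)*$")


def validate_tool(tool: dict) -> list[str]:
    """Return the problems found in one tool entry (empty list if none)."""
    problems = []
    for field, expected in TOOL_FIELDS.items():
        if field not in tool:
            problems.append(f"missing field: {field}")
        elif not isinstance(tool[field], expected):
            problems.append(f"{field}: expected {expected.__name__}")

    tool_id = tool.get("id")
    if isinstance(tool_id, str) and not KEBAB_CASE.match(tool_id):
        problems.append("id: not kebab-case")

    platform = tool.get("platform")
    if isinstance(platform, str) and platform not in PLATFORMS:
        problems.append("platform: not one of linux, windows, both, online")

    usage = tool.get("typical_usage")
    if isinstance(usage, list) and not 1 <= len(usage) <= 3:
        problems.append("typical_usage: expected 1-3 examples")
    return problems


# Hypothetical entry; none of these values come from the real catalog.
example_tool = {
    "id": "example-tool",
    "name": "example-tool",
    "aliases": [],
    "description": "Placeholder entry used only to exercise the validator",
    "category": "example-category",
    "platform": "linux",
    "in_remnux": True,
    "labs": ["3.1"],
    "typical_usage": ["example-tool sample.bin"],
    "for610_sections": [3],
    "tags": ["example"],
}

print(validate_tool(example_tool))
```

Returning a list of problems rather than raising on the first one lets a whole catalog be checked in a single pass.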

### labs.yaml

Each lab entry contains:

- `id`: lab number (e.g., "3.1")
- `section`: course section (1-5)
- `title`: full lab title
- `sample`: malware specimen analyzed
- `analysis_type`: type of analysis, drawn from a controlled vocabulary
- `tools_used`: **ordered** list of entries, each with `tool_id`, `platform`, and `purpose`
- `key_techniques`: techniques demonstrated
- `prerequisite_labs`: lab IDs this lab depends on (optional)
- `tags`: free-form search tags
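
Because `tools_used` is ordered, a lab's tool sequence can be read straight off the list. The sketch below assumes a lab entry has been parsed into a Python `dict` and that `section` parses to an integer; `tool_sequence`, `check_lab`, and all values in `example_lab` are hypothetical and only illustrate the shape described above.

```python
# Keys each `tools_used` entry is expected to carry, per the field list above.
STEP_KEYS = {"tool_id", "platform", "purpose"}


def tool_sequence(lab: dict) -> list[str]:
    """Return the lab's tool IDs in the order they are used.

    Repeated tools are kept, since a lab may return to the same tool.
    """
    return [step["tool_id"] for step in lab["tools_used"]]


def check_lab(lab: dict) -> list[str]:
    """Return the problems found in one lab entry (empty list if none)."""
    problems = []
    section = lab.get("section")
    if type(section) is not int or not 1 <= section <= 5:
        problems.append("section: expected an integer from 1 to 5")
    for index, step in enumerate(lab.get("tools_used", [])):
        missing = STEP_KEYS - set(step)
        if missing:
            problems.append(f"tools_used[{index}]: missing {sorted(missing)}")
    return problems


# Hypothetical lab; the title, sample, and tool IDs are placeholders.
example_lab = {
    "id": "3.1",
    "section": 3,
    "title": "Placeholder lab title",
    "sample": "placeholder-sample.bin",
    "analysis_type": "placeholder-type",
    "tools_used": [
        {"tool_id": "example-tool-a", "platform": "linux", "purpose": "first pass"},
        {"tool_id": "example-tool-b", "platform": "windows", "purpose": "follow-up"},
        {"tool_id": "example-tool-a", "platform": "linux", "purpose": "confirm result"},
    ],
    "key_techniques": ["placeholder technique"],
    "tags": ["example"],
}

print(tool_sequence(example_lab))
print(check_lab(example_lab))
```

`prerequisite_labs` is left out of the sample on purpose, since the field is optional.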

### workflows.yaml

Each workflow contains ordered steps with tool references and related labs.

## Generating JSON

```bash
make generate-data
```

This converts all YAML files to JSON under `data/generated/` using `yq`.

## Cross-Reference Integrity

- Every `tools_used[].tool_id` in `labs.yaml` must exist as an `id` in `tools.yaml`.
- Every lab ID in a `labs[]` list in `tools.yaml` must exist as an `id` in `labs.yaml`.
- Every `category` in `tools.yaml` must exist in `categories.yaml`.
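
The three rules above amount to set-membership checks. What follows is a minimal sketch rather than the repository's own checker: it assumes each file has been parsed into a list of `dict` entries, that category entries carry an `id` key (the schema of `categories.yaml` is not described here), and that lab IDs are stored as strings such as `"3.1"`. The function name `check_references` and the sample data are hypothetical.

```python
def check_references(categories: list, tools: list, labs: list) -> list[str]:
    """Return one message per dangling reference across the three files."""
    # Assumption: category entries expose their identifier under `id`.
    category_ids = {category["id"] for category in categories}
    tool_ids = {tool["id"] for tool in tools}
    lab_ids = {lab["id"] for lab in labs}

    errors = []
    # Rule 1: labs.yaml tools_used[].tool_id must exist in tools.yaml.
    for lab in labs:
        for step in lab["tools_used"]:
            if step["tool_id"] not in tool_ids:
                errors.append(f"lab {lab['id']}: unknown tool {step['tool_id']}")
    for tool in tools:
        # Rule 2: tools.yaml labs[] must exist in labs.yaml.
        for lab_id in tool["labs"]:
            if lab_id not in lab_ids:
                errors.append(f"tool {tool['id']}: unknown lab {lab_id}")
        # Rule 3: tools.yaml category must exist in categories.yaml.
        if tool["category"] not in category_ids:
            errors.append(f"tool {tool['id']}: unknown category {tool['category']}")
    return errors


# Hypothetical data with one violation of each rule.
categories = [{"id": "example-category"}]
tools = [
    {"id": "example-tool-a", "category": "example-category", "labs": ["3.1"]},
    {"id": "example-tool-b", "category": "missing-category", "labs": ["9.9"]},
]
labs = [
    {
        "id": "3.1",
        "tools_used": [
            {"tool_id": "example-tool-a"},
            {"tool_id": "example-tool-c"},
        ],
    }
]

for message in check_references(categories, tools, labs):
    print(message)
```

If lab IDs are left unquoted in YAML, a parser may read `3.1` as a float, so quoting them keeps these comparisons string-to-string.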