- Created new Dockerfile.remnux based on remnux/remnux-distro:latest
- Added comprehensive tool testing suite (test-tools.sh, test-containers.sh)
- Tool comparison analysis shows we get all original tools plus additional ones from REMnux:
  * Additional PDF tools: qpdf, pdfresurrect, pdftool, base64dump, tesseract
  * All original tools preserved: pdfid.py, pdf-parser.py, peepdf, origami, capa, box-js, visidata, unfurl
- Updated README.md with new usage instructions
- Updated WARP.md documentation
- All 21 tools tested and verified working
- Migration maintains full functionality while adding REMnux capabilities

# WARP.md

This file provides guidance to WARP (warp.dev) when working with code in this repository.

## Project Overview

This repository contains a Docker-based file analysis toolkit, primarily focused on PDF and malware analysis. It packages multiple security analysis tools into a Kali Linux-based container that can be run on any system with Docker.

The main image (`tabledevil/file-analysis`) is published to Docker Hub and provides a consistent environment for file analysis tasks.

## Core Architecture

- **Base Image**: Kali Linux rolling release
- **Primary Use Case**: Analyzing potentially malicious files (PDFs, Office docs, executables)
- **Execution Model**: The container runs with a host directory mounted at `/data` for file access
- **User Security**: Runs as the unprivileged `nonroot` user (UID 1001) for security isolation

## Development Commands

### Building the Container

```bash
docker build -t tabledevil/file-analysis .
```

### Running the Container

```bash
# Standard usage - mounts current directory
docker run -it --rm -v "$(pwd):/data" tabledevil/file-analysis

# Run specific command without interactive shell
docker run --rm -v "$(pwd):/data" tabledevil/file-analysis pdfid.py suspicious.pdf
```

### Testing Container Functionality

```bash
# Verify installed tools are accessible
docker run --rm tabledevil/file-analysis which pdfid.py
docker run --rm tabledevil/file-analysis which peepdf
docker run --rm tabledevil/file-analysis capa --version
```

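The individual checks above can be folded into a loop. The following is a minimal sketch, not part of the repository: `check_tools` and `run_in_image` are hypothetical helper names, and `RUNNER` is an assumed override hook that lets the loop be exercised without Docker.

```bash
#!/usr/bin/env bash
# Sketch: report which expected tools resolve on PATH inside the image.
# IMAGE matches the image name used in this document; RUNNER is a
# hypothetical hook for substituting another command runner.
IMAGE="${IMAGE:-tabledevil/file-analysis}"

# Run a command in a throwaway container (default) or via $RUNNER.
run_in_image() {
  if [ -n "${RUNNER:-}" ]; then
    "$RUNNER" "$@"
  else
    docker run --rm "$IMAGE" "$@"
  fi
}

# Print "ok <tool>" or "MISSING <tool>" per tool; return 1 if any is missing.
check_tools() {
  local status=0 tool
  for tool in "$@"; do
    if run_in_image which "$tool" >/dev/null 2>&1; then
      echo "ok $tool"
    else
      echo "MISSING $tool"
      status=1
    fi
  done
  return "$status"
}
```

A call such as `check_tools pdfid.py pdf-parser.py peepdf capa box-js exiftool` starts one container per tool, which is slow but keeps each check independent.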
## Key Tools and Usage Patterns

The container includes specialized analysis tools:

**PDF Analysis Suite:**
- `pdfid.py` - Quick PDF structure overview
- `pdf-parser.py` - Extract and analyze PDF elements
- `peepdf` - Interactive PDF analysis with JavaScript detection
- `pdftk` - PDF manipulation and flattening
- Origami suite (`pdfcop`, `pdfextract`, `pdfmetadata`)

**Malware Analysis:**
- `capa` - Malware capability detection
- `box-js` - JavaScript sandbox analysis
- `oledump.py`, `rtfdump.py`, `emldump.py` - Office document analysis
- `visidata` - Data exploration and analysis

**File Format Tools:**
- `exiftool` - Metadata extraction
- `catdoc`, `docx2txt` - Document conversion
- `unrtf` - RTF processing
- ImageMagick - Image processing (PDF policy modified for read/write)

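To illustrate how the tool lists above might drive a first triage step, here is a small sketch that picks a starting tool from a file name. The function name `suggest_tool` and the extension-to-tool mapping are illustrative assumptions, not project policy, and file extensions on suspicious samples are not trustworthy, so treat the result as a hint only.

```bash
# Sketch: suggest a first-pass tool from the lists above based on the extension.
suggest_tool() {
  local name ext
  name="${1##*/}"       # strip leading directories
  ext="${name##*.}"     # text after the last dot (whole name if there is no dot)
  ext="$(printf '%s' "$ext" | tr '[:upper:]' '[:lower:]')"
  case "$ext" in
    pdf)         echo "pdfid.py" ;;
    doc|xls|ppt) echo "oledump.py" ;;
    rtf)         echo "rtfdump.py" ;;
    eml)         echo "emldump.py" ;;
    js)          echo "box-js" ;;
    exe|dll)     echo "capa" ;;
    *)           echo "exiftool" ;;   # fall back to metadata extraction
  esac
}
```

For example, `docker run --rm -v "$(pwd):/data" tabledevil/file-analysis "$(suggest_tool sample.rtf)" sample.rtf` would run `rtfdump.py` on the sample.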
## Environment Configuration

- **Timezone**: Europe/Berlin
- **Python**: pip installations use `--break-system-packages`, because the Kali base marks the system Python environment as externally managed
- **PATH**: Extended to include `/opt/didierstevenssuite/` and the pypy binaries
- **Working Directory**: `/data` (expected mount point)

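One way to confirm that a built image matches these settings is to compare values reported from inside the container against the documented ones. This is a sketch under assumptions: `check_env` is a hypothetical helper, and the expected values (UID 1001, `/data`, `/opt/didierstevenssuite/`) are taken from this document rather than read from the Dockerfile.

```bash
# Sketch: compare runtime values against the settings documented above.
# Arguments: the UID, working directory and PATH reported by the container.
check_env() {
  local uid="$1" workdir="$2" path="$3" status=0
  [ "$uid" = "1001" ]      || { echo "unexpected UID: $uid"; status=1; }
  [ "$workdir" = "/data" ] || { echo "unexpected working directory: $workdir"; status=1; }
  case ":$path:" in
    *:/opt/didierstevenssuite:*|*:/opt/didierstevenssuite/:*) ;;
    *) echo "PATH lacks /opt/didierstevenssuite/"; status=1 ;;
  esac
  return "$status"
}

# Possible usage (requires Docker and bash 4+ for mapfile):
#   mapfile -t v < <(docker run --rm tabledevil/file-analysis sh -c 'id -u; pwd; echo "$PATH"')
#   check_env "${v[0]}" "${v[1]}" "${v[2]}" && echo "environment matches"
```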
## Development Guidelines

### Docker Best Practices Applied
- Multi-stage approach with dependency installation
- Non-root user execution
- Minimized layer count
- Package caches cleaned up after installation

### Tool Integration
- Didier Stevens suite tools are cloned from GitHub and made executable
- Python tools are installed via system pip and via pipx (for isolation)
- Ruby gems (Origami) are installed system-wide
- npm packages are installed globally for JavaScript analysis

### Security Considerations
- The container runs as an unprivileged user
- The ImageMagick PDF policy is relaxed only for necessary operations
- File analysis happens in an isolated container environment

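When handling untrusted samples, the isolation described above can be tightened further at run time. The sketch below is a suggestion, not the project's documented invocation: `run_hardened` is a hypothetical wrapper, `DOCKER` is an assumed override for dry runs, and although the flags are standard `docker run` options, whether every tool in the image works under them has not been verified here.

```bash
# Sketch: run a tool with no network, no capabilities and a read-only sample mount.
IMAGE="${IMAGE:-tabledevil/file-analysis}"

run_hardened() {
  local sample_dir="$1"; shift   # absolute host path containing the samples
  "${DOCKER:-docker}" run --rm \
    --network none \
    --cap-drop ALL \
    --security-opt no-new-privileges \
    -v "${sample_dir}:/data:ro" \
    "$IMAGE" "$@"
}
```

Running `DOCKER=echo run_hardened "$(pwd)" pdfid.py suspicious.pdf` prints the arguments instead of starting a container. Because `/data` is mounted read-only in this sketch, tools that write artifacts would need a separate writable mount.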
## File Structure

- `Dockerfile` - Main container build configuration
- `files/README` - German-language tool documentation for container users
- `files/command_help` - Detailed usage examples for PDF analysis tools
- `pip.conf` - Python package installation optimization settings

## Common Workflow

1. Place suspicious files in a directory
2. Run the container with that directory mounted to `/data`
3. Use the appropriate analysis tools for the file type
4. Extract results and artifacts to the mounted directory
5. On exit, Docker removes the container automatically when it was started with `--rm`

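The steps above can be sketched as a small host-side script for PDF samples. This is an illustration under assumptions: `triage_pdfs` is a hypothetical helper, `DOCKER` is an assumed override for dry runs, `sha256sum` is assumed to be available on the host (GNU coreutils), and the directory must be given as an absolute path for the bind mount.

```bash
# Sketch: hash each PDF on the host, then save a pdfid.py report next to it.
IMAGE="${IMAGE:-tabledevil/file-analysis}"

triage_pdfs() {
  local dir="$1" f name
  for f in "$dir"/*.pdf; do
    [ -e "$f" ] || continue                  # the glob matched nothing
    name="${f##*/}"
    sha256sum "$f" > "$dir/$name.sha256"     # record the sample hash first
    # The report is captured from stdout on the host, so the container
    # user does not need write access to the mounted directory.
    "${DOCKER:-docker}" run --rm -v "$dir:/data" "$IMAGE" pdfid.py "$name" \
      > "$dir/$name.pdfid.txt"
  done
}
```

Calling `triage_pdfs "$(pwd)/samples"` leaves a `.sha256` and a `.pdfid.txt` file beside each PDF, and `--rm` takes care of the cleanup in step 5.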
The container is designed for security researchers and incident response teams who need a standardized, portable environment for file analysis without installing potentially dangerous tools on their host systems.