Migrate from Kali to REMnux base image

- Created new Dockerfile.remnux based on remnux/remnux-distro:latest
- Added comprehensive tool testing suite (test-tools.sh, test-containers.sh)
- Tool comparison analysis shows we get all original tools plus additional ones from REMnux:
  * Additional PDF tools: qpdf, pdfresurrect, pdftool, base64dump, tesseract
  * All original tools preserved: pdfid.py, pdf-parser.py, peepdf, origami, capa, box-js, visidata, unfurl
- Updated README.md with new usage instructions
- Updated WARP.md documentation
- All 21 tools tested and verified working
- Migration maintains full functionality while adding REMnux capabilities
@@ -0,0 +1,106 @@
# WARP.md

This file provides guidance to WARP (warp.dev) when working with code in this repository.

## Project Overview

This repository contains a Docker-based file analysis toolkit, primarily focused on PDF and malware analysis. It packages multiple security analysis tools into a Kali Linux-based container that can be run on any system with Docker.

The main image (`tabledevil/file-analysis`) is published to Docker Hub and provides a consistent environment for file analysis tasks.

## Core Architecture

- **Base Image**: Kali Linux rolling release
- **Primary Use Case**: Analyzing potentially malicious files (PDFs, Office docs, executables)
- **Execution Model**: Container runs with mounted host directory (`/data`) for file access
- **User Security**: Runs as non-privileged `nonroot` user (UID 1001) for security isolation

## Development Commands

### Building the Container
```bash
docker build -t tabledevil/file-analysis .
```

### Running the Container
```bash
# Standard usage - mounts current directory
docker run -it --rm -v "$(pwd):/data" tabledevil/file-analysis

# Run specific command without interactive shell
docker run --rm -v "$(pwd):/data" tabledevil/file-analysis pdfid.py suspicious.pdf
```
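
When the samples live somewhere other than the current directory, the two invocations above can be wrapped in a small helper. The following is a minimal sketch, not part of the repository: the function name `fa_run` and the `DRY_RUN` switch are hypothetical and introduced here only for illustration. It uses the same `docker run` flags as the commands above.

```bash
# Hypothetical helper (assumption, not shipped with this repository).
IMAGE="tabledevil/file-analysis"

fa_run() {
  if [ "$#" -lt 1 ]; then
    echo "usage: fa_run DIR [COMMAND...]" >&2
    return 2
  fi
  # Bind mounts need an absolute host path.
  abs_dir="$(cd "$1" && pwd)" || return 1
  shift
  if [ "$#" -eq 0 ]; then
    # No command given: open the interactive shell, as in the standard usage.
    set -- docker run -it --rm -v "${abs_dir}:/data" "$IMAGE"
  else
    # Command given: run it non-interactively and exit.
    set -- docker run --rm -v "${abs_dir}:/data" "$IMAGE" "$@"
  fi
  if [ "${DRY_RUN:-0}" = "1" ]; then
    # Hypothetical dry-run switch: print the command instead of running it.
    echo "$*"
  else
    "$@"
  fi
}
```

For example, `fa_run ~/samples pdfid.py suspicious.pdf` would analyze a file in `~/samples`, and setting `DRY_RUN=1` beforehand prints the resulting `docker run` line for inspection instead of executing it.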

### Testing Container Functionality
```bash
# Verify installed tools are accessible
docker run --rm tabledevil/file-analysis which pdfid.py
docker run --rm tabledevil/file-analysis which peepdf
docker run --rm tabledevil/file-analysis capa --version
```
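
Checking tools one at a time does not scale well as the tool list grows. As a rough sketch, the same check can be expressed as a loop. The function name `check_tools` is hypothetical and not part of the repository, and it is independent of the `test-tools.sh` script mentioned in the commit message, whose contents are not shown here.

```bash
# Hypothetical check function (assumption, not shipped with this repository).
# Reports each tool as present or missing on the current PATH.
check_tools() {
  missing=0
  for tool in "$@"; do
    if command -v "$tool" >/dev/null 2>&1; then
      echo "ok       $tool"
    else
      echo "MISSING  $tool"
      missing=$((missing + 1))
    fi
  done
  # Non-zero exit status when at least one tool is absent.
  [ "$missing" -eq 0 ]
}
```

Saved to a file in the mounted directory together with a final line such as `check_tools pdfid.py pdf-parser.py peepdf capa box-js`, it could be run inside the container with `docker run --rm -v "$(pwd):/data" tabledevil/file-analysis sh /data/check-tools.sh`, where the file name `check-tools.sh` is likewise only an example.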

## Key Tools and Usage Patterns

The container includes specialized analysis tools:

**PDF Analysis Suite:**
- `pdfid.py` - Quick PDF structure overview
- `pdf-parser.py` - Extract and analyze PDF elements
- `peepdf` - Interactive PDF analysis with JavaScript detection
- `pdftk` - PDF manipulation and flattening
- Origami suite (`pdfcop`, `pdfextract`, `pdfmetadata`)

**Malware Analysis:**
- `capa` - Malware capability detection
- `box-js` - JavaScript sandbox analysis
- `oledump.py`, `rtfdump.py`, `emldump.py` - Office document analysis
- `visidata` - Data exploration and analysis

**File Format Tools:**
- `exiftool` - Metadata extraction
- `catdoc`, `docx2txt` - Document conversion
- `unrtf` - RTF processing
- ImageMagick - Image processing (PDF policy modified for read/write)

## Environment Configuration

- **Timezone**: Europe/Berlin
- **Python**: Uses `--break-system-packages` for pip installations due to Kali base
- **PATH**: Extended to include `/opt/didierstevenssuite/` and pypy binaries
- **Working Directory**: `/data` (expected mount point)
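
The settings above can be spot-checked from a shell inside the container. The following sketch assumes nothing beyond a POSIX shell. The function name `path_has_dir` is hypothetical, and it accepts the directory with or without a trailing slash because the exact `PATH` entry in the image is not shown here.

```bash
# Hypothetical helper (assumption): succeed if DIR is a whole entry of PATH.
# Usage: path_has_dir DIR [PATH_STRING]   (PATH_STRING defaults to $PATH)
path_has_dir() {
  # Wrap the list in colons so only complete entries match,
  # and tolerate a trailing slash on either side.
  case ":${2:-$PATH}:" in
    *":${1%/}:"* | *":${1%/}/:"*) return 0 ;;
    *) return 1 ;;
  esac
}
```

Inside the container, `path_has_dir /opt/didierstevenssuite && echo "PATH ok"` checks the PATH extension, `pwd` should print `/data` when started as described, and `date +%Z` shows the active timezone abbreviation.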

## Development Guidelines

### Docker Best Practices Applied
- Multi-stage approach with dependency installation
- Non-root user execution
- Minimal layer count optimization
- Proper cleanup of package caches

### Tool Integration
- Didier Stevens suite tools are cloned from GitHub and made executable
- Python tools installed via both system pip and pipx for isolation
- Ruby gems (Origami) installed system-wide
- npm packages installed globally for JavaScript analysis

### Security Considerations
- Container runs as unprivileged user
- ImageMagick PDF policy relaxed only for necessary operations
- File analysis happens in isolated container environment

## File Structure

- `Dockerfile` - Main container build configuration
- `files/README` - German language tool documentation for container users
- `files/command_help` - Detailed usage examples for PDF analysis tools
- `pip.conf` - Python package installation optimization settings

## Common Workflow

1. Place suspicious files in a directory
2. Run container with that directory mounted to `/data`
3. Use appropriate analysis tools based on file type
4. Extract results and artifacts to the mounted directory
5. Container automatically cleans up on exit
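
Step 3 of the workflow above can be sketched as a simple dispatch on the file extension. The mapping below is an assumption chosen for illustration from the tools listed in this document, not a rule defined by the repository, and the function name `pick_tool` is hypothetical. An extension is only a hint, so the result is a starting point for triage rather than a verdict on the file type.

```bash
# Hypothetical dispatcher (assumption): suggest a first-pass tool by extension.
pick_tool() {
  # Lower-case the name so .PDF and .pdf are treated alike.
  name="$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')"
  case "$name" in
    *.pdf)                 echo "pdfid.py" ;;
    *.doc | *.xls | *.ppt) echo "oledump.py" ;;
    *.rtf)                 echo "rtfdump.py" ;;
    *.eml)                 echo "emldump.py" ;;
    *.js)                  echo "box-js" ;;
    *.exe | *.dll)         echo "capa" ;;
    # Unknown type: fall back to metadata extraction.
    *)                     echo "exiftool" ;;
  esac
}
```

Combined with the documented run command, this gives for example `docker run --rm -v "$(pwd):/data" tabledevil/file-analysis "$(pick_tool sample.pdf)" sample.pdf`.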

The container is designed for security researchers and incident response teams who need a standardized, portable environment for file analysis without installing potentially dangerous tools on their host systems.