# docker_nsrl

Offline **known-file hash filter** for DFIR triage: "is this file part of a known software distribution, or is it unusual and worth a look?"

Backed by **[CIRCL hashlookup](https://www.circl.lu/services/hashlookup/)**: at build time the image downloads CIRCL's `hashlookup-full.bloom` (a SHA-1 Bloom filter covering NIST NSRL **plus** many other known-good sources) and queries it entirely offline. This replaces the original image, which shipped a self-built **MD5** bloom frozen at NSRL RDS 2.72 (March 2021).

Published as `tabledevil/nsrl`. Built and refreshed by [cert_docker_bot](https://git.ktf.ninja/tabledevil/cert_docker_bot) on a monthly cadence.

> **Hashes are SHA-1.** The old image took MD5; the CIRCL dataset is SHA-1.

## Usage

### Look up individual hashes

Prints `+:` for known (in the set) and `-:` for unknown:

```bash
docker run --rm tabledevil/nsrl da39a3ee5e6b4b0d3255bfef95601890afd80709
# +:da39a3ee5e6b4b0d3255bfef95601890afd80709
```

### From stdin (pipe a hash list)

`-s` reads stdin; combine with `-0` (suppress known hits) to print only the **unknown** hashes worth investigating, or `-1` to print only known ones:

```bash
sha1sum /evidence/* | awk '{print $1}' \
  | docker run --rm -i tabledevil/nsrl -s -0
```

`-v` switches to verbose `hash:True|False` output and prints the bloom's source/date header on stderr.

### Analyse a whole directory tree

Runs CIRCL's `hashlookup-forensic-analyser` over a mounted target, hashing every file and emitting CSV (`hashlookup_result,filename,sha1,size`):

```bash
docker run --rm -v /evidence:/data:ro tabledevil/nsrl analyse -d /data
```

Pass any extra `hashlookup-analyser.py` flags after `analyse`.

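
The CSV from directory mode can be narrowed down to the files worth a look. The helper below is a minimal sketch, not part of the image: `filter_unknown` is a hypothetical name, and it assumes the first column (`hashlookup_result`) holds the literal value `unknown` for files not in the set, and that filenames contain no commas. Check both against real analyser output before relying on it.

```shell
# filter_unknown: read analyser CSV on stdin and print the filename column
# for rows whose first column is "unknown".
# ASSUMPTION: "unknown" is the marker value; the header row never matches it.
# ASSUMPTION: filenames contain no commas (this is a naive comma split).
filter_unknown() {
  awk -F',' '$1 == "unknown" { print $2 }'
}
```

If the CSV has been saved to a file (here a hypothetical `result.csv`), `filter_unknown < result.csv` lists the candidate paths one per line.
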
## What's in the image

| path | purpose |
|------|---------|
| `/nsrl/hashlookup-full.bloom` | the SHA-1 Bloom filter (~1 GB), the data payload |
| `/nsrl/bloom.info` | source URL + upstream `Last-Modified` of the bloom |
| `/nsrl/search.py` | single-hash / stdin lookup (Flor bloom reader) |
| `/opt/hfa/` | hashlookup-forensic-analyser (directory mode) |
| `/entrypoint.sh` | dispatches `analyse …` vs hash lookup |

Image size is ~2.4 GB (the bloom dominates).

## Caveats

- **Bloom filters answer "probably yes" / "definitely no."** A `+` match has a small false-positive probability by design; a `-` is authoritative. Treat a hit as "known-good with high confidence," not proof.
- **Upstream freshness.** As of this writing CIRCL's `hashlookup-full.bloom` has not changed since **Oct 2023** (the live API likewise reports `nsrl-version: 2023.09.2`). The monthly rebuild re-fetches the same file until CIRCL republishes. That is fine for "standard OS/app file?" triage, but it is not a bleeding-edge dataset. If you need current data, query the online API at `https://hashlookup.circl.lu/lookup/sha1/` instead.

## Building

```bash
docker build -t tabledevil/nsrl .   # downloads the ~1 GB bloom at build time
```
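
To see which bloom snapshot a build picked up, the recorded source URL and `Last-Modified` value can be read from `/nsrl/bloom.info`. This is a sketch under assumptions: `bloom_info` is a hypothetical helper, it bypasses the image's entrypoint with Docker's `--entrypoint` flag, and it assumes `cat` is available inside the image. The `DOCKER` variable is only there so the command can be swapped out (for example for `podman`).

```shell
# bloom_info: print /nsrl/bloom.info from an image (default: tabledevil/nsrl).
# ASSUMPTION: the image contains a `cat` binary.
# DOCKER can be overridden to use another compatible CLI.
bloom_info() {
  "${DOCKER:-docker}" run --rm --entrypoint cat "${1:-tabledevil/nsrl}" /nsrl/bloom.info
}
```

Run `bloom_info` after a build, or `bloom_info some/other:tag` to inspect a differently tagged image.
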