docker_nsrl
Offline known-file hash filter for DFIR triage — "is this file part of a known software distribution, or is it unusual and worth a look?"
Backed by CIRCL hashlookup:
at build time the image downloads CIRCL's hashlookup-full.bloom (a SHA-1
Bloom filter covering NIST NSRL plus many other known-good sources) and
queries it entirely offline. This replaces the original image, which shipped a
self-built MD5 bloom frozen at NSRL RDS 2.72 (March 2021).
Published as tabledevil/nsrl. Built and refreshed by
cert_docker_bot on a
monthly cadence.
Hashes are SHA-1. The old image took MD5; the CIRCL dataset is SHA-1.
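If the evidence is not on a host with `sha1sum`, the hashes can be produced with Python's standard library. This is a minimal sketch: the helper names `sha1_of_file` and `sha1_of_tree` are illustrative and not part of the image.

```python
import hashlib
from pathlib import Path


def sha1_of_file(path, chunk_size=1024 * 1024):
    """Return the lowercase hex SHA-1 of a file, reading it in chunks."""
    digest = hashlib.sha1()
    with open(path, "rb") as handle:
        # Chunked reads keep memory use flat for large evidence files.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def sha1_of_tree(root):
    """Yield (sha1, path) for every regular file below root."""
    for path in sorted(Path(root).rglob("*")):
        if path.is_file():
            yield sha1_of_file(path), path
```

An empty file hashes to da39a3ee5e6b4b0d3255bfef95601890afd80709, the value used in the lookup example in this README.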
Usage
Look up individual hashes
Prints +: for known (in the set) and -: for unknown:
docker run --rm tabledevil/nsrl da39a3ee5e6b4b0d3255bfef95601890afd80709
# +:da39a3ee5e6b4b0d3255bfef95601890afd80709
From stdin (pipe a hash list)
-s reads stdin; combine with -0 (suppress known hits) to print only the
unknown hashes worth investigating, or -1 to print only known ones:
sha1sum /evidence/* | awk '{print $1}' \
| docker run --rm -i tabledevil/nsrl -s -0
-v switches to verbose hash:True|False output and prints the bloom's
source/date header on stderr.
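For scripted triage, the default `+:` / `-:` output can be split into two lists. The sketch below assumes one result per line and a 40-character hex hash after the marker. `split_lookup_output` is a hypothetical helper, not something shipped in the image, and it does not handle the verbose `-v` format.

```python
import re

# One result per line: "+" or "-", a colon, then a 40-hex-digit SHA-1.
_RESULT_LINE = re.compile(r"^([+-]):([0-9A-Fa-f]{40})$")


def split_lookup_output(text):
    """Split default lookup output into (known, unknown) hash lists."""
    known, unknown = [], []
    for raw in text.splitlines():
        match = _RESULT_LINE.match(raw.strip())
        if match is None:
            continue  # ignore blank lines and anything unrecognised
        marker, digest = match.groups()
        (known if marker == "+" else unknown).append(digest.lower())
    return known, unknown
```

Feed it the captured stdout of `docker run --rm -i tabledevil/nsrl -s` (without `-0` or `-1`) to get both lists in one pass.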
Analyse a whole directory tree
Runs CIRCL's hashlookup-forensic-analyser over a mounted target, hashing
every file and emitting CSV (hashlookup_result,filename,sha1,size):
docker run --rm -v /evidence:/data:ro tabledevil/nsrl analyse -d /data
Pass any extra hashlookup-analyser.py flags after analyse.
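The CSV output can be summarised without assuming which values appear in the `hashlookup_result` column. The sketch below counts rows per distinct value, using the column order given above. Whether the analyser prints a header row is not assumed: a row equal to the column names is skipped if present. `summarise_analysis` is an illustrative name.

```python
import csv
import io
from collections import Counter

# Column order as documented for the analyse output.
FIELDS = ["hashlookup_result", "filename", "sha1", "size"]


def summarise_analysis(csv_text):
    """Count rows per distinct hashlookup_result value."""
    counts = Counter()
    for row in csv.reader(io.StringIO(csv_text)):
        if len(row) != len(FIELDS):
            continue  # skip blank or malformed lines
        if row == FIELDS:
            continue  # skip a header row, if the tool emits one
        counts[row[0]] += 1
    return counts
```

The result is a `Counter` keyed by whatever result labels the analyser emitted, which is usually enough to see how much of a tree needs a closer look.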
What's in the image
| path | purpose |
|---|---|
| /nsrl/hashlookup-full.bloom | the SHA-1 Bloom filter (~1 GB), the data payload |
| /nsrl/bloom.info | source URL + upstream Last-Modified of the bloom |
| /nsrl/search.py | single-hash / stdin lookup (Flor bloom reader) |
| /opt/hfa/ | hashlookup-forensic-analyser (directory mode) |
| /entrypoint.sh | dispatches analyse … vs hash lookup |
Image size is ~2.4 GB (the bloom dominates).
Caveats
- Bloom filters answer "probably yes" / "definitely no." A + match has a small false-positive probability by design; a - is authoritative. Treat a hit as "known-good with high confidence," not proof.
- Upstream freshness. As of this writing CIRCL's hashlookup-full.bloom has not changed since Oct 2023 (the live API likewise reports nsrl-version: 2023.09.2). The monthly rebuild re-fetches the same file until CIRCL republishes. That is fine for "standard OS/app file?" triage, but it is not a bleeding-edge dataset. If you need current data, query the online API at https://hashlookup.circl.lu/lookup/sha1/<hash> instead.
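The "probably yes" / "definitely no" asymmetry follows from how a Bloom filter stores membership. The toy below is only an illustration of the general technique: it is not the Flor format or CIRCL's parameters, and `ToyBloom` with its sizes is made up for the example.

```python
import hashlib


class ToyBloom:
    """A tiny Bloom filter: k hash positions per item in a shared bit array."""

    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, item):
        # Derive num_hashes bit positions by salting SHA-256 with a seed.
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{item}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, item):
        # True only if every position is set. Other items may have set
        # those same bits, which is where false positives come from.
        return all(
            self.bits[pos // 8] & (1 << (pos % 8))
            for pos in self._positions(item)
        )
```

Every added item sets all of its bits, so a later lookup for it can never come back negative. An item that was never added reports a hit only when other entries happen to have set all of its positions, which becomes more likely as the filter fills up.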
Building
docker build -t tabledevil/nsrl . # downloads the ~1 GB bloom at build time