tabledevil eb211f38f4 Add README (CIRCL hashlookup usage + caveats)
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
2026-06-10 14:14:23 +02:00

docker_nsrl

Offline known-file hash filter for DFIR triage — "is this file part of a known software distribution, or is it unusual and worth a look?"

Backed by CIRCL hashlookup: at build time the image downloads CIRCL's hashlookup-full.bloom (a SHA-1 Bloom filter covering NIST NSRL plus many other known-good sources) and queries it entirely offline. This replaces the original image, which shipped a self-built MD5 bloom frozen at NSRL RDS 2.72 (March 2021).

Published as tabledevil/nsrl. Built and refreshed by cert_docker_bot on a monthly cadence.

Input hashes must be SHA-1. The old image took MD5, but the CIRCL Bloom filter is built from SHA-1 hashes, so MD5 values will not match.

Usage

Look up individual hashes

Prints +: for known (in the set) and -: for unknown:

docker run --rm tabledevil/nsrl da39a3ee5e6b4b0d3255bfef95601890afd80709
# +:da39a3ee5e6b4b0d3255bfef95601890afd80709
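If the lookup output is consumed by another script, the +: / -: prefixes are easy to split off. The following is a minimal sketch, assuming only the two prefixes described above and one result per line; parse_lookup_output is a hypothetical helper name and is not part of the image.

```python
from typing import Dict, Iterable


def parse_lookup_output(lines: Iterable[str]) -> Dict[str, bool]:
    """Map each hash to True (known, '+:' prefix) or False (unknown, '-:' prefix)."""
    results: Dict[str, bool] = {}
    for raw in lines:
        line = raw.strip()
        if line.startswith("+:"):
            results[line[2:].lower()] = True
        elif line.startswith("-:"):
            results[line[2:].lower()] = False
        # Any other line (blank, or unexpected noise) is ignored.
    return results


if __name__ == "__main__":
    sample = ["+:da39a3ee5e6b4b0d3255bfef95601890afd80709"]
    print(parse_lookup_output(sample))
```

The sample line is the output shown above for the SHA-1 of empty input.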

From stdin (pipe a hash list)

-s reads stdin; combine with -0 (suppress known hits) to print only the unknown hashes worth investigating, or -1 to print only known ones:

sha1sum /evidence/* | awk '{print $1}' \
  | docker run --rm -i tabledevil/nsrl -s -0
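The shell glob above only covers files directly inside /evidence. If you would rather hash a whole tree before piping it in, something like the following sketch could produce the hash list. It assumes regular files only (symlinks are skipped), and walk_sha1 and sha1_of_file are hypothetical helper names.

```python
import hashlib
import os
import sys
from typing import Iterator, Tuple


def sha1_of_file(path: str, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA-1 of a file, reading it in chunks to bound memory use."""
    digest = hashlib.sha1()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def walk_sha1(root: str) -> Iterator[Tuple[str, str]]:
    """Yield (sha1, path) for every regular, non-symlink file below root."""
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in sorted(filenames):
            path = os.path.join(dirpath, name)
            if os.path.isfile(path) and not os.path.islink(path):
                yield sha1_of_file(path), path


if __name__ == "__main__":
    # Print one hash per line for each directory given on the command line.
    for root in sys.argv[1:]:
        for digest, _path in walk_sha1(root):
            print(digest)
```

Saved under a name of your choosing (walk_sha1.py here is only an example), its output can be piped into the same container invocation: python3 walk_sha1.py /evidence | docker run --rm -i tabledevil/nsrl -s -0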

-v switches to verbose hash:True|False output and prints the bloom's source/date header on stderr.

Analyse a whole directory tree

Runs CIRCL's hashlookup-forensic-analyser over a mounted target, hashing every file and emitting CSV (hashlookup_result,filename,sha1,size):

docker run --rm -v /evidence:/data:ro tabledevil/nsrl analyse -d /data

Pass any extra hashlookup-analyser.py flags after analyse.
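For post-processing, the CSV can be grouped by its first column. This is a rough sketch that assumes the output starts with a header row using the column names listed above; it does not assume which values hashlookup_result takes, and group_by_result is a hypothetical helper name.

```python
import csv
import io
from collections import defaultdict
from typing import Dict, List, TextIO


def group_by_result(csv_stream: TextIO) -> Dict[str, List[str]]:
    """Group filenames by the value of the hashlookup_result column."""
    groups: Dict[str, List[str]] = defaultdict(list)
    for row in csv.DictReader(csv_stream):
        groups[row["hashlookup_result"]].append(row["filename"])
    return dict(groups)


if __name__ == "__main__":
    # Illustrative rows only: the result values "known" and "unknown" and the
    # second row's hash are assumptions, not captured from a real analyser run.
    sample = io.StringIO(
        "hashlookup_result,filename,sha1,size\n"
        "known,/data/a.dll,da39a3ee5e6b4b0d3255bfef95601890afd80709,0\n"
        "unknown,/data/b.bin,0000000000000000000000000000000000000000,12\n"
    )
    for result, names in group_by_result(sample).items():
        print(result, len(names))
```

Grouping rather than filtering on a fixed string keeps the sketch usable even if the analyser reports result values other than the ones in the sample.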

What's in the image

path                          purpose
/nsrl/hashlookup-full.bloom   the SHA-1 Bloom filter (~1 GB), the data payload
/nsrl/bloom.info              source URL + upstream Last-Modified of the bloom
/nsrl/search.py               single-hash / stdin lookup (Flor bloom reader)
/opt/hfa/                     hashlookup-forensic-analyser (directory mode)
/entrypoint.sh                dispatches analyse … vs hash lookup

Image size is ~2.4 GB (the bloom dominates).

Caveats

  • Bloom filters answer "probably yes" / "definitely no." A + match has a small false-positive probability by design; a - is authoritative. Treat a hit as "known-good with high confidence," not proof.
  • Upstream freshness. As of this writing CIRCL's hashlookup-full.bloom has not changed since Oct 2023 (the live API likewise reports nsrl-version: 2023.09.2). The monthly rebuild re-fetches the same file until CIRCL republishes — fine for "standard OS/app file?" triage, but it is not a bleeding-edge dataset. If you need current data, query the online API at https://hashlookup.circl.lu/lookup/sha1/<hash> instead.
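The "probably yes / definitely no" behaviour in the first caveat can be illustrated with a toy Bloom filter. This is only a sketch of the general technique: it is not the Flor format that search.py reads, and the sizes, hash construction, and the name ToyBloom are all made up for illustration.

```python
import hashlib
from typing import Iterator


class ToyBloom:
    """A tiny Bloom filter: k bit positions per item in a fixed-size bit array."""

    def __init__(self, size_bits: int = 1024, num_hashes: int = 3) -> None:
        self.size_bits = size_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(size_bits // 8 + 1)

    def _positions(self, item: str) -> Iterator[int]:
        # Derive num_hashes positions by salting SHA-256 with a seed (toy scheme).
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{item}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.size_bits

    def add(self, item: str) -> None:
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, item: str) -> bool:
        # True only if every position is set; one clear bit proves absence.
        return all(
            self.bits[pos // 8] & (1 << (pos % 8)) for pos in self._positions(item)
        )


if __name__ == "__main__":
    bloom = ToyBloom()
    members = [f"member-{i}" for i in range(200)]
    for item in members:
        bloom.add(item)
    # Added items are always found: there are no false negatives.
    print(all(item in bloom for item in members))
    # Items never added may still match: these are the false positives.
    print(sum(f"other-{i}" in bloom for i in range(1000)), "false positives of 1000")
```

A filter that is too small for its contents reports almost everything as present, which is why a + result should be read as strong evidence rather than proof, while a - result cannot be wrong.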

Building

docker build -t tabledevil/nsrl .   # downloads the ~1 GB bloom at build time