# VisiData Config + Plugins

This folder contains a VisiData `config.py` (symlinked from `visidatarc`) plus a small set of local plugins under `plugins/`.

## Install

The installer links (or copies) the config and plugins into VisiData’s per-user directory.

```bash
./install.sh --link # default, symlinks into place
./install.sh --copy # copies into place
./install.sh --deps # installs optional Python deps into $VD_DIR/plugins-deps
```

On VisiData 3.3, `$VD_DIR` defaults to:
- macOS: `~/Library/Preferences/visidata`
- Linux: `${XDG_CONFIG_HOME:-~/.config}/visidata`

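The two `$VD_DIR` defaults listed for VisiData 3.3 can be expressed as a small function. This is a minimal sketch that mirrors only those two defaults; `default_vd_dir` is a hypothetical helper, not VisiData's own resolution code, and it ignores overrides such as `--visidata-dir`.

```python
from pathlib import Path


def default_vd_dir(platform: str, env: dict, home: Path) -> Path:
    """Return the default VisiData directory for the given platform (hypothetical helper)."""
    if platform == "darwin":
        # macOS: ~/Library/Preferences/visidata
        return home / "Library" / "Preferences" / "visidata"
    # Linux: ${XDG_CONFIG_HOME:-~/.config}/visidata
    # Like the shell ":-" form, fall back when the variable is unset or empty.
    xdg = env.get("XDG_CONFIG_HOME") or str(home / ".config")
    return Path(xdg) / "visidata"
```

Passing `sys.platform`, `dict(os.environ)`, and `Path.home()` gives the directory for the current machine; taking them as parameters keeps the function easy to check in isolation.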
## Plugins

Plugins are installed into `$VD_DIR/plugins/` and imported via the top-level `plugins` package.

## Showcase Demo

This repo includes a self-contained sample dataset + command log to demonstrate the local IOC/IP features:

- `showcase_ioc.tsv` (sample IOC rows)
- `showcase_ioc.vdj` (replay file)

Run it interactively from this repo root:

```bash
vd --visidata-dir "$PWD" --config "$PWD/visidatarc" --play showcase_ioc.vdj
```

What it showcases:
- custom types: `IP`, `Domain`, `URL`, `Hash`
- IP membership expressions: `src_ip * network`
- IP network fields: `src_ip.type`, `src_ip.mask`, `src_ip.range`, `src_ip.broadcast`, `src_ip.identity`, `src_ip.hostcount`, `src_ip.rfc_type`
- URL parsing fields: `url.host`, `url.parts.path`, `url.domain`
- hash classification: `file_hash.kind`
- IP lookups: `src_ip.ipinfo.*`, `src_ip.asn.*`, `src_ip.geo.*`, `src_ip.country()`
- provider visibility: `src_ip.geo.source`, `src_ip.asn.source`, `domain.dns.source`
- domain/network intel: `domain.dns.*`, `domain.rdap.*`, `domain.resolveip`, `domain.resolveips`, `domain.resolveipv4`, `domain.resolveipv6`
- hash intel: `file_hash.mb.*` (MalwareBazaar)
- VirusTotal lookups: `src_ip.vt.*`, `file_hash.vt.*`, `domain.vt.*`, `url.vt.*` (plus `hash.vt.name`, `hash.vt.names`, `hash.vt.score`, `domain.vt.ip`, `domain.vt.ips`)
- local plugin command: `tke-hidecol`

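Hash classification such as `file_hash.kind` can be done purely from the shape of the value. The sketch below is an assumption about how that might work, not the plugin's code: it classifies a hex digest by its length, and the returned labels (`md5`, `sha1`, `sha256`, `sha512`) are illustrative and may differ from what the plugin reports.

```python
import re

# Hex-digest lengths of common hash algorithms (labels are illustrative).
_HEX_LENGTHS = {32: "md5", 40: "sha1", 64: "sha256", 128: "sha512"}
_HEX_RE = re.compile(r"[0-9a-fA-F]+")


def hash_kind(value: str) -> str:
    """Guess the hash algorithm from a hex digest; return "" if unrecognized."""
    text = value.strip()
    if not _HEX_RE.fullmatch(text):
        return ""  # not a pure hex string
    return _HEX_LENGTHS.get(len(text), "")
```

Length alone cannot distinguish algorithms that share a digest size, so a result here is a best guess rather than proof of the algorithm.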
Lookup notes:
- VT columns require `options.tke_vt_api_key` (or `VT_API_KEY` / `VIRUSTOTAL_API_KEY` / `~/.virustotal_api_key`).
- IPInfo/ASN/Geo columns use free providers and may be rate-limited; `options.tke_ipinfo_token` improves reliability.
- To keep replays practical with strict throttling, some heavy lookup columns are intentionally limited to a subset of rows.

### `plugins/hidecol.py`

Adds a command to hide columns that are empty or constant across all rows.

- Command: `tke-hidecol`
- Menu: `Column -> Hide -> empty/superfluous columns`

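The "empty or constant" rule behind `tke-hidecol` can be illustrated without VisiData. This is a sketch under stated assumptions: rows are plain dicts, `None` and `""` count as empty, and `superfluous_columns` is a hypothetical name, not a function from the plugin.

```python
def superfluous_columns(rows, columns):
    """Return the column names whose values are all empty or all identical."""
    if not rows:
        return []  # nothing to compare, so hide nothing
    hidden = []
    for name in columns:
        distinct = set()
        for row in rows:
            value = row.get(name)
            if value is None or value == "":
                value = None  # treat None and "" as the same empty value
            distinct.add(repr(value))  # repr() keeps unhashable values usable
        if len(distinct) <= 1:
            hidden.append(name)
    return hidden
```

A column with a single distinct value carries no information for comparing rows, which is why empty and constant columns are handled by the same test.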
### `plugins/iptype.py`

Adds a custom IP datatype that supports:
- IPv4 + IPv6 addresses
- CIDR networks (e.g. `192.168.7.0/24`)
- Correct sorting (numeric, by version)
- Membership test operator: `ip * net` (and `net * ip`)
- Normalized lookup/enrichment properties, accessible as attributes in expressions

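"Correct sorting (numeric, by version)" means that `10.0.0.9` sorts before `10.0.0.10`, which a plain string sort gets wrong. The sketch below shows one way to build such a sort key with the standard-library `ipaddress` module; `ip_sort_key` is a hypothetical helper and the tie-breaking on prefix length is an assumption, not necessarily what the plugin does.

```python
import ipaddress


def ip_sort_key(value: str):
    """Sort key: IP version first, then numeric address, then prefix length."""
    if "/" in value:
        net = ipaddress.ip_network(value, strict=False)
        return (net.version, int(net.network_address), net.prefixlen)
    addr = ipaddress.ip_address(value)
    # Plain addresses are ordered as single-host networks (/32 or /128).
    return (addr.version, int(addr), addr.max_prefixlen)


values = ["10.0.0.10", "::1", "10.0.0.9", "192.168.7.0/24", "2.2.2.2"]
ordered = sorted(values, key=ip_sort_key)
```

Here `ordered` places all IPv4 values before IPv6 ones and compares addresses as integers rather than as text.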
#### Type + Command

- Type converter: `ip(...)`
- Type name: `IP`
- Command: `type-ip` (sets `cursorCol.type=ip`)

#### Operations

Membership test:
- `ipcol * "192.168.7.0/24"` -> `True`/`False`
- `"192.168.7.0/24" * ipcol` -> `True`/`False`

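Both operand orders of the membership test can work in Python because `str * obj` falls back to the right operand's `__rmul__` when the left side cannot handle it. The class below is a toy stand-in written for illustration (`IPValue` is not the plugin's class); it treats the test as true when either side is contained in the other.

```python
import ipaddress


class IPValue:
    """Toy IP value supporting `ip * net` and `net * ip` membership tests."""

    def __init__(self, text):
        # A bare address becomes a single-host network (/32 or /128).
        self.net = ipaddress.ip_network(str(text), strict=False)

    def _member(self, other):
        other = other if isinstance(other, IPValue) else IPValue(other)
        a, b = self.net, other.net
        if a.version != b.version:
            return False  # IPv4 and IPv6 never contain each other
        return a.subnet_of(b) or b.subnet_of(a)

    def __mul__(self, other):   # ip * "net"
        return self._member(other)

    def __rmul__(self, other):  # "net" * ip
        return self._member(other)


ip = IPValue("192.168.7.15")
inside = ip * "192.168.7.0/24"
inside_reversed = "192.168.7.0/24" * ip
outside = IPValue("10.0.0.1") * "192.168.7.0/24"
```

Defining both `__mul__` and `__rmul__` is what lets the column appear on either side of `*` in an expression.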
#### Attributes (on `IP` typed cells)

Lookup objects expose both normalized fields and raw response data:

- `ipcol.type` (`ipv4`/`ipv6`/`cidr4`/`cidr6`), `ipcol.family`, `ipcol.is_cidr`
- `ipcol.mask`, `ipcol.netmask`, `ipcol.identity`, `ipcol.broadcast`, `ipcol.range`, `ipcol.hostcount`, `ipcol.address_count`
- `ipcol.rfc_type` (classification: e.g. `global`, `private`, `documentation`, `shared`, `link-local`, ...)
- `ipcol.ipinfo.country`
- `ipcol.ipinfo.data.<any_json_field>`
- `ipcol.asn.asn`, `ipcol.asn.name`, `ipcol.asn.country`
- `ipcol.asn.data.<any_json_field>`
- `ipcol.vt.verdict` (e.g. `"3/94"`), `ipcol.vt.score`, `ipcol.vt.malicious`, `ipcol.vt.total`, `ipcol.vt.category` (alias: `ipcol.vt.type`)
- `ipcol.vt.data.<any_json_field>`
- `ipcol.geo.*` (best-available geo: prefers MaxMind mmdb, else free HTTP providers)
- `ipcol.maxmind.*` (offline-only MaxMind lookup; empty if no mmdb)

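A classification like `ipcol.rfc_type` can be approximated with a lookup table of well-known reserved ranges. The sketch below is an assumption about the approach, not the plugin's table: it covers only the five labels named in the list above, handles single addresses only, maps IPv6 unique-local space to `private` by choice, and returns the placeholder label `other` for reserved ranges it does not list.

```python
import ipaddress

# Labels come from the README list; ranges are the well-known reserved blocks.
_RANGES = [
    ("documentation", ("192.0.2.0/24", "198.51.100.0/24", "203.0.113.0/24", "2001:db8::/32")),
    ("shared", ("100.64.0.0/10",)),
    ("link-local", ("169.254.0.0/16", "fe80::/10")),
    ("private", ("10.0.0.0/8", "172.16.0.0/12", "192.168.0.0/16", "fc00::/7")),
]
_TABLE = [(label, ipaddress.ip_network(cidr)) for label, cidrs in _RANGES for cidr in cidrs]


def rfc_type(value: str) -> str:
    """Classify a single IP address against the table above."""
    addr = ipaddress.ip_address(value)
    for label, net in _TABLE:
        if addr.version == net.version and addr in net:
            return label
    # "other" is a placeholder for reserved ranges this sketch does not list.
    return "global" if addr.is_global else "other"
```

An explicit table keeps the labels stable across Python versions, since the standard library's `is_private` and `is_global` properties have changed their coverage over time.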
Type shortcuts on table-like sheets:
- `;i` -> `type-ip`
- `;d` -> `type-domain`
- `;u` -> `type-url-ioc`
- `;h` -> `type-hash`

#### Caching

All lookup providers cache results in a local sqlite+pickle DB (default `~/.visidata_cache.db`).

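A sqlite+pickle cache with expiry can be quite small. The sketch below is illustrative rather than the plugin's implementation: `LookupCache` and `cached_lookup` are hypothetical names, the table layout is an assumption, and the default TTLs simply reuse the values of `tke_lookup_cache_ttl` (86400) and `tke_lookup_error_ttl` (300), on the assumption that failed lookups are cached for the shorter period.

```python
import pickle
import sqlite3
import time


class LookupCache:
    """Tiny sqlite+pickle cache with per-entry expiry (illustrative only)."""

    def __init__(self, path=":memory:", clock=time.time):
        self.clock = clock
        self.db = sqlite3.connect(path)
        self.db.execute(
            "CREATE TABLE IF NOT EXISTS cache "
            "(key TEXT PRIMARY KEY, expires REAL, value BLOB)"
        )

    def get(self, key):
        row = self.db.execute(
            "SELECT expires, value FROM cache WHERE key = ?", (key,)
        ).fetchone()
        if row is None or row[0] <= self.clock():
            return None  # missing or expired
        # Only unpickle data this process wrote itself; pickle is not safe
        # for untrusted input.
        return pickle.loads(row[1])

    def set(self, key, value, ttl):
        self.db.execute(
            "INSERT OR REPLACE INTO cache VALUES (?, ?, ?)",
            (key, self.clock() + ttl, pickle.dumps(value)),
        )
        self.db.commit()


def cached_lookup(cache, key, fetch, ok_ttl=86400, error_ttl=300):
    """Return a cached result, or call fetch(key) and cache what happened."""
    hit = cache.get(key)
    if hit is not None:
        return hit
    try:
        result, ttl = {"ok": True, "data": fetch(key)}, ok_ttl
    except Exception as exc:
        result, ttl = {"ok": False, "error": str(exc)}, error_ttl
    cache.set(key, result, ttl)
    return result
```

Caching errors for a short time avoids hammering a rate-limited provider while still retrying soon after a transient failure.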
#### Lookup Providers + Keys

Options (set in `config.py` / `visidatarc`):
- `options.tke_cache_db_path="~/.visidata_cache.db"`
- `options.tke_lookup_cache_ttl=86400`
- `options.tke_lookup_error_ttl=300`
- `options.tke_lookup_timeout=10`
- `options.tke_ipinfo_token="..."` (optional; ipinfo can work without it)
- `options.tke_ipapi_key="..."` (optional)
- `options.tke_vt_api_key="..."` (required for VT lookups unless using `~/.virustotal_api_key`)
- `options.tke_maxmind_mmdb_path="/path/to/GeoLite2-City.mmdb"` (optional)

Env var equivalents:
- `IPINFO_TOKEN`, `IPAPI_KEY`
- `VT_API_KEY` or `VIRUSTOTAL_API_KEY` (also supports `~/.virustotal_api_key`)
- `MAXMIND_MMDB_PATH` or `GEOIP_MMDB_PATH`

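The VirusTotal key can come from three places: the option, an environment variable, or `~/.virustotal_api_key`. The sketch below shows one plausible resolution order (option first, then environment, then file). That precedence is an assumption, since the README does not state it, and `resolve_vt_api_key` is a hypothetical helper.

```python
import os
from pathlib import Path


def resolve_vt_api_key(option_value="", env=None, key_file="~/.virustotal_api_key"):
    """Return the first VirusTotal API key found, or "" if none is configured."""
    env = os.environ if env is None else env
    if option_value:
        return option_value.strip()  # e.g. the value of options.tke_vt_api_key
    for name in ("VT_API_KEY", "VIRUSTOTAL_API_KEY"):
        if env.get(name):
            return env[name].strip()
    path = Path(key_file).expanduser()
    if path.is_file():
        return path.read_text(encoding="utf-8").strip()
    return ""
```

Returning an empty string instead of raising lets the caller decide whether a missing key should disable VT columns or surface an error.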
MaxMind (offline “free” GeoLite2) support:
- Place a `GeoLite2-City.mmdb` / `GeoLite2-Country.mmdb` file in `$VD_DIR/`, or set `options.tke_maxmind_mmdb_path`.

### `plugins/iplib.py`

Pure-Python library used by `iptype.py` for:
- Normalized info classes (`IPInfo`, `ASNInfo`, `VTInfo`, `GeoInfo`)
- `JSONNode` wrapper (`.data.<field>`) for safe attribute-style access into raw dict/list JSON
- Parsing/normalization helpers for each provider’s response shape

This file intentionally does **not** import VisiData so it can be validated outside the VisiData runtime.

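"Safe attribute-style access" means that a chain such as `.data.a.b.c` never raises, even when a key is missing. The class below is a minimal sketch of that idea and not the real `JSONNode`: in particular, whether the real wrapper returns wrapped nodes or raw values for leaves is unknown, and `unwrap()` is a hypothetical method name.

```python
class JSONNode:
    """Attribute-style, never-raising view over parsed JSON (illustrative)."""

    def __init__(self, value=None):
        self._value = value

    def __getattr__(self, name):
        # Called only when normal lookup fails, i.e. for JSON keys.
        if name.startswith("_"):
            raise AttributeError(name)  # keep copy/pickle machinery working
        if isinstance(self._value, dict):
            return JSONNode(self._value.get(name))
        return JSONNode(None)

    def __getitem__(self, index):
        try:
            return JSONNode(self._value[index])
        except (KeyError, IndexError, TypeError):
            return JSONNode(None)

    def __bool__(self):
        return bool(self._value)

    def unwrap(self):
        """Return the underlying Python value (None when the path was missing)."""
        return self._value


node = JSONNode({"country": "US", "asn": {"name": "Example"}, "tags": ["a", "b"]})
```

With this shape, `node.asn.name.unwrap()` yields the stored value and `node.missing.deeper.unwrap()` yields `None` instead of raising, which suits expressions evaluated once per row.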
### VT schema (`*.vt`)

`ip.vt`, `domain.vt`, `url.vt`, and `hash.vt` expose a normalized shape for quick querying across free + premium responses, while still preserving full raw JSON:

Common fields:
- `verdict` (`"malicious/total"`)
- `score` / `confidence` (`malicious/total` float)
- `malicious`, `suspicious`, `harmless`, `undetected`, `timeout`, `total`
- `category` / `categories`
- `reputation`, `votes_harmless`, `votes_malicious`
- `tags`, `last_analysis_date`, `last_modification_date`
- `results` (normalized engine results map), `stats`, `data` (full raw API response)

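The `verdict` and `score` fields follow directly from the per-category counts. The sketch below derives them from a stats dict; it assumes `total` is the sum of the five categories listed above, which the README does not confirm, and `summarize_vt_stats` is a hypothetical name.

```python
def summarize_vt_stats(stats: dict) -> dict:
    """Derive total, verdict, and score from per-category engine counts."""
    keys = ("malicious", "suspicious", "harmless", "undetected", "timeout")
    counts = {key: int(stats.get(key) or 0) for key in keys}
    total = sum(counts.values())  # assumption: only these five categories count
    malicious = counts["malicious"]
    return {
        **counts,
        "total": total,
        "verdict": f"{malicious}/{total}",
        "score": malicious / total if total else 0.0,  # avoid dividing by zero
    }


summary = summarize_vt_stats(
    {"malicious": 3, "suspicious": 1, "harmless": 60, "undetected": 30, "timeout": 0}
)
```

For these counts the verdict string is `"3/94"`, matching the format shown for `ipcol.vt.verdict`.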
Object-specific conveniences:
- `ip.vt`: `asn`, `as_owner`, `country`, `continent`, `network`
- `domain.vt`: `ip` (best/last known), `ips` (all extracted A/AAAA)
- `url.vt`: URL-level verdict/score plus direct raw access via `url.vt.attrs.*`
- `hash.vt`: `name` (best malware name), `names` (all extracted names), plus verdict/score

Raw passthrough:
- Any VT `attributes` field is also available via `obj.vt.<attribute_name>` and `obj.vt.attrs.<attribute_name>`.

## Config: `visidatarc`

This repo’s `visidatarc` is intended to be installed as VisiData’s `config.py`:
- `$VD_DIR/config.py` (VisiData 3.3 default)
- and also `~/.visidatarc` as a legacy fallback

It currently contains:
- display/date format options
- a sqlite+pickle caching decorator and a set of general-purpose helpers (aggregators, timestamp parsing, “dirty” JSON parsing, etc.)

### Are the `visidatarc` functions superseded?

Partially:
- The **IP-centric lookups and normalized attribute access** are now primarily handled by `plugins/iptype.py` on typed values (e.g. `ipcol.geo.country_code`).
- Many other helpers in `visidatarc` (aggregators like `avgdiff`, parsing/time conversion helpers, etc.) are still independent and useful.

### Keeping old + new side-by-side (without duplicating code)

Old and new helpers can coexist. The cleanest pattern in VisiData is:
1. Put shared logic into a module under `plugins/` (so it’s on `sys.path` via `$VD_DIR`).
2. In `visidatarc`, import and expose thin wrappers (or just import the module and use `module.func(...)` in expressions).

Concretely:
- `plugins/iplib.py` already holds parsing/normalization shared by the IP type.
- If you have legacy functions in `visidatarc` that overlap with the new IP lookups, refactor those functions into a shared module (e.g. `plugins/lookups.py`) and have both `visidatarc` and `plugins/iptype.py` call into it.

This keeps backward-compatible names available while ensuring caching/auth/provider behavior is implemented in one place.
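The shared-module pattern can be sketched in a single file, with comments marking where each piece would live. Every name here is hypothetical (`lookup_country`, the legacy wrapper `ip_country`, and the toy `IPValue` class), and the provider call is passed in as `fetch` so the example runs without network access.

```python
# --- plugins/lookups.py (hypothetical shared module) ---
_CACHE = {}


def lookup_country(ip, fetch):
    """Single home for caching and provider behavior."""
    if ip not in _CACHE:
        _CACHE[ip] = fetch(ip)
    return _CACHE[ip]


# --- visidatarc: legacy name kept as a thin wrapper ---
def ip_country(ip, fetch):
    return lookup_country(ip, fetch)


# --- plugins/iptype.py: typed value calling the same shared code ---
class IPValue:
    def __init__(self, text, fetch):
        self.text = text
        self._fetch = fetch

    def country(self):
        return lookup_country(self.text, self._fetch)
```

Because both entry points delegate to `lookup_country`, a value fetched through the legacy function is already cached when the typed attribute asks for it, and any change to caching or authentication happens in one place.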