diff --git a/README.md b/README.md
index 556bed1..a0206dc 100644
--- a/README.md
+++ b/README.md
@@ -7,8 +7,11 @@
 > Validate the signal, protect the evidence, route only what each destination is
 > authorized to receive, and prove every external action through an immutable ledger.
 
-Defensive cyber-threat-intelligence routing & evidence-sealing platform.
-Built as a 48h hackathon project on 2026-05-13. Active development.
+Defensive cyber-threat-intelligence routing & evidence-sealing platform — a
+small-worker mesh that ingests public threat feeds, classifies and seals cases,
+routes them to the right destinations under TLP policy, and proves every action
+through an append-only ledger. Started as a 48h hackathon (2026-05); grown into
+a working platform with a fine-tuned model in operation.
 
 ---
 
@@ -16,25 +19,25 @@ Built as a 48h hackathon project on 2026-05-13. Active development.
 
 ```text
 Sensors
-→ Scoutline     fetch, parse, dedup, signal
-→ Proofline     validate indicators, score confidence
-→ Mapline       resolve victim, actor, jurisdiction, CERT route
-→ Classifyline  severity, TLP, incident type, internal class
-→ Sealine       authority-sealed evidence encryption
-→ Routeline     pick destinations, build payloads, submit
-→ Ledgerline    immutable audit, receipts, outcomes
-→ Publishline   sanitized public intelligence after mitigation
-→ Trainline     lawful intel → LoRA-ready training data
-→ Cockpit       operator UI (FastAPI + Jinja)
+→ Scoutline     fetch + parse public feeds, emit normalized cases  [built]
+→ Proofline     validate indicators, score confidence  [planned]
+→ Mapline       resolve hosting country / jurisdiction  [built]
+→ Classifyline  severity, TLP, incident type, internal class  [built]
+→ Sealine       authority-sealed evidence encryption  [built]
+→ Routeline     pick destinations under policy, build payloads  [built]
+→ Courier       submit to destinations, collect receipts  [built]
+→ Ledgerline    immutable audit of every submission + blocked route  [built]
+→ Publishline   sanitized public intelligence after mitigation  [planned]
+→ Trainline     lawful intel → LoRA datasets + QLoRA training  [built]
+→ Cockpit       operator UI (FastAPI + Jinja)  [built]
 ```
 
-Each `-line` is a stage in a small-worker mesh; each worker performs one
-narrow job and passes a normalized `Case` object to the next stage. Heavy
-models are reserved for judgment-heavy tasks. Humans approve everything
-sensitive before it leaves the platform.
+Each `-line` is a stage in a small-worker mesh; each worker does one narrow job
+and passes a normalized `Case` object onward. Rules drive the deterministic
+work; a fine-tuned model handles judgment (see Training). Humans approve
+anything sensitive before it leaves the platform.
 
-Full architecture: [`docs/dossier.md`](docs/dossier.md) — consolidated read of
-the original individual records (still in [`docs/archive/`](docs/archive/)).
+Full design: [`docs/dossier.md`](docs/dossier.md) · style: [`docs/style.md`](docs/style.md) · demo run-sheet: [`docs/demo.md`](docs/demo.md)
 
 ---
 
@@ -44,127 +47,136 @@ the original individual records (still in [`docs/archive/`](docs/archive/)).
 python3 -m virtualenv .venv
 .venv/bin/pip install -e .
 
-.venv/bin/psyc init               # create the sqlite db
-.venv/bin/psyc fetch-all          # ingest URLhaus + CISA KEV + Feodo Tracker
-.venv/bin/psyc serve --port 8767  # cockpit at http://127.0.0.1:8767
-.venv/bin/psyc status             # count of ingested cases
+.venv/bin/psyc init       # create the sqlite db
+.venv/bin/psyc fetch-all  # ingest URLhaus + CISA KEV + Feodo Tracker
+.venv/bin/psyc demo       # run one case through the whole pipeline
 ```
 
+The platform runs as up to three services (each in its own terminal):
+
+```bash
+.venv/bin/psyc serve --port 8767      # operator cockpit → http://127.0.0.1:8767
+.venv/bin/psyc mock-cert --port 8770  # stand-in CERT / abuse-API receiver
+
+# optional, needs an NVIDIA GPU — puts the live model behind the Classifier bot:
+docker run --gpus all --rm -p 8771:8771 --entrypoint python \
+  -v $(pwd)/data:/data -v $(pwd)/scripts:/scripts \
+  psyc-trainer /scripts/serve_model.py --adapter /data/adapters/psyc-v4/final
+```
+
+---
+
+## Cockpit
+
+`http://127.0.0.1:8767` — five views:
+
+| View | Path | Shows |
+|---|---|---|
+| Case Queue | `/cases` | every ingested case, severity + TLP badges |
+| Case detail | `/cases/{id}` | classification, observables, sealed package, routes, per-case ledger |
+| Worker Mesh | `/cases/{id}/journey` | animated 7-bot replay of the case's path; the Classifier bot shows the live model's verdict |
+| Ledger | `/ledger` | immutable audit feed |
+| Trainline | `/train` | datasets + trained adapters with loss charts |
+
 ---
 
 ## Code layout
 
 ```
 src/psyc/
-  models.py        # normalized Case object (Pydantic)
-  db.py            # SQLAlchemy Core; cases + ledger tables
-  result.py        # Ok / Err / Result[T, E]
-  log.py           # structlog configuration
-  cli.py           # flat Typer commands
-  lines/           # one file per worker line
-    scout.py       # Fetcher + Signalizer (URLhaus today)
-  cockpit/         # FastAPI + Jinja operator UI
-    app.py
-    templates/
-    static/
+  models.py        normalized Case object + enums (Pydantic)
+  db.py            SQLAlchemy Core — cases + ledger tables
+  result.py        Ok / Err / Result[T, E]
+  log.py           structlog configuration
+  cli.py           flat Typer CLI
+  mock_cert.py     stand-in CERT / abuse-API receiver
+  lines/           one file per worker line
+    scout.py       multi-source fetch + signalize (URLhaus, CISA KEV, Feodo)
+    classify.py    severity / TLP / incident type / internal class
+    map.py         GeoResolver — host IP → country
+    seal.py        PyNaCl sealed-box evidence encryption
+    route.py       destination matrix + policy gates
+    courier.py     HTTP submission + payload building
+    ledger.py      append-only audit
+    train.py       JSONL dataset builders + quality gate
+  cockpit/         FastAPI + Jinja operator UI
+    app.py         routes
+    journey.py     Worker Mesh / case-journey assembly
+    inference.py   client for the live model server
+    templates/  static/
+
+scripts/
+  train_qlora.py   unsloth QLoRA fine-tune
+  eval_adapter.py  adapter evaluation
+  serve_model.py   inference server (FastAPI, runs in the CUDA container)
 
 docs/
-  dossier.md       # full architecture (consolidated)
-  style.md         # 12-fold Python style guide
-  archive/         # original architecture docs + logo variants
+  dossier.md  style.md  demo.md  archive/
 ```
 
 ---
 
+## Training & the live model (Trainline + QLoRA)
+
+`psyc train-build-all` emits Alpaca-style JSONL datasets under
+`data/datasets/-v.jsonl` for four defensive tasks — `ioc_extraction`,
+`severity_classification`, `routing_decision`, `tlp_assignment`. QualityGate
+drops TLP:RED, restricted-source, empty, and credential-leak rows.
+
+Fine-tune Qwen3.5-4B with QLoRA in the CUDA container:
+
+```bash
+docker build -t psyc-trainer -f Dockerfile.train .
+
+docker run --gpus all --rm --entrypoint python \
+  -v $(pwd)/data:/data -v $(pwd)/scripts:/scripts \
+  psyc-trainer /scripts/train_qlora.py \
+  --dataset /data/datasets/ioc_extraction-v4.jsonl \
+  --dataset /data/datasets/severity_classification-v4.jsonl \
+  --dataset /data/datasets/routing_decision-v4.jsonl \
+  --dataset /data/datasets/tlp_assignment-v4.jsonl \
+  --output /data/adapters/psyc-v4
+```
+
+Defaults target a 24 GB GPU (3090/4090): `unsloth/Qwen3.5-4B` at 4-bit, LoRA
+r=16, bf16, 3 epochs. Output: `data/adapters//final/` + `training_meta.json`.
+Evaluate with `scripts/eval_adapter.py`; the `/train` cockpit page shows every
+dataset and adapter with its loss curve.
+
+`scripts/serve_model.py` loads an adapter and serves `/infer` over HTTP. When
+it's running, the cockpit's **Classifier bot** shows the live model's severity
+verdict beside the rule's — and degrades to rules-only if the server is down.
+
+---
+
 ## Style
 
-All code follows [`docs/style.md`](docs/style.md): `Optional[X]` / `List[X]`
-from `typing`, `Field(default_factory=...)` for Pydantic mutables, `Result[T, E]`
-types for expected failures (`raise` reserved for true exceptions), `class X(str, Enum)`
-for closed string sets, structlog with `area.action` event names, SQLAlchemy Core
-(no ORM), flat Typer commands with hyphenated names. Ruff config in `pyproject.toml`
-enforces the bits a linter can check; `UP006`/`UP007`/`UP035` are disabled so the
-typing-import rules stand.
+All code follows [`docs/style.md`](docs/style.md) — a 12-fold guide: `Optional[X]`
+/ `List[X]` from `typing`, `Field(default_factory=...)`, `Result[T, E]` for
+expected failures, `class X(str, Enum)`, structlog `area.action` events,
+SQLAlchemy Core (no ORM), flat hyphenated Typer commands.
 
 ---
 
 ## Scope
 
 **Lawful, white-hat defensive operations only.** psyc routes intelligence to
-victims, CERT/CSIRTs, sector ISACs, provider/registrar abuse desks, and
-trusted CTI communities. It will **not**:
-
-- amplify stolen data
-- expose victims prematurely
-- interact with criminal actors
-- distribute exploitation content
-- submit evidence that exceeds a destination's max TLP
-
-The boundaries are defined in `docs/dossier.md` §5 *Destination Minimization*,
-§10 *TLP Enforcement*, and §16 *Public Reporting Rules*. The Ledger records
-every external submission and destructive action; sensitive evidence is
-encrypted to authorized recipients via Sealine before any routing decision.
+victims, CERT/CSIRTs, sector ISACs, provider/registrar abuse desks, and trusted
+CTI communities. It will **not** amplify stolen data, expose victims
+prematurely, interact with criminal actors, distribute exploitation content, or
+submit evidence beyond a destination's max TLP. Boundaries: `docs/dossier.md`
+§5, §10, §16.
 
 ---
 
-## Training (Trainline + QLoRA)
-
-`psyc train-build-all` emits Alpaca-style JSONL datasets under
-`data/datasets/-v.jsonl` for four defensive tasks: `ioc_extraction`,
-`severity_classification`, `routing_decision`, `tlp_assignment`. QualityGate
-drops `TLP:RED`, restricted sources, empty/oversize, and credential-leak rows
-per the dossier's training-data policy.
-
-To fine-tune Qwen3.5-4B with QLoRA in an NVIDIA Docker container:
-
-```bash
-# 1. build datasets (one-off; re-run after ingestion changes)
-.venv/bin/psyc train-build-all
-
-# 2. build the training image (pytorch 2.6/CUDA 12.4 base + unsloth + Qwen3.5)
-docker build -t psyc-trainer -f Dockerfile.train .
-
-# 3. fine-tune — scripts/ + data/ are mounted, so script edits need no rebuild
-docker run --gpus all --rm --entrypoint python \
-  -v $(pwd)/data:/data -v $(pwd)/scripts:/scripts \
-  psyc-trainer /scripts/train_qlora.py \
-  --dataset /data/datasets/ioc_extraction-v2.jsonl \
-  --dataset /data/datasets/severity_classification-v2.jsonl \
-  --dataset /data/datasets/routing_decision-v2.jsonl \
-  --dataset /data/datasets/tlp_assignment-v2.jsonl \
-  --output /data/adapters/psyc-v2
-```
-
-Defaults target a 24 GB consumer GPU (3090/4090): `unsloth/Qwen3.5-4B` at 4-bit,
-LoRA `r=16`/`alpha=16`, bf16, 3 epochs, effective batch size 8. For A100-40/80
-bump `--base-model unsloth/Qwen3.5-9B` and raise `--batch-size` +
-`--max-seq-length`.
-
-Output: `data/adapters/psyc-v1/final/` (adapter weights) + `training_meta.json`
-(base model, hyperparameters, dataset list).
-
-Evaluate the adapter against held-out dataset rows:
-
-```bash
-docker run --gpus all --rm \
-  --entrypoint python \
-  -v $(pwd)/data:/data -v $(pwd)/scripts:/scripts \
-  psyc-trainer /scripts/eval_adapter.py \
-  --adapter /data/adapters/psyc-v2/final \
-  --dataset /data/datasets/ioc_extraction-v2.jsonl --n 5
-```
-
-The cockpit `/train` page lists every built dataset and trained adapter with
-its base model, hyperparameters, dataset provenance, and a per-step loss chart.
-
 ## Status
 
-Day 2 of a 48h build. Shipped: Scoutline (URLhaus) → Classifyline → Mapline
-(GeoResolver via ip-api.com) → Sealine (PyNaCl sealed boxes) → Routeline →
-Courier → mock CERT → Ledgerline. Cockpit has cases / case detail / ledger
-pages and a design-token CSS layer. Trainline emits LoRA-ready JSONL;
-`Dockerfile.train` builds an unsloth + Qwen3.5 QLoRA training container.
+Working platform. Built: Scoutline (URLhaus + CISA KEV + Feodo Tracker) →
+Classifyline → Mapline → Sealine → Routeline → Courier → Ledgerline → Trainline,
+the FastAPI cockpit (five views incl. the animated Worker Mesh), and a
+fine-tuned Qwen3.5-4B (psyc-v4) served live behind the Classifier bot.
+Not yet built: Proofline (confidence scoring), Publishline (public advisories).
 
 ## License
 
-Unset for the hackathon. Choose before any external release.
+Unset. Choose before any external release.
diff --git a/docs/demo.md b/docs/demo.md
new file mode 100644
index 0000000..b8e5e66
--- /dev/null
+++ b/docs/demo.md
@@ -0,0 +1,65 @@
+# psyc — demo run-sheet
+
+A ~5-minute walk-through of the platform; ~10 min including setup.
+
+## 0. Setup (once)
+
+```bash
+python3 -m virtualenv .venv
+.venv/bin/pip install -e .
+.venv/bin/psyc init
+```
+
+## 1. Start the services
+
+Separate terminals — the third is optional and needs an NVIDIA GPU:
+
+```bash
+# terminal 1 — operator cockpit
+.venv/bin/psyc serve --port 8767
+
+# terminal 2 — stand-in CERT / abuse-API receiver
+.venv/bin/psyc mock-cert --port 8770
+
+# terminal 3 — live model behind the Classifier bot (optional)
+docker run --gpus all --rm -p 8771:8771 --entrypoint python \
+  -v $(pwd)/data:/data -v $(pwd)/scripts:/scripts \
+  psyc-trainer /scripts/serve_model.py --adapter /data/adapters/psyc-v4/final
+```
+
+## 2. Run the pipeline
+
+```bash
+.venv/bin/psyc fetch-all  # ingest URLhaus + CISA KEV + Feodo Tracker
+.venv/bin/psyc demo       # one case end-to-end; prints the cockpit links
+```
+
+## 3. The walk-through
+
+1. **Case Queue** — http://127.0.0.1:8767/cases
+   30+ cases across three feeds, with severity + TLP badges. *"Three sources,
+   one normalized case object."*
+
+2. **Worker Mesh** — open the journey link `psyc demo` printed. This is the
+   centerpiece: seven robot agents, a case token flowing through, each bot
+   waking to perform its action and speak its real answer. Hit **▶ replay**.
+   - **Classifier bot** carries a live verdict from the fine-tuned psyc-v4
+     model — green when the model agrees with the rule, amber when it differs.
+   - **Sealer** — evidence encrypted to authority public keys (PyNaCl sealed box).
+   - **Router** — destinations cleared vs. policy-blocked (TLP ceiling, country).
+
+3. **Ledger** — http://127.0.0.1:8767/ledger
+   Every submission and every blocked route, immutably recorded.
+
+4. **Trainline** — http://127.0.0.1:8767/train
+   The four task datasets and the trained adapters with their loss curves.
+
+## Talking points
+
+- **Defensive only** — psyc never amplifies stolen data or contacts criminal
+  actors; routing is gated by TLP, jurisdiction, and incident type.
+- **Rules + model** — deterministic work is rule-based; the fine-tuned model
+  handles judgment. One bot is genuinely a live model, not animation over rules.
+- **Honest about limits** — psyc-v4 evals 7/8 on severity; the one miss is a
+  documented data-scarcity case (one online-botnet example), not a bug, and was
+  not gamed away.
diff --git a/src/psyc/cli.py b/src/psyc/cli.py
index 899724e..d978e6c 100644
--- a/src/psyc/cli.py
+++ b/src/psyc/cli.py
@@ -8,6 +8,7 @@ import typer
 import uvicorn
 
 from psyc import db, log
+from psyc.cockpit import inference
 from psyc.lines import classify, courier, route, scout, seal, train
 from psyc.lines import map as map_line
 from psyc.models import Outcome
@@ -315,8 +316,15 @@ def demo() -> None:
     for b in blocked:
         typer.echo(f" ⊘ {b.destination_name}: {b.reason}")
     typer.echo("")
-    typer.echo(f"inspect: http://127.0.0.1:8767/cases/{case.case_id}")
-    typer.echo(f"ledger: http://127.0.0.1:8767/ledger")
+    typer.echo("── see it in the cockpit ──")
+    typer.echo(f" Worker Mesh: http://127.0.0.1:8767/cases/{case.case_id}/journey")
+    typer.echo(f" Case detail: http://127.0.0.1:8767/cases/{case.case_id}")
+    typer.echo(f" Ledger: http://127.0.0.1:8767/ledger")
+    adapter = inference.server_adapter()
+    if adapter:
+        typer.echo(f" Live model: up ({adapter}) — the Classifier bot shows its verdict")
+    else:
+        typer.echo(" Live model: inference server offline — Classifier bot falls back to rules")
 
 
 @app.command("serve")