stage-7: demo polish — mesh-aware demo command, current README, run-sheet
psyc demo now closes with cockpit links pointing at the Worker Mesh and reports whether the live model server is up. README rewritten to current state — Worker Mesh, inference server, model-in-operation, the three services, accurate code layout. Adds docs/demo.md, a one-page run-sheet.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
236
README.md
@@ -7,8 +7,11 @@
> Validate the signal, protect the evidence, route only what each destination is
> authorized to receive, and prove every external action through an immutable ledger.

Defensive cyber-threat-intelligence routing & evidence-sealing platform.
Built as a 48h hackathon project on 2026-05-13. Active development.
Defensive cyber-threat-intelligence routing & evidence-sealing platform — a
small-worker mesh that ingests public threat feeds, classifies and seals cases,
routes them to the right destinations under TLP policy, and proves every action
through an append-only ledger. Started as a 48h hackathon (2026-05); grown into
a working platform with a fine-tuned model in operation.

---

@@ -16,25 +19,25 @@ Built as a 48h hackathon project on 2026-05-13. Active development.

```text
Sensors
  → Scoutline      fetch, parse, dedup, signal
  → Proofline      validate indicators, score confidence
  → Mapline        resolve victim, actor, jurisdiction, CERT route
  → Classifyline   severity, TLP, incident type, internal class
  → Sealine        authority-sealed evidence encryption
  → Routeline      pick destinations, build payloads, submit
  → Ledgerline     immutable audit, receipts, outcomes
  → Publishline    sanitized public intelligence after mitigation
  → Trainline      lawful intel → LoRA-ready training data
  → Cockpit        operator UI (FastAPI + Jinja)
  → Scoutline      fetch + parse public feeds, emit normalized cases        [built]
  → Proofline      validate indicators, score confidence                    [planned]
  → Mapline        resolve hosting country / jurisdiction                   [built]
  → Classifyline   severity, TLP, incident type, internal class             [built]
  → Sealine        authority-sealed evidence encryption                     [built]
  → Routeline      pick destinations under policy, build payloads           [built]
  → Courier        submit to destinations, collect receipts                 [built]
  → Ledgerline     immutable audit of every submission + blocked route      [built]
  → Publishline    sanitized public intelligence after mitigation           [planned]
  → Trainline      lawful intel → LoRA datasets + QLoRA training            [built]
  → Cockpit        operator UI (FastAPI + Jinja)                            [built]
```

Each `-line` is a stage in a small-worker mesh; each worker performs one
narrow job and passes a normalized `Case` object to the next stage. Heavy
models are reserved for judgment-heavy tasks. Humans approve everything
sensitive before it leaves the platform.
Each `-line` is a stage in a small-worker mesh; each worker does one narrow job
and passes a normalized `Case` object onward. Rules drive the deterministic
work; a fine-tuned model handles judgment (see Training). Humans approve
anything sensitive before it leaves the platform.
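
The hand-off described above can be pictured with a minimal sketch. It is an illustration under assumptions, not the code in `src/psyc/`: the `Case` fields, the worker functions, and `run_mesh` are hypothetical, and a stdlib dataclass stands in for the Pydantic model the README mentions.

```python
from dataclasses import dataclass, field, replace
from typing import Callable, List, Optional


@dataclass(frozen=True)
class Case:
    """Hypothetical stand-in for the normalized Case object."""
    case_id: str
    indicator: str
    severity: Optional[str] = None
    tlp: Optional[str] = None
    trail: List[str] = field(default_factory=list)


# A worker does one narrow job: it takes a Case and returns an updated copy.
Worker = Callable[[Case], Case]


def classify(case: Case) -> Case:
    # Toy rule (assumption): URL indicators are treated as high severity.
    severity = "high" if case.indicator.startswith("http") else "low"
    return replace(case, severity=severity, tlp="TLP:AMBER",
                   trail=case.trail + ["classify"])


def seal(case: Case) -> Case:
    # Placeholder for evidence sealing; here it only records the step.
    return replace(case, trail=case.trail + ["seal"])


def run_mesh(case: Case, workers: List[Worker]) -> Case:
    """Pass the case through each worker in order."""
    for worker in workers:
        case = worker(case)
    return case


result = run_mesh(Case("c-1", "http://example.invalid/payload"), [classify, seal])
```

Because every worker returns a new `Case` rather than mutating shared state, each stage can be tested on its own with a hand-built input.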

Full architecture: [`docs/dossier.md`](docs/dossier.md) — consolidated read of
the original individual records (still in [`docs/archive/`](docs/archive/)).
Full design: [`docs/dossier.md`](docs/dossier.md) · style: [`docs/style.md`](docs/style.md) · demo run-sheet: [`docs/demo.md`](docs/demo.md)

---

@@ -44,127 +47,136 @@ the original individual records (still in [`docs/archive/`](docs/archive/)).
python3 -m virtualenv .venv
.venv/bin/pip install -e .

.venv/bin/psyc init                 # create the sqlite db
.venv/bin/psyc fetch-all            # ingest URLhaus + CISA KEV + Feodo Tracker
.venv/bin/psyc serve --port 8767    # cockpit at http://127.0.0.1:8767
.venv/bin/psyc status               # count of ingested cases
.venv/bin/psyc init        # create the sqlite db
.venv/bin/psyc fetch-all   # ingest URLhaus + CISA KEV + Feodo Tracker
.venv/bin/psyc demo        # run one case through the whole pipeline
```

The platform runs as up to three services (each in its own terminal):

```bash
.venv/bin/psyc serve --port 8767        # operator cockpit → http://127.0.0.1:8767
.venv/bin/psyc mock-cert --port 8770    # stand-in CERT / abuse-API receiver

# optional, needs an NVIDIA GPU — puts the live model behind the Classifier bot:
docker run --gpus all --rm -p 8771:8771 --entrypoint python \
  -v $(pwd)/data:/data -v $(pwd)/scripts:/scripts \
  psyc-trainer /scripts/serve_model.py --adapter /data/adapters/psyc-v4/final
```

---

## Cockpit

`http://127.0.0.1:8767` — five views:

| View | Path | Shows |
|---|---|---|
| Case Queue | `/cases` | every ingested case, severity + TLP badges |
| Case detail | `/cases/{id}` | classification, observables, sealed package, routes, per-case ledger |
| Worker Mesh | `/cases/{id}/journey` | animated 7-bot replay of the case's path; the Classifier bot shows the live model's verdict |
| Ledger | `/ledger` | immutable audit feed |
| Trainline | `/train` | datasets + trained adapters with loss charts |

---

## Code layout

```
src/psyc/
  models.py        # normalized Case object (Pydantic)
  db.py            # SQLAlchemy Core; cases + ledger tables
  result.py        # Ok / Err / Result[T, E]
  log.py           # structlog configuration
  cli.py           # flat Typer commands
  lines/           # one file per worker line
    scout.py       # Fetcher + Signalizer (URLhaus today)
  cockpit/         # FastAPI + Jinja operator UI
    app.py
    templates/
    static/
  models.py        normalized Case object + enums (Pydantic)
  db.py            SQLAlchemy Core — cases + ledger tables
  result.py        Ok / Err / Result[T, E]
  log.py           structlog configuration
  cli.py           flat Typer CLI
  mock_cert.py     stand-in CERT / abuse-API receiver
  lines/           one file per worker line
    scout.py       multi-source fetch + signalize (URLhaus, CISA KEV, Feodo)
    classify.py    severity / TLP / incident type / internal class
    map.py         GeoResolver — host IP → country
    seal.py        PyNaCl sealed-box evidence encryption
    route.py       destination matrix + policy gates
    courier.py     HTTP submission + payload building
    ledger.py      append-only audit
    train.py       JSONL dataset builders + quality gate
  cockpit/         FastAPI + Jinja operator UI
    app.py         routes
    journey.py     Worker Mesh / case-journey assembly
    inference.py   client for the live model server
    templates/  static/

scripts/
  train_qlora.py   unsloth QLoRA fine-tune
  eval_adapter.py  adapter evaluation
  serve_model.py   inference server (FastAPI, runs in the CUDA container)

docs/
  dossier.md       # full architecture (consolidated)
  style.md         # 12-fold Python style guide
  archive/         # original architecture docs + logo variants
  dossier.md  style.md  demo.md  archive/
```

---

## Training & the live model (Trainline + QLoRA)

`psyc train-build-all` emits Alpaca-style JSONL datasets under
`data/datasets/<task>-v<n>.jsonl` for four defensive tasks — `ioc_extraction`,
`severity_classification`, `routing_decision`, `tlp_assignment`. QualityGate
drops TLP:RED, restricted-source, empty, and credential-leak rows.
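
As a rough illustration of what such a gate could look like, here is a minimal sketch. It is not the `QualityGate` in `lines/train.py`: the row metadata keys (`tlp`, `restricted_source`), the credential regex, and the function names are all assumptions; only the four drop reasons and the Alpaca-style `instruction` / `input` / `output` shape come from the text above.

```python
import json
import re
from typing import Any, Dict, List, Optional

# Assumed heuristic for credential leaks; a real gate would likely be stricter.
CREDENTIAL_PATTERN = re.compile(
    r"(password|passwd|api[_-]?key|secret)\s*[:=]", re.IGNORECASE
)


def rejection_reason(row: Dict[str, Any]) -> Optional[str]:
    """Return why a row is dropped, or None if it passes the gate."""
    if row.get("tlp") == "TLP:RED":
        return "tlp_red"
    if row.get("restricted_source"):
        return "restricted_source"
    instruction = str(row.get("instruction", "")).strip()
    output = str(row.get("output", "")).strip()
    if not instruction or not output:
        return "empty"
    text = f"{row.get('input', '')}\n{output}"
    if CREDENTIAL_PATTERN.search(text):
        return "credential_leak"
    return None


def gate(rows: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Keep only rows that pass every check."""
    return [row for row in rows if rejection_reason(row) is None]


def to_jsonl(rows: List[Dict[str, Any]]) -> str:
    """Serialize kept rows as Alpaca-style JSONL, dropping gate metadata."""
    keys = ("instruction", "input", "output")
    return "\n".join(json.dumps({k: row.get(k, "") for k in keys}) for row in rows)


sample = [
    {"instruction": "Classify severity", "input": "URL hosts a loader",
     "output": "high", "tlp": "TLP:CLEAR"},
    {"instruction": "Classify severity", "input": "internal report",
     "output": "high", "tlp": "TLP:RED"},
    {"instruction": "Extract IOCs", "input": "password=hunter2",
     "output": "none", "tlp": "TLP:GREEN"},
    {"instruction": "Assign TLP", "input": "feed row", "output": "",
     "tlp": "TLP:CLEAR"},
]
kept = gate(sample)
```

Returning a reason string instead of a bare boolean makes it easy to log how many rows each rule removed.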

Fine-tune Qwen3.5-4B with QLoRA in the CUDA container:

```bash
docker build -t psyc-trainer -f Dockerfile.train .

docker run --gpus all --rm --entrypoint python \
  -v $(pwd)/data:/data -v $(pwd)/scripts:/scripts \
  psyc-trainer /scripts/train_qlora.py \
    --dataset /data/datasets/ioc_extraction-v4.jsonl \
    --dataset /data/datasets/severity_classification-v4.jsonl \
    --dataset /data/datasets/routing_decision-v4.jsonl \
    --dataset /data/datasets/tlp_assignment-v4.jsonl \
    --output /data/adapters/psyc-v4
```

Defaults target a 24 GB GPU (3090/4090): `unsloth/Qwen3.5-4B` at 4-bit, LoRA
r=16, bf16, 3 epochs. Output: `data/adapters/<run>/final/` + `training_meta.json`.
Evaluate with `scripts/eval_adapter.py`; the `/train` cockpit page shows every
dataset and adapter with its loss curve.

`scripts/serve_model.py` loads an adapter and serves `/infer` over HTTP. When
it's running, the cockpit's **Classifier bot** shows the live model's severity
verdict beside the rule's — and degrades to rules-only if the server is down.
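
The rules-only fallback can be sketched roughly as follows. This is not the client in `cockpit/inference.py`: the request and response JSON shapes, the toy rule, and every function name here are assumptions made for illustration; only the `/infer` path and the degrade-to-rules behavior come from the text above.

```python
import http.client
import json
import urllib.error
import urllib.request
from typing import Dict, Optional


def rule_severity(case: Dict[str, str]) -> str:
    # Toy deterministic rule (assumption), standing in for the real classifier.
    return "high" if case.get("threat") == "malware_download" else "medium"


def model_severity(case: Dict[str, str], url: str, timeout: float = 2.0) -> Optional[str]:
    """Ask the model server for a verdict; return None on any failure."""
    body = json.dumps({"task": "severity_classification", "case": case}).encode("utf-8")
    request = urllib.request.Request(
        url, data=body, headers={"Content-Type": "application/json"}
    )
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            payload = json.loads(response.read().decode("utf-8"))
    except (urllib.error.URLError, OSError, ValueError, http.client.HTTPException):
        # Server down, timed out, or returned something that is not JSON.
        return None
    verdict = payload.get("severity") if isinstance(payload, dict) else None
    return verdict if isinstance(verdict, str) else None


def classify(case: Dict[str, str], url: str, timeout: float = 2.0) -> Dict[str, Optional[str]]:
    """Always compute the rule verdict; attach the model verdict when available."""
    rule = rule_severity(case)
    model = model_severity(case, url, timeout=timeout)
    mode = "rules+model" if model is not None else "rules-only"
    return {"rule": rule, "model": model, "mode": mode}
```

The rule verdict is computed first and unconditionally, so an unreachable server changes only the `mode` field and never blocks classification.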

---

## Style

All code follows [`docs/style.md`](docs/style.md): `Optional[X]` / `List[X]`
from `typing`, `Field(default_factory=...)` for Pydantic mutables, `Result[T, E]`
types for expected failures (`raise` reserved for true exceptions), `class X(str, Enum)`
for closed string sets, structlog with `area.action` event names, SQLAlchemy Core
(no ORM), flat Typer commands with hyphenated names. Ruff config in `pyproject.toml`
enforces the bits a linter can check; `UP006`/`UP007`/`UP035` are disabled so the
typing-import rules stand.
All code follows [`docs/style.md`](docs/style.md) — a 12-fold guide: `Optional[X]`
/ `List[X]` from `typing`, `Field(default_factory=...)`, `Result[T, E]` for
expected failures, `class X(str, Enum)`, structlog `area.action` events,
SQLAlchemy Core (no ORM), flat hyphenated Typer commands.
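
Two of these conventions, `Result[T, E]` and `class X(str, Enum)`, can be shown together in a short sketch. It is one plausible shape for them, not the contents of `result.py` or `models.py`; the `Tlp` enum members and `parse_tlp` are hypothetical.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Generic, TypeVar, Union

T = TypeVar("T")
E = TypeVar("E")


@dataclass(frozen=True)
class Ok(Generic[T]):
    value: T


@dataclass(frozen=True)
class Err(Generic[E]):
    error: E


# An expected failure is a value the caller must handle, not an exception.
Result = Union[Ok[T], Err[E]]


class Tlp(str, Enum):
    """Closed string set; members compare equal to their string values."""
    CLEAR = "TLP:CLEAR"
    GREEN = "TLP:GREEN"
    AMBER = "TLP:AMBER"
    RED = "TLP:RED"


def parse_tlp(raw: str) -> Result[Tlp, str]:
    """Parse a TLP label, returning Err for unknown labels instead of raising."""
    normalized = raw.strip().upper()
    try:
        return Ok(Tlp(normalized))
    except ValueError:
        return Err(f"unknown TLP label: {raw!r}")


parsed = parse_tlp("tlp:amber")
rejected = parse_tlp("purple")
```

Callers branch with `isinstance(result, Ok)`, which keeps the failure path visible at the call site.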

---

## Scope

**Lawful, white-hat defensive operations only.** psyc routes intelligence to
victims, CERT/CSIRTs, sector ISACs, provider/registrar abuse desks, and
trusted CTI communities. It will **not**:

- amplify stolen data
- expose victims prematurely
- interact with criminal actors
- distribute exploitation content
- submit evidence that exceeds a destination's max TLP

The boundaries are defined in `docs/dossier.md` §5 *Destination Minimization*,
§10 *TLP Enforcement*, and §16 *Public Reporting Rules*. The Ledger records
every external submission and destructive action; sensitive evidence is
encrypted to authorized recipients via Sealine before any routing decision.
victims, CERT/CSIRTs, sector ISACs, provider/registrar abuse desks, and trusted
CTI communities. It will **not** amplify stolen data, expose victims
prematurely, interact with criminal actors, distribute exploitation content, or
submit evidence beyond a destination's max TLP. Boundaries: `docs/dossier.md`
§5, §10, §16.
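
The max-TLP boundary lends itself to a small worked example. The sketch below is an assumption-laden illustration, not the policy gate in `lines/route.py`: the ordering list, the `Destination` type, and `plan_routes` are hypothetical, while the `destination_name` and `reason` fields mirror the attributes the `demo` command prints for blocked routes.

```python
from dataclasses import dataclass
from typing import List, Tuple

# Least to most restrictive; a case may only go where its TLP fits the ceiling.
TLP_ORDER = ["TLP:CLEAR", "TLP:GREEN", "TLP:AMBER", "TLP:RED"]


@dataclass(frozen=True)
class Destination:
    name: str
    max_tlp: str


@dataclass(frozen=True)
class Blocked:
    destination_name: str
    reason: str


def plan_routes(
    case_tlp: str, destinations: List[Destination]
) -> Tuple[List[Destination], List[Blocked]]:
    """Split destinations into cleared and policy-blocked by TLP ceiling."""
    case_rank = TLP_ORDER.index(case_tlp)
    cleared: List[Destination] = []
    blocked: List[Blocked] = []
    for dest in destinations:
        if case_rank <= TLP_ORDER.index(dest.max_tlp):
            cleared.append(dest)
        else:
            blocked.append(
                Blocked(dest.name, f"{case_tlp} exceeds max {dest.max_tlp}")
            )
    return cleared, blocked


cleared, blocked = plan_routes(
    "TLP:AMBER",
    [
        Destination("national-cert", "TLP:RED"),
        Destination("registrar-abuse", "TLP:GREEN"),
        Destination("cti-community", "TLP:AMBER"),
    ],
)
```

Returning the blocked routes alongside the cleared ones lets the caller record both outcomes, which matches the ledger's stated job of logging every blocked route.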

---

## Training (Trainline + QLoRA)

`psyc train-build-all` emits Alpaca-style JSONL datasets under
`data/datasets/<task>-v<n>.jsonl` for four defensive tasks: `ioc_extraction`,
`severity_classification`, `routing_decision`, `tlp_assignment`. QualityGate
drops `TLP:RED`, restricted sources, empty/oversize, and credential-leak rows
per the dossier's training-data policy.

To fine-tune Qwen3.5-4B with QLoRA in an NVIDIA Docker container:

```bash
# 1. build datasets (one-off; re-run after ingestion changes)
.venv/bin/psyc train-build-all

# 2. build the training image (pytorch 2.6/CUDA 12.4 base + unsloth + Qwen3.5)
docker build -t psyc-trainer -f Dockerfile.train .

# 3. fine-tune — scripts/ + data/ are mounted, so script edits need no rebuild
docker run --gpus all --rm --entrypoint python \
  -v $(pwd)/data:/data -v $(pwd)/scripts:/scripts \
  psyc-trainer /scripts/train_qlora.py \
    --dataset /data/datasets/ioc_extraction-v2.jsonl \
    --dataset /data/datasets/severity_classification-v2.jsonl \
    --dataset /data/datasets/routing_decision-v2.jsonl \
    --dataset /data/datasets/tlp_assignment-v2.jsonl \
    --output /data/adapters/psyc-v2
```

Defaults target a 24 GB consumer GPU (3090/4090): `unsloth/Qwen3.5-4B` at 4-bit,
LoRA `r=16`/`alpha=16`, bf16, 3 epochs, effective batch size 8. For A100-40/80
bump `--base-model unsloth/Qwen3.5-9B` and raise `--batch-size` +
`--max-seq-length`.

Output: `data/adapters/psyc-v1/final/` (adapter weights) + `training_meta.json`
(base model, hyperparameters, dataset list).

Evaluate the adapter against held-out dataset rows:

```bash
docker run --gpus all --rm \
  --entrypoint python \
  -v $(pwd)/data:/data -v $(pwd)/scripts:/scripts \
  psyc-trainer /scripts/eval_adapter.py \
    --adapter /data/adapters/psyc-v2/final \
    --dataset /data/datasets/ioc_extraction-v2.jsonl --n 5
```

The cockpit `/train` page lists every built dataset and trained adapter with
its base model, hyperparameters, dataset provenance, and a per-step loss chart.

## Status

Day 2 of a 48h build. Shipped: Scoutline (URLhaus) → Classifyline → Mapline
(GeoResolver via ip-api.com) → Sealine (PyNaCl sealed boxes) → Routeline →
Courier → mock CERT → Ledgerline. Cockpit has cases / case detail / ledger
pages and a design-token CSS layer. Trainline emits LoRA-ready JSONL;
`Dockerfile.train` builds an unsloth + Qwen3.5 QLoRA training container.
Working platform. Built: Scoutline (URLhaus + CISA KEV + Feodo Tracker) →
Classifyline → Mapline → Sealine → Routeline → Courier → Ledgerline → Trainline,
the FastAPI cockpit (five views incl. the animated Worker Mesh), and a
fine-tuned Qwen3.5-4B (psyc-v4) served live behind the Classifier bot.
Not yet built: Proofline (confidence scoring), Publishline (public advisories).

## License

Unset for the hackathon. Choose before any external release.
Unset. Choose before any external release.
65
docs/demo.md
Normal file
@@ -0,0 +1,65 @@
# psyc — demo run-sheet

A ~5-minute walk-through of the platform; ~10 min including setup.

## 0. Setup (once)

```bash
python3 -m virtualenv .venv
.venv/bin/pip install -e .
.venv/bin/psyc init
```

## 1. Start the services

Separate terminals — the third is optional and needs an NVIDIA GPU:

```bash
# terminal 1 — operator cockpit
.venv/bin/psyc serve --port 8767

# terminal 2 — stand-in CERT / abuse-API receiver
.venv/bin/psyc mock-cert --port 8770

# terminal 3 — live model behind the Classifier bot (optional)
docker run --gpus all --rm -p 8771:8771 --entrypoint python \
  -v $(pwd)/data:/data -v $(pwd)/scripts:/scripts \
  psyc-trainer /scripts/serve_model.py --adapter /data/adapters/psyc-v4/final
```

## 2. Run the pipeline

```bash
.venv/bin/psyc fetch-all   # ingest URLhaus + CISA KEV + Feodo Tracker
.venv/bin/psyc demo        # one case end-to-end; prints the cockpit links
```

## 3. The walk-through

1. **Case Queue** — http://127.0.0.1:8767/cases
   30+ cases across three feeds, with severity + TLP badges. *"Three sources,
   one normalized case object."*

2. **Worker Mesh** — open the journey link `psyc demo` printed. This is the
   centerpiece: seven robot agents, a case token flowing through, each bot
   waking to perform its action and speak its real answer. Hit **▶ replay**.
   - **Classifier bot** carries a live verdict from the fine-tuned psyc-v4
     model — green when the model agrees with the rule, amber when it differs.
   - **Sealer** — evidence encrypted to authority public keys (PyNaCl sealed box).
   - **Router** — destinations cleared vs. policy-blocked (TLP ceiling, country).

3. **Ledger** — http://127.0.0.1:8767/ledger
   Every submission and every blocked route, immutably recorded.

4. **Trainline** — http://127.0.0.1:8767/train
   The four task datasets and the trained adapters with their loss curves.

## Talking points

- **Defensive only** — psyc never amplifies stolen data or contacts criminal
  actors; routing is gated by TLP, jurisdiction, and incident type.
- **Rules + model** — deterministic work is rule-based; the fine-tuned model
  handles judgment. One bot is genuinely a live model, not animation over rules.
- **Honest about limits** — psyc-v4 evals 7/8 on severity; the one miss is a
  documented data-scarcity case (one online-botnet example), not a bug, and was
  not gamed away.
@@ -8,6 +8,7 @@ import typer
import uvicorn

from psyc import db, log
from psyc.cockpit import inference
from psyc.lines import classify, courier, route, scout, seal, train
from psyc.lines import map as map_line
from psyc.models import Outcome
@@ -315,8 +316,15 @@ def demo() -> None:
    for b in blocked:
        typer.echo(f" ⊘ {b.destination_name}: {b.reason}")
    typer.echo("")
    typer.echo(f"inspect: http://127.0.0.1:8767/cases/{case.case_id}")
    typer.echo(f"ledger: http://127.0.0.1:8767/ledger")
    typer.echo("── see it in the cockpit ──")
    typer.echo(f" Worker Mesh: http://127.0.0.1:8767/cases/{case.case_id}/journey")
    typer.echo(f" Case detail: http://127.0.0.1:8767/cases/{case.case_id}")
    typer.echo(f" Ledger: http://127.0.0.1:8767/ledger")
    adapter = inference.server_adapter()
    if adapter:
        typer.echo(f" Live model: up ({adapter}) — the Classifier bot shows its verdict")
    else:
        typer.echo(" Live model: inference server offline — Classifier bot falls back to rules")


@app.command("serve")