# psyc

> Validate the signal, protect the evidence, route only what each destination is
> authorized to receive, and prove every external action through an immutable ledger.

Defensive cyber-threat-intelligence routing & evidence-sealing platform: a small-worker mesh that ingests public threat feeds, classifies and seals cases, routes them to the right destinations under TLP policy, and proves every action through an append-only ledger.

Started as a 48h hackathon (2026-05); grown into a working platform with a fine-tuned model in operation.

---

## Architecture

```text
Sensors
  → Scoutline      fetch + parse public feeds, emit normalized cases        [built]
  → Proofline      validate indicators, score confidence                    [planned]
  → Mapline        resolve hosting country / jurisdiction                   [built]
  → Classifyline   severity, TLP, incident type, internal class             [built]
  → Sealine        authority-sealed evidence encryption                     [built]
  → Routeline      pick destinations under policy, build payloads           [built]
  → Courier        submit to destinations, collect receipts                 [built]
  → Ledgerline     immutable audit of every submission + blocked route      [built]
  → Publishline    sanitized public intelligence after mitigation           [planned]
  → Trainline      lawful intel → LoRA datasets + QLoRA training            [built]
  → Cockpit        operator UI (FastAPI + Jinja)                            [built]
```

Each `-line` is a stage in a small-worker mesh; each worker does one narrow job and passes a normalized `Case` object onward. Rules drive the deterministic work; a fine-tuned model handles judgment (see Training). Humans approve anything sensitive before it leaves the platform.

Full design: [`docs/dossier.md`](docs/dossier.md) · style: [`docs/style.md`](docs/style.md) · demo run-sheet: [`docs/demo.md`](docs/demo.md)

---

## Quick start

```bash
python3 -m virtualenv .venv
.venv/bin/pip install -e .
.venv/bin/psyc init          # create the sqlite db
.venv/bin/psyc fetch-all     # ingest URLhaus + CISA KEV + Feodo Tracker
.venv/bin/psyc demo          # run one case through the whole pipeline
```

The platform runs as up to three services (each in its own terminal):

```bash
.venv/bin/psyc serve --port 8767       # operator cockpit → http://127.0.0.1:8767
.venv/bin/psyc mock-cert --port 8770   # stand-in CERT / abuse-API receiver
# optional, needs an NVIDIA GPU; puts the live model behind the Classifier bot:
docker run --gpus all --rm -p 8771:8771 --entrypoint python \
  -v $(pwd)/data:/data -v $(pwd)/scripts:/scripts \
  psyc-trainer /scripts/serve_model.py --adapter /data/adapters/psyc-v4/final
```

---

## Cockpit

`http://127.0.0.1:8767` serves five views:

| View | Path | Shows |
|---|---|---|
| Case Queue | `/cases` | every ingested case, severity + TLP badges |
| Case detail | `/cases/{id}` | classification, observables, sealed package, routes, per-case ledger |
| Worker Mesh | `/cases/{id}/journey` | animated 7-bot replay of the case's path; the Classifier bot shows the live model's verdict |
| Ledger | `/ledger` | immutable audit feed |
| Trainline | `/train` | datasets + trained adapters with loss charts |

---

## Code layout

```
src/psyc/
  models.py        normalized Case object + enums (Pydantic)
  db.py            SQLAlchemy Core: cases + ledger tables
  result.py        Ok / Err / Result[T, E]
  log.py           structlog configuration
  cli.py           flat Typer CLI
  mock_cert.py     stand-in CERT / abuse-API receiver
  lines/           one file per worker line
    scout.py       multi-source fetch + signalize (URLhaus, CISA KEV, Feodo)
    classify.py    severity / TLP / incident type / internal class
    map.py         GeoResolver: host IP → country
    seal.py        PyNaCl sealed-box evidence encryption
    route.py       destination matrix + policy gates
    courier.py     HTTP submission + payload building
    ledger.py      append-only audit
    train.py       JSONL dataset builders + quality gate
  cockpit/         FastAPI + Jinja operator UI
    app.py         routes
    journey.py     Worker Mesh / case-journey assembly
    inference.py   client for the live model server
    templates/
    static/
scripts/
  train_qlora.py   unsloth QLoRA fine-tune
  eval_adapter.py  adapter evaluation
  serve_model.py   inference server (FastAPI, runs in the CUDA container)
docs/
  dossier.md
  style.md
  demo.md
  archive/
```

---

## Training & the live model (Trainline + QLoRA)

`psyc train-build-all` emits Alpaca-style JSONL datasets under `data/datasets/-v.jsonl` for four defensive tasks: `ioc_extraction`, `severity_classification`, `routing_decision`, `tlp_assignment`. QualityGate drops TLP:RED, restricted-source, empty, and credential-leak rows.

Fine-tune Qwen3.5-4B with QLoRA in the CUDA container:

```bash
docker build -t psyc-trainer -f Dockerfile.train .
docker run --gpus all --rm --entrypoint python \
  -v $(pwd)/data:/data -v $(pwd)/scripts:/scripts \
  psyc-trainer /scripts/train_qlora.py \
    --dataset /data/datasets/ioc_extraction-v4.jsonl \
    --dataset /data/datasets/severity_classification-v4.jsonl \
    --dataset /data/datasets/routing_decision-v4.jsonl \
    --dataset /data/datasets/tlp_assignment-v4.jsonl \
    --output /data/adapters/psyc-v4
```

Defaults target a 24 GB GPU (3090/4090): `unsloth/Qwen3.5-4B` at 4-bit, LoRA r=16, bf16, 3 epochs. Output: `data/adapters//final/` + `training_meta.json`. Evaluate with `scripts/eval_adapter.py`; the `/train` cockpit page shows every dataset and adapter with its loss curve.

`scripts/serve_model.py` loads an adapter and serves `/infer` over HTTP. When it's running, the cockpit's **Classifier bot** shows the live model's severity verdict beside the rule's, and degrades to rules-only if the server is down.

---

## Style

All code follows [`docs/style.md`](docs/style.md), a 12-fold guide: `Optional[X]` / `List[X]` from `typing`, `Field(default_factory=...)`, `Result[T, E]` for expected failures, `class X(str, Enum)`, structlog `area.action` events, SQLAlchemy Core (no ORM), flat hyphenated Typer commands.

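
To make a few of these conventions concrete, here is a minimal sketch of what `Result[T, E]`, `class X(str, Enum)`, and `Optional[X]` / `List[X]` might look like together. It is not taken from the psyc source: `Tlp`, `TLP_ORDER`, `Destination`, and `check_route` are hypothetical names, the `route.blocked` / `route.allowed` strings only imitate the `area.action` naming, and the standard-library `dataclasses.field(default_factory=...)` stands in for Pydantic's `Field(default_factory=...)` so the example runs without dependencies.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Generic, List, Optional, TypeVar, Union

T = TypeVar("T")
E = TypeVar("E")


@dataclass(frozen=True)
class Ok(Generic[T]):
    value: T


@dataclass(frozen=True)
class Err(Generic[E]):
    error: E


# Result[T, E]: an expected failure is returned as a value instead of raised.
Result = Union[Ok[T], Err[E]]


class Tlp(str, Enum):
    """Hypothetical TLP enum in the `class X(str, Enum)` style."""

    CLEAR = "TLP:CLEAR"
    GREEN = "TLP:GREEN"
    AMBER = "TLP:AMBER"
    RED = "TLP:RED"


# Least to most restrictive; used to compare a case against a destination cap.
TLP_ORDER: List[Tlp] = [Tlp.CLEAR, Tlp.GREEN, Tlp.AMBER, Tlp.RED]


@dataclass
class Destination:
    """Hypothetical destination record (assumption, not the psyc model)."""

    name: str
    max_tlp: Tlp
    contact: Optional[str] = None
    # Stdlib stand-in for Pydantic's Field(default_factory=list).
    tags: List[str] = field(default_factory=list)


def check_route(case_tlp: Tlp, dest: Destination) -> Result[str, str]:
    """Return Err when the case's TLP exceeds the destination's max TLP."""
    if TLP_ORDER.index(case_tlp) > TLP_ORDER.index(dest.max_tlp):
        return Err(f"route.blocked: {case_tlp.value} exceeds {dest.max_tlp.value}")
    return Ok(f"route.allowed: {dest.name}")
```

A caller would branch on the returned type, for example `isinstance(check_route(Tlp.AMBER, dest), Err)`, so that a blocked route is handled as ordinary control flow and can be recorded rather than surfacing as an exception.
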

---

## Scope

**Lawful, white-hat defensive operations only.** psyc routes intelligence to victims, CERT/CSIRTs, sector ISACs, provider/registrar abuse desks, and trusted CTI communities. It will **not** amplify stolen data, expose victims prematurely, interact with criminal actors, distribute exploitation content, or submit evidence beyond a destination's max TLP. Boundaries: `docs/dossier.md` §5, §10, §16.

---

## Status

Working platform. Built: Scoutline (URLhaus + CISA KEV + Feodo Tracker) → Classifyline → Mapline → Sealine → Routeline → Courier → Ledgerline → Trainline, the FastAPI cockpit (five views incl. the animated Worker Mesh), and a fine-tuned Qwen3.5-4B (psyc-v4) served live behind the Classifier bot. Not yet built: Proofline (confidence scoring), Publishline (public advisories).

## License

Unset. Choose before any external release.