m17hr1l 2c2ead6149 stage-28 fix: deploy.sh pre-trusts the Gitea SSH host key (first-clone)

A fresh prod box has never SSH'd to gitea.neuronetz.ai before, so the
first 'git clone' failed with 'Host key verification failed'. The
script now parses the git remote URL to extract host+port, and on the
prod box does an ssh-keyscan into ~/.ssh/known_hosts before the clone
when the entry is missing. TOFU — if you want to verify the fingerprint
out-of-band, pre-populate known_hosts manually and the script will see
the entry and skip the scan.

Also: if the clone still fails after the host key is trusted (likely a
missing SSH key on Gitea side), the script now prints a clear hint
pointing at where to register it. Supports both ssh://user@host:port/
and user@host: URL forms.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

2026-05-25 15:32:44 +02:00
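
The host and port extraction described in this commit can be illustrated in Python. This is only a sketch of the idea: deploy.sh itself is a shell script and is not shown here, so the function names `ssh_host_port` and `keyscan_argv`, the port 2222, and the repository path `org/psyc.git` are hypothetical.

```python
from typing import List, Optional, Tuple
from urllib.parse import urlparse

DEFAULT_SSH_PORT = 22


def ssh_host_port(remote: str) -> Optional[Tuple[str, int]]:
    """Return (host, port) for an SSH git remote, or None if it is not SSH."""
    if remote.startswith("ssh://"):
        parsed = urlparse(remote)
        if not parsed.hostname:
            return None
        return parsed.hostname, parsed.port or DEFAULT_SSH_PORT
    # scp-like form: [user@]host:path (no scheme, so the default port applies)
    if "://" not in remote and ":" in remote:
        authority = remote.split(":", 1)[0]
        host = authority.rsplit("@", 1)[-1]
        if host and "/" not in host:
            return host, DEFAULT_SSH_PORT
    return None


def keyscan_argv(host: str, port: int) -> List[str]:
    """Build the ssh-keyscan command line for the given host and port."""
    return ["ssh-keyscan", "-p", str(port), host]


# Hypothetical remotes in the two URL forms the commit message names.
print(ssh_host_port("ssh://git@gitea.neuronetz.ai:2222/org/psyc.git"))
print(ssh_host_port("git@gitea.neuronetz.ai:org/psyc.git"))
```

A non-SSH remote such as an https:// URL yields None, in which case no host key needs to be trusted.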

README.md

psyc

Validate the signal, protect the evidence, route only what each destination is authorized to receive, and prove every external action through an immutable ledger.

Defensive cyber-threat-intelligence routing and evidence-sealing platform: a small-worker mesh that ingests public threat feeds, classifies and seals cases, routes them to the right destinations under TLP policy, and proves every action through an append-only ledger. It started as a 48-hour hackathon project (2026-05) and has grown into a working platform with a fine-tuned model in operation.

Architecture

Sensors
→ Scoutline      fetch + parse public feeds, emit normalized cases   [built]
→ Proofline      validate indicators, score confidence               [planned]
→ Mapline        resolve hosting country / jurisdiction              [built]
→ Classifyline   severity, TLP, incident type, internal class        [built]
→ Sealine        authority-sealed evidence encryption                [built]
→ Routeline      pick destinations under policy, build payloads      [built]
→ Courier        submit to destinations, collect receipts            [built]
→ Ledgerline     immutable audit of every submission + blocked route [built]
→ Publishline    sanitized public intelligence after mitigation      [planned]
→ Trainline      lawful intel → LoRA datasets + QLoRA training       [built]
→ Cockpit        operator UI (FastAPI + Jinja)                       [built]

Each -line component is one stage in a small-worker mesh: each worker does one narrow job and passes a normalized Case object onward. Rules drive the deterministic work; a fine-tuned model handles judgment (see Training). Humans approve anything sensitive before it leaves the platform.
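
The hand-off between workers can be pictured with a minimal sketch. It assumes a simplified `Case` dataclass and two toy workers; the real Case object lives in src/psyc/models.py (Pydantic) and is not reproduced here, so every field and function name below is hypothetical.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Case:
    """Hypothetical stand-in for the normalized Case object."""
    case_id: str
    indicator: str
    severity: str = "unknown"
    trail: List[str] = field(default_factory=list)


# A worker takes a Case, does one narrow job, and returns the Case.
Worker = Callable[[Case], Case]


def classify(case: Case) -> Case:
    # Toy rule for illustration only: executables are treated as high severity.
    case.severity = "high" if case.indicator.endswith(".exe") else "low"
    case.trail.append("classify")
    return case


def seal(case: Case) -> Case:
    # Placeholder: a real sealing stage would encrypt the evidence here.
    case.trail.append("seal")
    return case


def run_mesh(case: Case, workers: List[Worker]) -> Case:
    """Pass the case through each worker in order."""
    for worker in workers:
        case = worker(case)
    return case


result = run_mesh(Case("c-1", "http://203.0.113.7/payload.exe"), [classify, seal])
print(result.severity, result.trail)  # high ['classify', 'seal']
```

Keeping every worker to the same `Case -> Case` shape is what lets stages be added, reordered, or left as planned stubs without touching their neighbours.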

Full design: docs/dossier.md · style: docs/style.md · demo run-sheet: docs/demo.md

Quick start

python3 -m virtualenv .venv
.venv/bin/pip install -e .

.venv/bin/psyc init               # create the sqlite db
.venv/bin/psyc fetch-all          # ingest URLhaus + CISA KEV + Feodo Tracker
.venv/bin/psyc demo               # run one case through the whole pipeline

The platform runs as up to three services (each in its own terminal):

.venv/bin/psyc serve --port 8767      # operator cockpit  → http://127.0.0.1:8767
.venv/bin/psyc mock-cert --port 8770  # stand-in CERT / abuse-API receiver

# optional, needs an NVIDIA GPU — puts the live model behind the Classifier bot:
docker run --gpus all --rm -p 8771:8771 --entrypoint python \
    -v $(pwd)/data:/data -v $(pwd)/scripts:/scripts \
    psyc-trainer /scripts/serve_model.py --adapter /data/adapters/psyc-v4/final

Cockpit

http://127.0.0.1:8767 — five views:

View	Path	Shows
Case Queue	`/cases`	every ingested case, severity + TLP badges
Case detail	`/cases/{id}`	classification, observables, sealed package, routes, per-case ledger
Worker Mesh	`/cases/{id}/journey`	animated 7-bot replay of the case's path; the Classifier bot shows the live model's verdict
Ledger	`/ledger`	immutable audit feed
Trainline	`/train`	datasets + trained adapters with loss charts

Code layout

src/psyc/
  models.py        normalized Case object + enums (Pydantic)
  db.py            SQLAlchemy Core — cases + ledger tables
  result.py        Ok / Err / Result[T, E]
  log.py           structlog configuration
  cli.py           flat Typer CLI
  mock_cert.py     stand-in CERT / abuse-API receiver
  lines/           one file per worker line
    scout.py       multi-source fetch + signalize (URLhaus, CISA KEV, Feodo)
    classify.py    severity / TLP / incident type / internal class
    map.py         GeoResolver — host IP → country
    seal.py        PyNaCl sealed-box evidence encryption
    route.py       destination matrix + policy gates
    courier.py     HTTP submission + payload building
    ledger.py      append-only audit
    train.py       JSONL dataset builders + quality gate
  cockpit/         FastAPI + Jinja operator UI
    app.py         routes
    journey.py     Worker Mesh / case-journey assembly
    inference.py   client for the live model server
    templates/  static/

scripts/
  train_qlora.py   unsloth QLoRA fine-tune
  eval_adapter.py  adapter evaluation
  serve_model.py   inference server (FastAPI, runs in the CUDA container)

docs/
  dossier.md  style.md  demo.md  archive/

Training & the live model (Trainline + QLoRA)

psyc train-build-all emits Alpaca-style JSONL datasets under data/datasets/<task>-v<n>.jsonl for four defensive tasks — ioc_extraction, severity_classification, routing_decision, tlp_assignment. QualityGate drops TLP:RED, restricted-source, empty, and credential-leak rows.
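
As a rough illustration of the gate, the sketch below filters candidate rows and writes the survivors as Alpaca-style records (instruction / input / output). It is not the project's QualityGate: the `tlp` and `source_restricted` fields and the credential regex are assumptions made for this example.

```python
import json
import re
from typing import Dict, List

Row = Dict[str, str]

# Assumption: a crude pattern standing in for real credential-leak detection.
CREDENTIAL_PATTERN = re.compile(
    r"(password|passwd|api[_-]?key|secret)\s*[:=]", re.IGNORECASE
)


def passes_gate(row: Row) -> bool:
    """Return True only for rows that are safe to train on."""
    if row.get("tlp") == "TLP:RED":
        return False
    if row.get("source_restricted") == "yes":  # hypothetical field
        return False
    if not row.get("input", "").strip() or not row.get("output", "").strip():
        return False
    text = " ".join([row["input"], row["output"]])
    return CREDENTIAL_PATTERN.search(text) is None


def to_jsonl(rows: List[Row]) -> str:
    """Serialize the rows that pass the gate, one JSON object per line."""
    lines = []
    for row in rows:
        if passes_gate(row):
            record = {key: row[key] for key in ("instruction", "input", "output")}
            lines.append(json.dumps(record))
    return "\n".join(lines)


rows = [
    {"instruction": "Assign a severity.", "input": "URLhaus: http://203.0.113.7/payload.exe",
     "output": "high", "tlp": "TLP:CLEAR"},
    {"instruction": "Assign a severity.", "input": "internal report",
     "output": "high", "tlp": "TLP:RED"},
    {"instruction": "Assign a severity.", "input": "login dump password: hunter2",
     "output": "high", "tlp": "TLP:GREEN"},
    {"instruction": "Assign a severity.", "input": "feed entry", "output": "", "tlp": "TLP:CLEAR"},
]
print(to_jsonl(rows))
```

Only the first row survives: the others are dropped for TLP:RED, a credential-like string, and an empty output respectively.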

Fine-tune Qwen3.5-4B with QLoRA in the CUDA container:

docker build -t psyc-trainer -f Dockerfile.train .

docker run --gpus all --rm --entrypoint python \
    -v $(pwd)/data:/data -v $(pwd)/scripts:/scripts \
    psyc-trainer /scripts/train_qlora.py \
    --dataset /data/datasets/ioc_extraction-v4.jsonl \
    --dataset /data/datasets/severity_classification-v4.jsonl \
    --dataset /data/datasets/routing_decision-v4.jsonl \
    --dataset /data/datasets/tlp_assignment-v4.jsonl \
    --output /data/adapters/psyc-v4

Defaults target a 24 GB GPU (3090/4090): unsloth/Qwen3.5-4B at 4-bit, LoRA r=16, bf16, 3 epochs. Output: data/adapters/<run>/final/ + training_meta.json. Evaluate with scripts/eval_adapter.py; the /train cockpit page shows every dataset and adapter with its loss curve.

scripts/serve_model.py loads an adapter and serves /infer over HTTP. When it's running, the cockpit's Classifier bot shows the live model's severity verdict beside the rule's — and degrades to rules-only if the server is down.
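
The rules-only fallback can be sketched as follows. The port 8771 and the /infer path come from this README; the JSON request and response shape (`prompt` in, `severity` out) and both function names are assumptions, since src/psyc/cockpit/inference.py is not shown.

```python
import json
from typing import Callable, Dict, Optional
from urllib import request

INFER_URL = "http://127.0.0.1:8771/infer"


def call_model(prompt: str, timeout: float = 2.0) -> str:
    """POST the prompt to the model server. The JSON shape is an assumption."""
    body = json.dumps({"prompt": prompt}).encode("utf-8")
    req = request.Request(
        INFER_URL, data=body, headers={"Content-Type": "application/json"}
    )
    with request.urlopen(req, timeout=timeout) as resp:
        return json.loads(resp.read())["severity"]


def classifier_verdicts(
    prompt: str,
    rule_severity: str,
    infer: Callable[[str], str] = call_model,
) -> Dict[str, Optional[str]]:
    """Return the rule verdict, plus the model verdict when the server answers."""
    try:
        model_severity: Optional[str] = infer(prompt)
    except (OSError, ValueError, KeyError):
        # Network failure, malformed JSON, or a missing key: fall back to rules only.
        model_severity = None
    return {"rule": rule_severity, "model": model_severity}
```

Passing `infer` as a parameter keeps the fallback path testable without a GPU: a stub that raises `ConnectionRefusedError` exercises the same branch a stopped server would.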

Style

All code follows docs/style.md — a 12-fold guide: Optional[X] / List[X] from typing, Field(default_factory=...), Result[T, E] for expected failures, class X(str, Enum), structlog area.action events, SQLAlchemy Core (no ORM), flat hyphenated Typer commands.
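
Two of these conventions, `Result[T, E]` for expected failures and `class X(str, Enum)`, might look like the sketch below. It is an illustration under assumptions, not the contents of src/psyc/result.py: the field names `value` and `error` and the `parse_severity` helper are hypothetical.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Generic, TypeVar, Union

T = TypeVar("T")
E = TypeVar("E")


@dataclass(frozen=True)
class Ok(Generic[T]):
    value: T


@dataclass(frozen=True)
class Err(Generic[E]):
    error: E


# A Result is either an Ok carrying a value or an Err carrying an error.
Result = Union[Ok[T], Err[E]]


class Severity(str, Enum):
    LOW = "low"
    HIGH = "high"


def parse_severity(raw: str) -> Result[Severity, str]:
    """Expected failure (bad input) is returned as Err, not raised."""
    try:
        return Ok(Severity(raw.strip().lower()))
    except ValueError:
        return Err(f"unknown severity: {raw!r}")


print(parse_severity(" HIGH "))
print(parse_severity("bogus"))
```

Because `Severity` subclasses `str`, its members compare equal to their plain string values, which keeps JSON and database round-trips simple.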

Scope

Lawful, white-hat defensive operations only. psyc routes intelligence to victims, CERT/CSIRTs, sector ISACs, provider/registrar abuse desks, and trusted CTI communities. It will not amplify stolen data, expose victims prematurely, interact with criminal actors, distribute exploitation content, or submit evidence beyond a destination's max TLP. Boundaries: docs/dossier.md §5, §10, §16.
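
The max-TLP boundary amounts to an ordering check, sketched below. The level ordering follows TLP 2.0; the destination names and their ceilings are hypothetical, and the actual policy matrix in src/psyc/lines/route.py is not shown here.

```python
from typing import Dict, List, Tuple

# TLP 2.0 labels, from least to most restrictive.
TLP_ORDER: List[str] = [
    "TLP:CLEAR",
    "TLP:GREEN",
    "TLP:AMBER",
    "TLP:AMBER+STRICT",
    "TLP:RED",
]


def allowed(case_tlp: str, destination_max_tlp: str) -> bool:
    """A destination may receive a case only up to its own max TLP."""
    return TLP_ORDER.index(case_tlp) <= TLP_ORDER.index(destination_max_tlp)


def plan_routes(
    case_tlp: str, destinations: Dict[str, str]
) -> Tuple[List[str], List[str]]:
    """Split destinations into (permitted, blocked) for this case."""
    permitted: List[str] = []
    blocked: List[str] = []
    for name, max_tlp in destinations.items():
        (permitted if allowed(case_tlp, max_tlp) else blocked).append(name)
    return permitted, blocked


# Hypothetical destinations and ceilings.
destinations = {"national-cert": "TLP:AMBER", "public-feed": "TLP:CLEAR"}
print(plan_routes("TLP:GREEN", destinations))  # (['national-cert'], ['public-feed'])
```

Returning the blocked list alongside the permitted one matters because the ledger records blocked routes as well as submissions.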

Status

Working platform. Built: Scoutline (URLhaus + CISA KEV + Feodo Tracker) → Classifyline → Mapline → Sealine → Routeline → Courier → Ledgerline → Trainline, the FastAPI cockpit (five views incl. the animated Worker Mesh), and a fine-tuned Qwen3.5-4B (psyc-v4) served live behind the Classifier bot. Not yet built: Proofline (confidence scoring), Publishline (public advisories).

License

Unset. Choose before any external release.
