
m17hr1l b95e3e02bd stage-3c: working QLoRA training + eval — pytorch base, Qwen3.5 slug, SFTConfig

Training and eval now run clean on the unsloth 2026.5.2 / transformers v5 /
torch 2.10 stack. Fixes: pytorch/pytorch base image (sidesteps the nvidia/cuda
apt-signature failure and the torch download), correct base-model slug
unsloth/Qwen3.5-4B, TRL SFTConfig API. Adds scripts/eval_adapter.py — runs
dataset rows through base+adapter with structured (transformers-v5) message
content and Qwen3.5 thinking-mode stripping.

First v1 adapter: loss 2.10 -> 0.32 over 3 epochs. Eval surfaced an ill-posed
ioc_extraction dataset (output URL not present in input) — to be fixed in the
ExampleBuilder before the next training run.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

2026-05-17 14:16:22 +02:00

docs

init: scaffold psyc — defensive CTI routing & evidence-sealing platform

2026-05-14 12:43:47 +02:00

scripts

stage-3c: working QLoRA training + eval — pytorch base, Qwen3.5 slug, SFTConfig

2026-05-17 14:16:22 +02:00

src/psyc

stage-3b: Trainline — JSONL dataset pipeline for QLoRA training

2026-05-14 14:15:58 +02:00

.dockerignore

stage-3c: unsloth QLoRA training scaffold for Qwen3.5

2026-05-14 14:17:14 +02:00

.gitignore

stage-2: full pipeline — Classifyline → Sealine → Routeline → Courier → Ledger + mock CERT

2026-05-14 13:44:43 +02:00

Dockerfile.train

stage-3c: working QLoRA training + eval — pytorch base, Qwen3.5 slug, SFTConfig

2026-05-17 14:16:22 +02:00

pyproject.toml

init: scaffold psyc — defensive CTI routing & evidence-sealing platform

2026-05-14 12:43:47 +02:00

README.md

stage-3c: working QLoRA training + eval — pytorch base, Qwen3.5 slug, SFTConfig

2026-05-17 14:16:22 +02:00

README.md

psyc

Validate the signal, protect the evidence, route only what each destination is authorized to receive, and prove every external action through an immutable ledger.

Defensive cyber-threat-intelligence routing & evidence-sealing platform. Started as a 48h hackathon project on 2026-05-13 and under active development.

Architecture

Sensors
→ Scoutline      fetch, parse, dedup, signal
→ Proofline      validate indicators, score confidence
→ Mapline        resolve victim, actor, jurisdiction, CERT route
→ Classifyline   severity, TLP, incident type, internal class
→ Sealine        authority-sealed evidence encryption
→ Routeline      pick destinations, build payloads, submit
→ Ledgerline     immutable audit, receipts, outcomes
→ Publishline    sanitized public intelligence after mitigation
→ Trainline      lawful intel → LoRA-ready training data
→ Cockpit        operator UI (FastAPI + Jinja)

Each -line is a stage in a small-worker mesh; each worker performs one narrow job and passes a normalized Case object to the next stage. Heavy models are reserved for judgment-heavy tasks. Humans approve everything sensitive before it leaves the platform.
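The hand-off between stages can be sketched as follows. This is a minimal illustration only: the `Case` dataclass, its fields, and the two toy workers are hypothetical stand-ins (the real Case is a Pydantic model in models.py whose fields are not shown here), and the classification rules are invented for the example.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class Case:
    """Hypothetical stand-in for the normalized Case object."""

    url: str
    severity: Optional[str] = None
    tlp: Optional[str] = None
    history: List[str] = field(default_factory=list)


# Every worker takes a Case and returns a Case, so stages compose freely.
Worker = Callable[[Case], Case]


def classify(case: Case) -> Case:
    # Toy rule, for illustration only: executables are rated high.
    case.severity = "high" if case.url.endswith(".exe") else "low"
    case.history.append("classify")
    return case


def assign_tlp(case: Case) -> Case:
    # Toy rule: high-severity cases get a stricter sharing label.
    case.tlp = "AMBER" if case.severity == "high" else "GREEN"
    case.history.append("assign_tlp")
    return case


def run_pipeline(case: Case, workers: List[Worker]) -> Case:
    # Each worker does one narrow job and passes the Case along.
    for worker in workers:
        case = worker(case)
    return case


result = run_pipeline(
    Case(url="http://example.test/payload.exe"),
    [classify, assign_tlp],
)
print(result.severity, result.tlp, result.history)
```

Keeping every worker to the same `Case -> Case` shape is what lets a stage be added, removed, or reordered without touching its neighbours.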

Full architecture: docs/dossier.md, a consolidated version of the original individual records (which remain in docs/archive/).

Quick start

python3 -m virtualenv .venv
.venv/bin/pip install -e .

.venv/bin/psyc init                       # create the sqlite db
.venv/bin/psyc fetch-urlhaus --limit 50   # ingest a URLhaus pass
.venv/bin/psyc serve --port 8767          # cockpit at http://127.0.0.1:8767
.venv/bin/psyc status                     # count of ingested cases

Code layout

src/psyc/
  models.py          # normalized Case object (Pydantic)
  db.py              # SQLAlchemy Core; cases + ledger tables
  result.py          # Ok / Err / Result[T, E]
  log.py             # structlog configuration
  cli.py             # flat Typer commands
  lines/             # one file per worker line
    scout.py         # Fetcher + Signalizer (URLhaus today)
  cockpit/           # FastAPI + Jinja operator UI
    app.py
    templates/
    static/

docs/
  dossier.md         # full architecture (consolidated)
  style.md           # 12-fold Python style guide
  archive/           # original architecture docs + logo variants

Style

All code follows docs/style.md: Optional[X] / List[X] from typing, Field(default_factory=...) for Pydantic mutables, Result[T, E] types for expected failures (raise reserved for true exceptions), class X(str, Enum) for closed string sets, structlog with area.action event names, SQLAlchemy Core (no ORM), flat Typer commands with hyphenated names. Ruff config in pyproject.toml enforces the bits a linter can check; UP006/UP007/UP035 are disabled so the typing-import rules stand.
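A few of these conventions in one place, as a rough sketch rather than the project's actual code: the `Ok`/`Err` classes, the `Tlp` enum, and `parse_tlp` below are hypothetical and only imitate what result.py and the style guide describe. Plain dataclasses are used so the snippet runs without third-party packages.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Generic, TypeVar, Union

T = TypeVar("T")
E = TypeVar("E")


@dataclass(frozen=True)
class Ok(Generic[T]):
    value: T


@dataclass(frozen=True)
class Err(Generic[E]):
    error: E


# Result[T, E] is either a success or an expected failure.
Result = Union[Ok[T], Err[E]]


class Tlp(str, Enum):
    """Closed string set, written as class X(str, Enum)."""

    CLEAR = "CLEAR"
    GREEN = "GREEN"
    AMBER = "AMBER"
    RED = "RED"


def parse_tlp(raw: str) -> Result[Tlp, str]:
    # A bad label is an expected failure, so it is returned, not raised.
    try:
        return Ok(Tlp(raw.strip().upper()))
    except ValueError:
        return Err(f"unknown TLP label: {raw!r}")


print(parse_tlp(" amber "))
print(parse_tlp("purple"))
```

Callers branch on `isinstance(result, Ok)` instead of wrapping the call in try/except, which keeps expected failures visible in the function signature.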

Scope

Lawful, white-hat defensive operations only. psyc routes intelligence to victims, CERT/CSIRTs, sector ISACs, provider/registrar abuse desks, and trusted CTI communities. It will not:

amplify stolen data
expose victims prematurely
interact with criminal actors
distribute exploitation content
submit evidence that exceeds a destination's max TLP

The boundaries are defined in docs/dossier.md §5 Destination Minimization, §10 TLP Enforcement, and §16 Public Reporting Rules. The Ledger records every external submission and destructive action; sensitive evidence is encrypted to authorized recipients via Sealine before any routing decision.
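The max-TLP rule reduces to an ordered comparison. The sketch below assumes the TLP 2.0 label order (CLEAR, GREEN, AMBER, AMBER+STRICT, RED); the `may_submit` helper and the `TLP_RANK` table are hypothetical names, not taken from the psyc source.

```python
from typing import Dict

# TLP 2.0 labels from least to most restrictive.
TLP_RANK: Dict[str, int] = {
    "CLEAR": 0,
    "GREEN": 1,
    "AMBER": 2,
    "AMBER+STRICT": 3,
    "RED": 4,
}


def may_submit(evidence_tlp: str, destination_max_tlp: str) -> bool:
    """Return True only if the evidence label does not exceed the destination's cap.

    An unknown label raises KeyError, so the check fails closed
    instead of silently allowing a submission.
    """
    return TLP_RANK[evidence_tlp] <= TLP_RANK[destination_max_tlp]


print(may_submit("GREEN", "AMBER"))
print(may_submit("RED", "AMBER"))
```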

Training (Trainline + QLoRA)

psyc train-build-all emits Alpaca-style JSONL datasets under data/datasets/<task>-v<n>.jsonl for four defensive tasks: ioc_extraction, severity_classification, routing_decision, tlp_assignment. QualityGate drops TLP:RED, restricted sources, empty/oversize, and credential-leak rows per the dossier's training-data policy.
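As an illustration of the format and the filtering, the sketch below writes Alpaca-style rows (`instruction`, `input`, `output`) and applies a simplified gate. The metadata keys `tlp` and `restricted`, the `MAX_CHARS` limit, and the credential regex are assumptions made for the example; the real QualityGate rules live in the dossier and the source and may differ.

```python
import json
import re
from typing import Any, Dict, List

MAX_CHARS = 4000  # hypothetical size limit
# Hypothetical heuristic for credential-leak rows.
CREDENTIAL_RE = re.compile(r"(password|passwd|api[_-]?key|secret)\s*[:=]", re.IGNORECASE)


def passes_gate(row: Dict[str, Any]) -> bool:
    text_in = str(row.get("input", ""))
    text_out = str(row.get("output", ""))
    if row.get("tlp") == "RED" or row.get("restricted", False):
        return False  # TLP:RED and restricted sources never reach training
    if not text_in.strip() or not text_out.strip():
        return False  # empty rows
    if len(text_in) + len(text_out) > MAX_CHARS:
        return False  # oversize rows
    if CREDENTIAL_RE.search(text_in + text_out):
        return False  # credential-leak rows
    return True


instruction = "Extract the indicators of compromise."
rows: List[Dict[str, Any]] = [
    {"instruction": instruction, "input": "Malware hosted at http://example.test/a.exe",
     "output": "http://example.test/a.exe", "tlp": "GREEN"},
    {"instruction": instruction, "input": "Internal report", "output": "n/a", "tlp": "RED"},
    {"instruction": instruction, "input": "Sample with no answer", "output": "", "tlp": "GREEN"},
    {"instruction": instruction, "input": "login dump: password=hunter2",
     "output": "hunter2", "tlp": "GREEN"},
]

kept = [row for row in rows if passes_gate(row)]
# Only the three Alpaca keys are written; gate metadata stays out of the dataset.
jsonl = "\n".join(
    json.dumps({key: row[key] for key in ("instruction", "input", "output")})
    for row in kept
)
print(jsonl)
```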

To fine-tune Qwen3.5-4B with QLoRA in an NVIDIA Docker container:

# 1. build datasets (one-off; re-run after ingestion changes)
.venv/bin/psyc train-build-all

# 2. build the training image (pytorch 2.6/CUDA 12.4 base + unsloth + Qwen3.5)
docker build -t psyc-trainer -f Dockerfile.train .

# 3. fine-tune (mount host data/ so adapters land there)
docker run --gpus all --rm \
    -v "$(pwd)/data:/data" \
    psyc-trainer \
    --dataset /data/datasets/ioc_extraction-v1.jsonl \
    --dataset /data/datasets/severity_classification-v1.jsonl \
    --dataset /data/datasets/routing_decision-v1.jsonl \
    --dataset /data/datasets/tlp_assignment-v1.jsonl \
    --output /data/adapters/psyc-v1

Defaults target a 24 GB consumer GPU (3090/4090): unsloth/Qwen3.5-4B at 4-bit, LoRA r=16/alpha=16, bf16, 3 epochs, effective batch size 8. On an A100 (40 GB or 80 GB), switch --base-model to unsloth/Qwen3.5-9B and raise --batch-size and --max-seq-length.
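For reference, an effective batch size is the product of the per-device batch size, the gradient-accumulation steps, and the number of GPUs. The README does not say how the default of 8 is split, so the 2 x 4 split below is only an assumed example, and `effective_batch_size` is a hypothetical helper.

```python
def effective_batch_size(per_device: int, grad_accum_steps: int, num_gpus: int = 1) -> int:
    # Samples contributing to each optimizer step.
    return per_device * grad_accum_steps * num_gpus


# Assumed split, for illustration: 2 samples per step, accumulated over 4 steps.
print(effective_batch_size(per_device=2, grad_accum_steps=4))
```

Raising the per-device batch size on a larger GPU while lowering the accumulation steps keeps the effective batch size, and therefore the optimization behaviour, roughly unchanged.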

Output: data/adapters/psyc-v1/final/ (adapter weights) + training_meta.json (base model, hyperparameters, dataset list).

Evaluate the adapter against held-out dataset rows:

docker run --gpus all --rm \
    --entrypoint python \
    -v "$(pwd)/data:/data" -v "$(pwd)/scripts:/scripts" \
    psyc-trainer /scripts/eval_adapter.py \
    --adapter /data/adapters/psyc-v1/final \
    --dataset /data/datasets/ioc_extraction-v1.jsonl --n 5
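
The eval script strips thinking-mode output before comparing answers. How scripts/eval_adapter.py does this is not shown here; the sketch below assumes Qwen-style `<think>...</think>` delimiters, and `strip_thinking` is a hypothetical helper.

```python
import re

# Non-greedy match so that several reasoning blocks are removed independently.
THINK_RE = re.compile(r"<think>.*?</think>\s*", re.DOTALL)


def strip_thinking(text: str) -> str:
    """Drop reasoning blocks and return only the final answer text."""
    return THINK_RE.sub("", text).strip()


raw = "<think>\nThe URL is in the report.\n</think>\n\nhttp://example.test/a.exe"
print(strip_thinking(raw))
```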

Status

Day 2 of a 48h build. Shipped: Scoutline (URLhaus) → Classifyline → Mapline (GeoResolver via ip-api.com) → Sealine (PyNaCl sealed boxes) → Routeline → Courier → mock CERT → Ledgerline. Cockpit has cases / case detail / ledger pages and a design-token CSS layer. Trainline emits LoRA-ready JSONL; Dockerfile.train builds an unsloth + Qwen3.5 QLoRA training container.

License

Unset for the hackathon. Choose before any external release.

Languages

Python 62.6%

JavaScript 12.4%

HTML 12.1%

CSS 11.5%

Shell 1.3%