m17hr1l/psyc

Author	SHA1	Message	Date
m17hr1l	2a9c0bf34a	stage-6: model inference server. scripts/serve_model.py: FastAPI in the CUDA container, loads base Qwen3.5-4B + a psyc adapter once and serves POST /infer. Lets the cockpit (no torch in its venv) put a real fine-tuned model behind a Worker Mesh bot over HTTP. Dockerfile.train gains a fastapi + uvicorn layer. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-05-18 21:05:16 +02:00
m17hr1l	b4c66c2e87	stage-3e: well-posed ioc_extraction dataset + clearer /train page. ioc_extraction ExampleBuilder now embeds every IOC into the advisory text so the extraction task is answerable from the input (v1 asked the model to "extract" a URL that was never given). /train page distinguishes trained / training… / not-started, and renders a per-step loss bar chart. Dockerfile no longer bakes the training script; scripts/ is mounted at run time so edits take effect without a 21 GB rebuild (this is why psyc-v2's loss capture was silently skipped on its first run). Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-05-17 18:09:37 +02:00
m17hr1l	b95e3e02bd	stage-3c: working QLoRA training + eval (pytorch base, Qwen3.5 slug, SFTConfig). Training and eval now run clean on the unsloth 2026.5.2 / transformers v5 / torch 2.10 stack. Fixes: pytorch/pytorch base image (sidesteps the nvidia/cuda apt-signature failure and the torch download), correct base-model slug unsloth/Qwen3.5-4B, TRL SFTConfig API. Adds scripts/eval_adapter.py, which runs dataset rows through base+adapter with structured (transformers-v5) message content and Qwen3.5 thinking-mode stripping. First v1 adapter: loss 2.10 -> 0.32 over 3 epochs. Eval surfaced an ill-posed ioc_extraction dataset (output URL not present in input), to be fixed in the ExampleBuilder before the next training run. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-05-17 14:16:22 +02:00
m17hr1l	f1ab11f89d	stage-3c: unsloth QLoRA training scaffold for Qwen3.5. Dockerfile.train builds a CUDA 12.4 + unsloth container that consumes the Trainline JSONL datasets and emits a LoRA adapter at data/adapters/<run>/final. Defaults target a 24 GB GPU (Qwen3.5-4B-Instruct-bnb-4bit, r=16, bf16, 3 epochs, effective batch 8). README documents the build + run workflow. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-05-14 14:17:14 +02:00

4 Commits
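
Commit b95e3e02bd mentions "Qwen3.5 thinking-mode stripping" in scripts/eval_adapter.py, but the log does not show that code. The following is only a minimal sketch of what such a step could look like. It assumes the model wraps its reasoning in <think>...</think> tags, as earlier Qwen3 releases do. The function name strip_thinking is hypothetical and is not taken from the repository.

```python
import re

# Assumption: reasoning is delimited by <think>...</think>, as in Qwen3 chat output.
_THINK_BLOCK = re.compile(r"<think>.*?</think>", re.DOTALL)


def strip_thinking(text: str) -> str:
    """Return the model's final answer with any reasoning block removed."""
    # Drop every complete <think>...</think> block (non-greedy, spans newlines).
    cleaned = _THINK_BLOCK.sub("", text)
    # Some chat templates put the opening <think> in the prompt, so the
    # generated text may contain only a closing tag. Keep what follows it.
    if "</think>" in cleaned:
        cleaned = cleaned.rpartition("</think>")[2]
    return cleaned.strip()
```

In an eval loop, a function like this would be applied to the decoded generation before comparing it with the dataset's expected output, so that reasoning text is not scored as part of the answer.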