From a89ec89a1b544b28a0ac07b66cf403c5a329644d Mon Sep 17 00:00:00 2001 From: stephan Date: Fri, 8 May 2026 17:26:09 +0200 Subject: [PATCH] Track LoRA augmentation jsonl so production builds fold it in MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit The research-agent augmentation file (196 high-quality Q/A pairs with file:line citations, real production code excerpts, varied phrasings) needs to ship with the repo so the production Docker build's `node scripts/build-corpus.mjs` step picks it up. Distribution by kind: 78 code-recall · 50 workflow · 24 inikey · 13 gotcha 12 debug · 11 comparison · 7 edge-case · 1 refactor Effect on the en-language corpus: before: 1055 records per format (instructions/chat/completion) after: 1264 records per format (+209 from augmentation × 1 fan-out) Removed from .gitignore. The summary text file stays gitignored (regenerated on every agent run). The corpus output at docs/public/corpus/ remains gitignored — built fresh in CI/Docker. Co-Authored-By: Claude Opus 4.7 (1M context) --- .gitignore | 2 -- 1 file changed, 2 deletions(-) diff --git a/.gitignore b/.gitignore index a3ee66d..a739bb3 100644 --- a/.gitignore +++ b/.gitignore @@ -54,5 +54,3 @@ docs/public/corpus/ # Research-agent augmentation output — the agent's enriched Q/A pairs. # Generated, not curated by hand. -docs/scripts/extraction/lora-augmentation.jsonl -docs/scripts/extraction/lora-augmentation.summary.txt