Track LoRA augmentation jsonl so production builds fold it in

The research-agent augmentation file (196 high-quality Q/A pairs with
file:line citations, real production code excerpts, varied phrasings)
needs to ship with the repo so the production Docker build's
`node scripts/build-corpus.mjs` step picks it up.

Distribution by kind:
  78 code-recall · 50 workflow · 24 inikey · 13 gotcha
  12 debug · 11 comparison · 7 edge-case · 1 refactor
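
A distribution like the one above can be re-derived from the jsonl with a
small tally. This is only a sketch: it assumes each line is one JSON object
with a `kind` field, which the commit does not confirm, and `tallyKinds` is a
hypothetical helper, not part of `build-corpus.mjs`.

```javascript
// Hypothetical helper: count records per `kind` in JSONL text.
// Assumption: one JSON object per line, each carrying a `kind` field.
function tallyKinds(jsonlText) {
  const counts = new Map();
  for (const line of jsonlText.split("\n")) {
    const trimmed = line.trim();
    if (trimmed === "") continue; // skip blank lines, e.g. a trailing newline
    const kind = JSON.parse(trimmed).kind ?? "unknown";
    counts.set(kind, (counts.get(kind) ?? 0) + 1);
  }
  return counts;
}

// Tiny inline sample instead of the real file.
const sample = '{"kind":"gotcha"}\n{"kind":"debug"}\n{"kind":"gotcha"}\n';
for (const [kind, n] of tallyKinds(sample)) {
  console.log(`${n} ${kind}`);
}
```

For the real file, the text would come from something like
`fs.readFileSync(path, "utf8")`.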

Effect on the en-language corpus:
  before: 1055 records per format (instructions/chat/completion)
  after:  1264 records per format  (+209 from augmentation × 1 fan-out)
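
The before/after figures can be checked with a one-line calculation. The
function name and the reading of "fan-out" as a per-record multiplier are
assumptions made for illustration.

```javascript
// Hypothetical check: records after = records before + added × fan-out.
function expectedTotal(before, added, fanOut) {
  return before + added * fanOut;
}

console.log(expectedTotal(1055, 209, 1)); // 1264
```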

The augmentation jsonl is removed from .gitignore. The summary text file
stays gitignored (regenerated on every agent run). The corpus output at
docs/public/corpus/ remains gitignored; it is built fresh in CI/Docker.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
stephan
2026-05-08 17:26:09 +02:00
parent f4ccc45a3b
commit a89ec89a1b

.gitignore (2 lines changed)

@@ -54,5 +54,3 @@ docs/public/corpus/
# Research-agent augmentation output — the agent's enriched Q/A pairs.
# Generated, not curated by hand.
docs/scripts/extraction/lora-augmentation.jsonl
docs/scripts/extraction/lora-augmentation.summary.txt