| title | description |
|---|---|
| RAG plugin | Ingest text, embed it, retrieve top-K, and answer grounded questions — all in one PHP class. |
The RAG plugin is the AI module's killer feature for product builders. It turns any pile of text — your help docs, your error logs, your Stripe invoices, your customer-support tickets — into a queryable knowledge base in roughly four lines of PHP.
Three minutes, end-to-end
use Nibiru\Module\Ai\Ai;
$ai = new Ai();
$rag = $ai->rag('product-help'); // a named collection
$rag->ingestDir(__DIR__ . '/help/'); // walks .md/.txt/.php under help/
$rag->ingestText('FAQ entry…', ['source' => 'faq-12']);
echo $rag->ask('How do I cancel my subscription?');
// → grounded answer, citing chunks like [1] [2] [3]
That's it. No vector DB. No SDK. No Python sidecar.
How it works
ingestText / ingestFile / ingestDir
↓
chunk → embed (Ollama nomic-embed-text)
↓
pack vectors → JSON file at cache/rag/<collection>.json
↓
ask(question) → embed question → cosine top-K → chat with chunks as context
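The "cosine top-K" step in the pipeline above can be sketched in plain PHP. This is a minimal illustration, not the plugin's actual code: the function names `cosineSimilarity` and `topK`, and the assumption that each stored chunk carries an already-decoded `vector` array, are hypothetical.

```php
<?php
declare(strict_types=1);

/** Cosine similarity between two equal-length float vectors. */
function cosineSimilarity(array $a, array $b): float
{
    if (count($a) !== count($b)) {
        throw new InvalidArgumentException('Vectors must have the same length.');
    }
    $dot = 0.0;
    $normA = 0.0;
    $normB = 0.0;
    foreach ($a as $i => $value) {
        $dot   += $value * $b[$i];
        $normA += $value * $value;
        $normB += $b[$i] * $b[$i];
    }
    if ($normA === 0.0 || $normB === 0.0) {
        return 0.0; // a zero vector has no direction to compare
    }
    return $dot / (sqrt($normA) * sqrt($normB));
}

/**
 * Score every chunk against the query vector and keep the best $k.
 * Assumed chunk shape: ['text' => string, 'vector' => float[], 'metadata' => array].
 */
function topK(array $queryVector, array $chunks, int $k): array
{
    $hits = [];
    foreach ($chunks as $chunk) {
        $hits[] = [
            'score'    => cosineSimilarity($queryVector, $chunk['vector']),
            'text'     => $chunk['text'],
            'metadata' => $chunk['metadata'] ?? [],
        ];
    }
    // Highest score first.
    usort($hits, fn (array $x, array $y): int => $y['score'] <=> $x['score']);
    return array_slice($hits, 0, $k);
}

// Toy 3-dimensional vectors stand in for real embeddings.
$chunks = [
    ['text' => 'Cancel from the billing page.', 'vector' => [1.0, 0.0, 0.0], 'metadata' => ['source' => 'faq-12']],
    ['text' => 'Reset your password by email.', 'vector' => [0.0, 1.0, 0.0], 'metadata' => ['source' => 'faq-3']],
    ['text' => 'Refunds take five days.',       'vector' => [0.8, 0.6, 0.0], 'metadata' => ['source' => 'faq-7']],
];
$hits = topK([1.0, 0.0, 0.0], $chunks, 2);
foreach ($hits as $hit) {
    printf("%.2f  %s\n", $hit['score'], $hit['text']);
}
// prints:
// 1.00  Cancel from the billing page.
// 0.80  Refunds take five days.
```

A linear scan like this is the reason a single JSON file is enough for small collections: every query touches every chunk once, which stays cheap at a few thousand vectors.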
Storage is one JSON file per collection. Each chunk is an object with text + metadata; vectors are base64-packed Float32Array, about 3 KB of raw vector data per chunk (768 float32 values for nomic-embed-text, roughly 4 KB once base64-encoded). About 10k chunks fit comfortably in memory.
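To make the storage arithmetic concrete, here is a hedged sketch of how a float vector could be packed to base64 and restored. The helper names `packVector` and `unpackVector` and the little-endian byte order are assumptions for illustration; the plugin's own on-disk layout may differ.

```php
<?php
declare(strict_types=1);

/** Pack a float vector as little-endian float32 bytes, then base64 so it fits in JSON. */
function packVector(array $vector): string
{
    return base64_encode(pack('g*', ...$vector));
}

/** Reverse of packVector(): base64 to bytes to a 0-indexed list of floats. */
function unpackVector(string $encoded): array
{
    $binary = base64_decode($encoded, true);
    if ($binary === false || strlen($binary) % 4 !== 0) {
        throw new InvalidArgumentException('Not a base64-packed float32 vector.');
    }
    // unpack() returns a 1-indexed array; re-index from 0.
    return array_values(unpack('g*', $binary));
}

// 768 dimensions, the output size of nomic-embed-text.
// 0.25 is exactly representable in float32, so the round trip is lossless here.
$vector  = array_fill(0, 768, 0.25);
$encoded = packVector($vector);

// A chunk record as it might sit in the collection file (field names are assumed).
$record   = ['text' => 'FAQ entry', 'metadata' => ['source' => 'faq-12'], 'vector' => $encoded];
$json     = json_encode($record);
$restored = unpackVector(json_decode($json, true)['vector']);

printf("raw bytes: %d, base64 chars: %d\n", 768 * 4, strlen($encoded));
// prints: raw bytes: 3072, base64 chars: 4096
```

Real embeddings are not exactly representable in 32 bits, so values read back from disk will differ from the 64-bit PHP floats in the last few digits. That loss is negligible for cosine ranking.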
Multiple collections
You can have any number of collections in the same app. Each has its own JSON file. They all share the embedding model and chat model from the [AI] config.
$docs = $ai->rag('docs');
$tickets = $ai->rag('support-tickets');
$logs = $ai->rag('error-logs');
$docs->ingestDir(__DIR__ . '/help/');
$tickets->ingestText($ticket->body, ['ticket_id' => $ticket->id]);
$logs->ingestText($exception->__toString(), ['ts' => time()]);
API reference
$rag = $ai->rag('name'); // get/create a named collection
// --- Ingestion ---
$rag->ingestText($text, $metadata = []); // single chunk
$count = $rag->ingestFile('path'); // returns chunks added
$count = $rag->ingestDir('dir', ['md','txt','php']); // recursive
// --- Querying ---
$hits = $rag->search('query', $k = null); // [{score, text, metadata}, …]
$answer = $rag->ask('question', $k = null); // top-K → chat call
// --- Maintenance ---
$rag->reset(); // forget everything (deletes file)
$n = $rag->size(); // number of chunks
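One plausible way for ask() to turn search() hits into a grounded chat prompt is to number the chunks and ask the model to cite them. The sketch below is an assumption about the general technique, not the plugin's real prompt: `buildGroundedPrompt` and the prompt wording are hypothetical. Only the hit shape ({score, text, metadata}) comes from the reference above.

```php
<?php
declare(strict_types=1);

/**
 * Build a chat prompt whose context is a numbered list of retrieved chunks.
 * $hits uses the [{score, text, metadata}, ...] shape returned by search().
 */
function buildGroundedPrompt(string $question, array $hits): string
{
    $context = [];
    foreach (array_values($hits) as $i => $hit) {
        $source    = $hit['metadata']['source'] ?? 'unknown';
        $context[] = sprintf('[%d] (source: %s) %s', $i + 1, $source, trim($hit['text']));
    }

    return "Answer using only the numbered context below. Cite chunks as [n].\n\n"
        . "Context:\n" . implode("\n", $context) . "\n\n"
        . 'Question: ' . $question;
}

// Hand-written hits stand in for a real $rag->search() result.
$hits = [
    ['score' => 0.91, 'text' => 'Open Billing and click Cancel plan.',      'metadata' => ['source' => 'faq-12']],
    ['score' => 0.84, 'text' => 'Cancellation takes effect at period end.', 'metadata' => []],
];
$prompt = buildGroundedPrompt('How do I cancel my subscription?', $hits);
echo $prompt, "\n";
```

Keeping the numbering stable between the prompt and the hit list is what lets a caller map a citation such as [2] in the answer back to its source metadata.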
Tuning knobs
In application/module/ai/settings/ai.ini:
[AI]
embed.model = "nomic-embed-text" ; or mxbai-embed-large for higher quality
rag.top_k = 6 ; chunks injected into the chat call
rag.chunk_target = 600 ; tokens per chunk (target)
rag.chunk_min = 120 ; smaller chunks merged
rag.chunk_max = 900 ; larger paragraphs split on sentences
rag.storage_path = "/../../application/module/ai/cache/rag/"
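The interplay of rag.chunk_min and rag.chunk_max can be illustrated with a small paragraph-based chunker. This is a sketch under stated assumptions: tokens are approximated by whitespace-separated words, rag.chunk_target is ignored, and the function names are hypothetical. The plugin's real chunker may behave differently.

```php
<?php
declare(strict_types=1);

/** Crude token estimate: one token per whitespace-separated word (an assumption). */
function estimateTokens(string $text): int
{
    return count(preg_split('/\s+/', trim($text), -1, PREG_SPLIT_NO_EMPTY));
}

/**
 * Split an oversized paragraph at sentence ends so each piece stays within $max.
 * A single sentence longer than $max is kept whole.
 */
function splitOnSentences(string $paragraph, int $max): array
{
    $sentences = preg_split('/(?<=[.!?])\s+/', trim($paragraph), -1, PREG_SPLIT_NO_EMPTY);
    $pieces  = [];
    $current = '';
    foreach ($sentences as $sentence) {
        $candidate = $current === '' ? $sentence : $current . ' ' . $sentence;
        if ($current !== '' && estimateTokens($candidate) > $max) {
            $pieces[] = $current;   // adding this sentence would overflow: close the piece
            $current  = $sentence;
        } else {
            $current = $candidate;
        }
    }
    if ($current !== '') {
        $pieces[] = $current;
    }
    return $pieces;
}

/** Split paragraphs above $max, then merge a too-small chunk with its next neighbour. */
function chunkText(string $text, int $min, int $max): array
{
    $units = [];
    foreach (preg_split('/\n\s*\n/', trim($text), -1, PREG_SPLIT_NO_EMPTY) as $paragraph) {
        $paragraph = trim($paragraph);
        if (estimateTokens($paragraph) > $max) {
            array_push($units, ...splitOnSentences($paragraph, $max));
        } else {
            $units[] = $paragraph;
        }
    }

    $chunks = [];
    foreach ($units as $unit) {
        $last = count($chunks) - 1;
        if ($last >= 0
            && estimateTokens($chunks[$last]) < $min
            && estimateTokens($chunks[$last] . ' ' . $unit) <= $max) {
            $chunks[$last] .= "\n\n" . $unit; // previous chunk is below $min: merge
        } else {
            $chunks[] = $unit;
        }
    }
    return $chunks; // note: the final chunk can still end up below $min
}

// Tiny limits (min 5, max 12) keep the example readable; the defaults are 120 and 900.
$text   = "One two three.\n\nFour five six seven.\n\nA b c d e f g h. I j k l m n o p. Q r s t.";
$chunks = chunkText($text, 5, 12);
foreach ($chunks as $chunk) {
    printf("%2d tokens: %s\n", estimateTokens($chunk), str_replace("\n\n", ' / ', $chunk));
}
```

With these limits the two short paragraphs are merged into one 7-token chunk, and the 20-token paragraph is split at a sentence boundary into pieces of 8 and 12 tokens.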
When to use it
- Help / FAQ chat: ingest your help articles, expose a /ask endpoint.
- In-app code search: ingest application/module/, ask "where do we calculate VAT?"
- Internal docs assistant: ingest your team's wiki dump.
- Customer-history lookups: ingest tickets, ask "have we seen this error before?"
When NOT to use it
- Real-time, write-heavy data — RAG is a snapshot. For live data, write a Tool the agent can call.
- Massive corpora (> 100k chunks) — JSON-file storage starts to creak. Move to Qdrant / pgvector / Weaviate; we'll publish an adapter once we need one ourselves.
- Anything where you need exact answers, not probable ones. RAG is probabilistic. Don't use it as a database query layer.
Common pitfalls
- nomic-embed-text not pulled. The first ingestText call will fail with a clear error pointing you at the pull command.
- Embedding model mismatch. Don't mix nomic-embed-text chunks with mxbai-embed-large queries: they live in different vector spaces. If you change embed.model, run $rag->reset() first.
- Stale collections. Re-running ingestDir doesn't dedupe. Use reset() then re-ingest, or maintain a content-hash check yourself.
- Tiny chunks. Below ~80 tokens, embeddings get noisy. The default rag.chunk_min = 120 merges small adjacent chunks.
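For the stale-collections pitfall, a content-hash check can be as small as a JSON manifest of SHA-256 digests kept next to your ingestion script. The `IngestManifest` class and its file location below are hypothetical and not part of the RAG plugin. The sketch only shows the idea of skipping text that has already been ingested.

```php
<?php
declare(strict_types=1);

/**
 * Remembers which texts were already ingested, keyed by SHA-256 of the content.
 * Hypothetical helper: it lives outside the plugin and wraps your own ingest calls.
 */
final class IngestManifest
{
    /** @var array<string, bool> hash => true */
    private array $seen = [];

    public function __construct(private string $path)
    {
        if (is_file($path)) {
            $decoded    = json_decode((string) file_get_contents($path), true);
            $this->seen = is_array($decoded) ? $decoded : [];
        }
    }

    /** Returns true the first time a given text is offered, false on repeats. */
    public function markIfNew(string $text): bool
    {
        $hash = hash('sha256', $text);
        if (isset($this->seen[$hash])) {
            return false;
        }
        $this->seen[$hash] = true;
        return true;
    }

    /** Persist the manifest so the next run sees the same hashes. */
    public function save(): void
    {
        file_put_contents($this->path, json_encode($this->seen), LOCK_EX);
    }
}

// A temp file keeps the example self-contained; a real script would use a fixed path.
$path     = sys_get_temp_dir() . '/rag-manifest-' . uniqid() . '.json';
$manifest = new IngestManifest($path);

$texts    = ['How to cancel', 'How to upgrade', 'How to cancel'];
$ingested = 0;
foreach ($texts as $text) {
    if ($manifest->markIfNew($text)) {
        // $rag->ingestText($text); // the real ingest call would go here
        $ingested++;
    }
}
$manifest->save();

printf("ingested %d of %d texts\n", $ingested, count($texts));
// prints: ingested 2 of 3 texts
```

If you call $rag->reset(), delete the manifest as well, otherwise the next run will skip everything the collection no longer contains.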
What's next
- Agent plugin → for tools, not retrieval.
- Training nibiru-coder → to make the chat half answer in the framework's voice.