---
title: RAG plugin
description: Ingest text, embed it, retrieve top-K, and answer grounded questions, all in one PHP class.
---

The RAG plugin is the AI module's killer feature for product builders. It turns any pile of text (your help docs, your error logs, your Stripe invoices, your customer-support tickets) into a queryable knowledge base in roughly four lines of PHP.

## Three minutes, end-to-end

```php
use Nibiru\Module\Ai\Ai;

$ai  = new Ai();
$rag = $ai->rag('product-help');          // a named collection

$rag->ingestDir(__DIR__ . '/help/');      // walks .md/.txt/.php under help/
$rag->ingestText('FAQ entry…', ['source' => 'faq-12']);

echo $rag->ask('How do I cancel my subscription?');
// → grounded answer, citing chunks like [1] [2] [3]
```

That's it. No vector DB. No SDK. No Python sidecar.

## How it works

```
ingestText / ingestFile / ingestDir
        ↓
chunk → embed (Ollama nomic-embed-text)
        ↓
pack vectors → JSON file at cache/rag/.json
        ↓
ask(question) → embed question → cosine top-K → chat with chunks as context
```

Storage is one JSON file per collection. Each chunk is an object with `text` + `metadata`; vectors are stored as base64-packed Float32Array data, about 3 KB per chunk. ~10k chunks fit comfortably in memory.

## Multiple collections

You can have any number of collections in the same app. Each has its own JSON file. They all share the embedding model and chat model set in the `[AI]` config.

```php
$docs    = $ai->rag('docs');
$tickets = $ai->rag('support-tickets');
$logs    = $ai->rag('error-logs');

$docs->ingestDir(__DIR__ . '/help/');
$tickets->ingestText($ticket->body, ['ticket_id' => $ticket->id]);
$logs->ingestText($exception->__toString(), ['ts' => time()]);
```

## API reference

```php
$rag = $ai->rag('name');                              // get/create a named collection

// --- Ingestion ---
$rag->ingestText($text, $metadata = []);              // single chunk
$count = $rag->ingestFile('path');                    // returns chunks added
$count = $rag->ingestDir('dir', ['md','txt','php']);  // recursive

// --- Querying ---
$hits   = $rag->search('query', $k = null);           // [{score, text, metadata}, …]
$answer = $rag->ask('question', $k = null);           // top-K → chat call

// --- Maintenance ---
$rag->reset();                                        // forget everything (deletes file)
$n = $rag->size();                                    // number of chunks
```

## Tuning knobs

In `application/module/ai/settings/ai.ini`:

```ini
[AI]
embed.model      = "nomic-embed-text"   ; or mxbai-embed-large for higher quality
rag.top_k        = 6                    ; chunks injected into the chat call
rag.chunk_target = 600                  ; tokens per chunk (target)
rag.chunk_min    = 120                  ; smaller chunks merged
rag.chunk_max    = 900                  ; larger paragraphs split on sentences
rag.storage_path = "/../../application/module/ai/cache/rag/"
```

## When to use it

- **Help / FAQ chat**: ingest your help articles, expose a `/ask` endpoint.
- **In-app code search**: ingest `application/module/`, ask "where do we calculate VAT?"
- **Internal docs assistant**: ingest your team's wiki dump.
- **Customer-history lookups**: ingest tickets, ask "have we seen this error before?"

## When NOT to use it

- **Real-time, write-heavy data**: RAG is a snapshot. For live data, write a [Tool](/en/ai/module/agent/) the agent can call.
- **Massive corpora (> 100k chunks)**: JSON-file storage starts to creak. Move to Qdrant / pgvector / Weaviate; we'll publish an adapter once we need one ourselves.
- **Anything where you need *exact* answers, not *probable* ones.** RAG is probabilistic. Don't use it as a database query layer.

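
The corpus-size limit follows from the retrieval step described under "How it works": every query is scored against every stored chunk, so cost grows linearly with collection size. The sketch below illustrates that step under stated assumptions. It is not the plugin's actual code: `packVector`, `unpackVector`, `cosine`, `topK`, and the `vector` key are hypothetical names, and the little-endian float32 layout is a guess at what "base64-packed" means here. The hit shape mirrors the `{score, text, metadata}` entries that `search()` is documented to return.

```php
<?php
// Hypothetical helpers for illustration only; none of these are plugin API.

/** Pack a list of floats as base64-encoded little-endian float32 (assumed layout). */
function packVector(array $vector): string
{
    return base64_encode(pack('g*', ...$vector));
}

/** Reverse of packVector(): base64 string back to a 0-indexed list of floats. */
function unpackVector(string $packed): array
{
    return array_values(unpack('g*', base64_decode($packed)));
}

/** Cosine similarity of two equal-length vectors; 0.0 if either has zero norm. */
function cosine(array $a, array $b): float
{
    $dot = $normA = $normB = 0.0;
    foreach ($a as $i => $x) {
        $dot   += $x * $b[$i];
        $normA += $x * $x;
        $normB += $b[$i] * $b[$i];
    }
    if ($normA == 0.0 || $normB == 0.0) {
        return 0.0;
    }
    return $dot / (sqrt($normA) * sqrt($normB));
}

/** Brute force: score every chunk against the query, keep the $k best. */
function topK(array $queryVector, array $chunks, int $k): array
{
    $hits = [];
    foreach ($chunks as $chunk) {
        $hits[] = [
            'score'    => cosine($queryVector, unpackVector($chunk['vector'])),
            'text'     => $chunk['text'],
            'metadata' => $chunk['metadata'],
        ];
    }
    usort($hits, fn ($x, $y) => $y['score'] <=> $x['score']); // highest score first
    return array_slice($hits, 0, $k);
}

// Toy 3-dimensional vectors and made-up text; real embeddings are far wider.
$chunks = [
    ['text' => 'Cancel from Settings.', 'metadata' => ['source' => 'faq-12'],
     'vector' => packVector([1.0, 0.0, 0.0])],
    ['text' => 'Invoices are emailed.', 'metadata' => ['source' => 'faq-3'],
     'vector' => packVector([0.0, 1.0, 0.0])],
];

$hits = topK([0.9, 0.1, 0.0], $chunks, 1);
echo $hits[0]['text'], PHP_EOL; // the chunk closest in direction to the query
```

With 768-dimensional float32 embeddings (the output size of `nomic-embed-text`), one vector is 3,072 bytes before base64 encoding, which lines up with the roughly 3 KB per chunk quoted earlier. A full scan at that width is cheap for thousands of chunks, and becomes the bottleneck somewhere past the 100k mark named above.
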
## Common pitfalls

- **`nomic-embed-text` not pulled.** The first `ingestText` call will fail with a clear error pointing you at the pull command.
- **Embedding model mismatch.** Don't mix `nomic-embed-text` chunks with `mxbai-embed-large` queries: the two models produce different vector spaces. If you change `embed.model`, run `$rag->reset()` first, then re-ingest.
- **Stale collections.** Re-running `ingestDir` doesn't dedupe. Use `reset()` then re-ingest, or maintain a content-hash check yourself.
- **Tiny chunks.** Below ~80 tokens, embeddings get noisy. The default `rag.chunk_min = 120` merges small adjacent chunks.

## What's next

- [Agent plugin →](/en/ai/module/agent/) for tools, not retrieval.
- [Training nibiru-coder →](/en/ai/module/training/) to make the chat half answer in the framework's voice.
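
A footnote to the stale-collections pitfall: one possible shape for the "content-hash check yourself" option is a small sidecar file of SHA-256 hashes next to your ingestion script. This is a minimal sketch, assuming you wrap each `ingestText()` call in a closure. `ingestOnce` and the sidecar file are hypothetical and not part of the plugin.

```php
<?php
// Hypothetical helper, not plugin API: skips text whose hash was already ingested.

/**
 * Ingest $text only if its SHA-256 hash is not yet recorded in $hashFile.
 *
 * @param callable $ingest   called with the text when it is new, e.g.
 *                           fn (string $t) => $rag->ingestText($t, ['source' => 'faq-12'])
 * @param string   $text     content to ingest
 * @param string   $hashFile JSON sidecar file holding the hashes seen so far
 * @return bool true if the text was ingested, false if it was a duplicate
 */
function ingestOnce(callable $ingest, string $text, string $hashFile): bool
{
    $seen = is_file($hashFile)
        ? (json_decode((string) file_get_contents($hashFile), true) ?: [])
        : [];

    $hash = hash('sha256', $text);
    if (isset($seen[$hash])) {
        return false;            // identical content was ingested before
    }

    $ingest($text);              // record the hash only after ingestion returns
    $seen[$hash] = true;
    file_put_contents($hashFile, json_encode($seen), LOCK_EX);
    return true;
}

// Demo with a stub standing in for $rag->ingestText().
$hashFile = sys_get_temp_dir() . '/rag-seen-' . uniqid() . '.json';
$ingested = [];
$stub = function (string $text) use (&$ingested): void {
    $ingested[] = $text;
};

$first  = ingestOnce($stub, 'How to cancel a subscription', $hashFile);
$second = ingestOnce($stub, 'How to cancel a subscription', $hashFile);
var_dump($first, $second, count($ingested)); // bool(true), bool(false), int(1)

unlink($hashFile);               // clean up the demo sidecar
```

If you go this route, delete the sidecar whenever you call `reset()`. Otherwise the hashes outlive the collection and the next ingest run skips everything.
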