Files
neuronetz-gateway/docs/PLAYGROUND.md
Stephan Berbig b47a09db91 demo + playground + docs
One-command demo so the gateway can be exercised end-to-end without a GPU or a
real model download:

- demo/mock-ollama/ — tiny FastAPI service emulating Ollama (/api/tags,
  /api/chat + /api/generate NDJSON streaming with realistic prompt_eval_count
  and eval_count on the final frame, /api/embed, /api/show, /api/version).
  Non-root multi-stage Dockerfile, never published (internal network only).
- docker-compose.demo.yml — postgres + redis + mock-ollama + gateway, with
  PLAYGROUND_ENABLED=true and ./playground mounted read-only at /app/playground.
  Mirrors the prod posture (mock-ollama not exposed).
- demo.sh — brings the stack up, waits on /healthz, creates a demo tenant with
  allow_all_models and a fresh API key via the bootstrap CLI inside the
  container, then prints the key, the playground URL, and five ready-to-paste
  curl commands (SSE chat, NDJSON chat, /v1/models, a 401, a 403 /api/pull).
  ./demo.sh --down tears everything back down with volumes.
- playground/index.html — single-file dark-themed UI served same-origin by
  the gateway at /playground (CORS-free). Per-endpoint About card with method/
  auth/streaming badges, a real description, sample request body, sample
  response, and a footer note. Live SSE/NDJSON rendering of the response.
  A live, copyable curl box that mirrors exactly what Run sends. Run + Refresh
  are visibly gated until an API key is in the field; the Base URL is
  force-pinned to location.origin three times to defeat browser autofill.
- docs/ — API.md (full endpoint reference with curl, streaming formats, error
  model, SPEC §6.5 response headers), ARCHITECTURE.md (incl. §4.6 discovery
  + the request lifecycle), DEPLOYMENT.md (Ollama-never-exposed rule,
  pointing at a real Ollama backend, env reference), THREAT_MODEL.md
  (SPEC §3 table + the allow_all_models opt-in notes), OPERATIONS.md
  (key/budget/model/usage runbook + fail-closed table), PLAYGROUND.md.
  mkdocs.yml (Material theme) wires them together.
2026-05-26 20:52:33 +02:00

114 lines
4.4 KiB
Markdown

# neuronetz-gateway — Demo & Playground
The fastest way to see the gateway working end-to-end, with **no GPU and no model downloads**.
`./demo.sh` brings up the gateway against a mock Ollama backend, mints a demo API key, and
prints ready-to-paste curl commands and a link to an interactive browser playground.
---
## Launch the demo
From the repo root:
```bash
./demo.sh
```
This will:
1. Build and start the demo stack (`docker-compose.demo.yml`): **postgres + redis +
mock-ollama + gateway**. No Caddy; the gateway is published on `127.0.0.1:8080`.
2. Wait for the gateway to report healthy at `/healthz`.
3. Create a demo tenant (`--allow-all-models`) and an API key via the bootstrap CLI **inside
the gateway container**, capturing the key (which is printed exactly once).
4. Print a summary: the **API key**, the **playground URL**
`http://localhost:8080/playground`, and five ready-to-paste curl commands —
- streaming `/v1/chat/completions` (OpenAI SSE),
- streaming `/api/chat` (native NDJSON),
- `GET /v1/models`,
- a **401** example (no/bad key),
- a **403** example (`POST /api/pull`, hard-blocked).
The script is **re-runnable**: an existing tenant is reused, and each run mints a fresh,
uniquely-named key (the full key only ever prints at creation).
Tear everything down (containers + volumes):
```bash
./demo.sh --down
```
### What's running
| Service | Exposed? | Notes |
|---|---|---|
| `gateway` | `127.0.0.1:8080` | The real gateway image, built from the repo `Dockerfile`. |
| `mock-ollama` | **no** | Internal network only — mirrors the prod "Ollama is never exposed" rule. |
| `postgres` | **no** | Internal only. |
| `redis` | **no** | Internal only. |
The mock backend (`demo/mock-ollama/`) emulates Ollama's API shapes — including realistic
`prompt_eval_count` / `eval_count` on the final stream object — so token counting, model
discovery, and `/api/show` sanitization all exercise real gateway code paths. It serves a
small catalogue: `llama3.1:8b`, `mistral:7b`, `qwen2.5:3b`, `nomic-embed-text`.
---
## Use the playground
Open **http://localhost:8080/playground** in a browser. It is a single self-contained HTML
page, served **same-origin** by the gateway (so no CORS to worry about).
1. **Base URL** is pre-filled with the current origin; leave it as is for the demo.
2. Paste the **API key** from the `./demo.sh` output into the Bearer field. (Typing a key
auto-loads the model dropdown; you can also hit **↻ Refresh**.)
3. Pick an **endpoint** tab: `/v1/chat/completions`, `/api/chat`, `/api/generate`,
`/v1/models`, `/api/tags`, `/healthz`, `/readyz`.
4. Choose a **model** from the auto-populated dropdown, type a prompt, toggle **stream**.
5. Hit **▶ Run**. The streamed output renders **live** — SSE `data:` deltas (incl. `[DONE]`)
for `/v1/*`, NDJSON lines for `/api/*`.
6. The panel shows the **response status** and the rate-limit / budget **response headers**
(`X-Request-ID`, `X-RateLimit-*`, `X-Budget-*`; SPEC §6.5).
7. The **Exact curl** box mirrors precisely what **Run** sends — copy it to reproduce in a
terminal.
Try the 403 path too: there's no mutating-endpoint tab by design, but the printed `curl` for
`POST /api/pull` shows the hard block, and an invalid key in the Bearer field demonstrates the
401 fail-closed response.
---
## ⚠️ Security note: the playground is OFF by default in production
The playground route is **flag-gated** and **disabled by default**. The demo stack turns it on
explicitly:
```yaml
# docker-compose.demo.yml (gateway service)
GATEWAY_PLAYGROUND_ENABLED: "true"
GATEWAY_PLAYGROUND_FILE: /app/playground/index.html
```
with the file mounted read-only into the container:
```yaml
volumes:
- ./playground:/app/playground:ro
```
The production stack (`docker-compose.yml`) does **not** set `GATEWAY_PLAYGROUND_ENABLED`, so
the route is absent. Do not enable it on a public deployment: it is a convenience for demos and
local development, not a production surface. Leaving it off keeps the public attack surface to
the documented API only.
---
## Files behind the demo
| Path | What it is |
|---|---|
| `demo.sh` | The one-command entrypoint (up / `--down`). |
| `docker-compose.demo.yml` | The demo stack definition. |
| `demo/mock-ollama/` | The standalone mock Ollama service (FastAPI app + Dockerfile). |
| `playground/index.html` | The self-contained browser playground served at `/playground`. |