scaffold: project skeleton, schema, healthz/readyz, CI

Initial project structure for neuronetz-gateway per scope-docs/SPEC.md:

- Python 3.12 / FastAPI / SQLAlchemy 2.0 (async) / Redis / Postgres stack
  managed by uv. Multi-stage non-root Dockerfile, prod + dev compose files
  (ollama service is NEVER published in either), Caddyfile + systemd unit,
  justfile, GitHub Actions CI (ruff, mypy --strict, pytest, bandit, pip-audit).
- Pydantic-Settings config covering every env var from SPEC §7, including the
  MODEL_DISCOVERY_* keys for the dynamic-discovery feature (§4.6).
- Alembic 0001_initial creates the full gateway schema (8 tables, 3 enums,
  notify_key_revoked() trigger), incl. allow_all_models on tenant_limits and
  key_limits for the per-tenant auto-grant toggle.
- Working /healthz, /readyz (fail-closed when deps unreachable), and a
  Prometheus /metrics stub. Sanitizing error handlers that attach X-Request-ID
  to every response and never leak upstream internals.
- SPEC + AGENT_PROMPT included under scope-docs/ (source of truth).
Stephan Berbig
2026-05-26 20:50:35 +02:00
commit d79f17b3bb
32 changed files with 3610 additions and 0 deletions

README.md
@@ -0,0 +1,92 @@
# neuronetz-gateway
A secure, multi-tenant API gateway in front of an [Ollama](https://github.com/ollama/ollama)
instance. It is the hot path of the Neuronetz API: every request to the models flows
through here, authenticated, rate-limited, budgeted, and audited.
**The Ollama backend is never reachable from the public internet.** It is bound to an
internal Docker network with no published ports. All access is via this gateway, behind
TLS terminated by Caddy.
> Status: **v0.1.0 — in development.** See [`scope-docs/SPEC.md`](scope-docs/SPEC.md) for
> the full specification and [`scope-docs/AGENT_PROMPT.md`](scope-docs/AGENT_PROMPT.md) for
> the phased build plan. `SPEC.md` is the source of truth.
## What it does
- **Auth** — API keys as Bearer tokens, stored as Argon2id hashes, verified in constant time.
- **Multi-tenant** — tenants own keys; limits and budgets inherit tenant → key.
- **Rate limiting** — per-key and per-tenant RPM / TPM / concurrent connections.
- **Budgets** — daily / monthly / total token budgets, enforced fail-closed.
- **Dual API surface** — native Ollama (`/api/*`) and OpenAI-compatible (`/v1/*`), both streaming.
- **Hard-blocked mutations** — `/api/pull`, `/api/push`, `/api/create`, `/api/copy`,
  `/api/delete`, `/api/blobs/*` always return 403. Not configurable.
- **Audit log** — always-on request metadata; opt-in, TTL'd prompt logging per key.
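As an illustration of the hard-blocked mutation rule in the list above, here is a minimal sketch of the path check. The path list comes from this README; the function names, the trailing-slash normalisation, and the use of 200 as a stand-in for "continue to the next check" are assumptions, not the gateway's actual code.

```python
from http import HTTPStatus

# Exact paths and the one wildcard prefix, copied from the feature list.
BLOCKED_EXACT = frozenset(
    {"/api/pull", "/api/push", "/api/create", "/api/copy", "/api/delete"}
)
BLOCKED_PREFIXES = ("/api/blobs/",)


def is_blocked_mutation(path: str) -> bool:
    """Return True if the request path is a hard-blocked Ollama mutation."""
    # Assumption: a trailing slash is treated the same as the bare path.
    normalised = path.rstrip("/") or "/"
    if normalised in BLOCKED_EXACT:
        return True
    return any(path.startswith(prefix) for prefix in BLOCKED_PREFIXES)


def status_for(path: str) -> int:
    """403 for blocked mutations; 200 is a placeholder for 'not blocked here'."""
    if is_blocked_mutation(path):
        return int(HTTPStatus.FORBIDDEN)
    return int(HTTPStatus.OK)
```

Keeping the list as module-level constants rather than settings mirrors the "not configurable" wording: there is no environment variable that could loosen it.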
Administration (dashboards, tenant self-service) lives in a separate service,
`neuronetz-console`; it is **not** part of this repository.
## Architecture
```
Internet ──TLS──> Caddy ──HTTP──> gateway ──┬──> Postgres (keys, budgets, audit)
                                            ├──> Redis (key cache, rate limits)
                                            └──> Ollama (internal network only)
```
## Quickstart (dev)
Requires Docker + Docker Compose. The dev stack runs Postgres, Redis, and the gateway —
**no Caddy and no Ollama** (so `/readyz` reports 503 until a real Ollama backend is wired
in; that is expected).
```bash
git clone <repo> neuronetz-gateway && cd neuronetz-gateway
cp .env.example .env # adjust if you like; defaults work for local dev
docker compose -f docker-compose.dev.yml up --build
```
The gateway runs `alembic upgrade head` on startup, then serves on `http://localhost:8080`.
```bash
curl -i http://localhost:8080/healthz # -> 200 {"status":"ok"}
curl -i http://localhost:8080/readyz # -> 503 (no Ollama backend in the dev stack)
```
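The 503 from `/readyz` follows from its fail-closed behaviour: if any dependency cannot be confirmed healthy, the endpoint reports not ready. The sketch below shows one way such an aggregation could look. The `readiness` function, the per-check timeout, and the shape of the result mapping are assumptions made for illustration; the real handler and response body are defined by the gateway code and `SPEC.md`.

```python
import asyncio
from collections.abc import Awaitable, Callable

# A check is any zero-argument coroutine function returning True when healthy.
Check = Callable[[], Awaitable[bool]]


async def readiness(
    checks: dict[str, Check], timeout_s: float = 1.0
) -> tuple[int, dict[str, str]]:
    """Run every dependency check; a failure, exception, or timeout yields 503."""
    results: dict[str, str] = {}
    for name, check in checks.items():
        try:
            ok = await asyncio.wait_for(check(), timeout=timeout_s)
        except Exception:
            # Fail closed: anything unexpected counts as "not ready".
            ok = False
        results[name] = "ok" if ok else "unreachable"
    status = 200 if all(v == "ok" for v in results.values()) else 503
    return status, results


# Hypothetical checks standing in for real Postgres/Redis/Ollama probes.
async def reachable() -> bool:
    return True


async def refused() -> bool:
    raise ConnectionError("connection refused")


if __name__ == "__main__":
    dev_stack = {"postgres": reachable, "redis": reachable, "ollama": refused}
    print(asyncio.run(readiness(dev_stack)))
    # (503, {'postgres': 'ok', 'redis': 'ok', 'ollama': 'unreachable'})
```

With the dev stack, the Ollama probe is the one that fails, which matches the 503 shown in the `curl` output above.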
## Production
`docker-compose.yml` brings up the full stack — Caddy (TLS via Let's Encrypt for
`api.neuronetz.ai`), the gateway, Postgres, Redis, and Ollama. The `ollama` service has
**no `ports:` mapping** and is reachable only on the internal Docker network. See
[`docs/DEPLOYMENT.md`](docs/DEPLOYMENT.md) (added in a later phase) and
[`ops/caddy/Caddyfile.example`](ops/caddy/Caddyfile.example).
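The "no `ports:` mapping" rule is easy to guard with a small check in CI. The following is a sketch only: it assumes the Compose file has already been parsed into a plain mapping (for example with a YAML loader), and both the `published_ports` helper and the example mapping are hypothetical, not taken from the repository's `docker-compose.yml`.

```python
def published_ports(compose: dict, service: str) -> list:
    """Return the `ports:` entries declared for a service in a parsed Compose file."""
    services = compose.get("services", {})
    if service not in services:
        raise KeyError(f"service {service!r} not found")
    # A missing or empty `ports:` key means nothing is published to the host.
    return list(services[service].get("ports") or [])


# Hypothetical, heavily trimmed Compose mapping used only for this example.
example_compose = {
    "services": {
        "caddy": {"image": "caddy:2", "ports": ["80:80", "443:443"]},
        "gateway": {"build": "."},
        "ollama": {"image": "ollama/ollama", "networks": ["internal"]},
    }
}

assert published_ports(example_compose, "ollama") == []
```

A test like the final assertion would fail the build as soon as someone adds a `ports:` entry to the `ollama` service.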
## Managing tenants and keys
Use the bootstrap CLI (Typer). Keys have the form `nz_<prefix><secret>`; the full key is
printed exactly once at creation and only its Argon2id hash is stored.
```bash
neuronetz-gateway create-tenant --name acme
neuronetz-gateway create-key --tenant acme --name prod-server-1
neuronetz-gateway list-keys --tenant acme
neuronetz-gateway revoke-key --prefix nz_abc12345
```
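To make the `nz_<prefix><secret>` format concrete, here is a stdlib-only sketch of generating a key and splitting a presented key back into its lookup prefix and secret. The prefix length of 8 is inferred from the `nz_abc12345` example above; the secret length, the alphabet, and the function names are assumptions. The Argon2id hashing step is deliberately left out so the example has no third-party dependencies.

```python
import secrets
import string

ALPHABET = string.ascii_lowercase + string.digits
PREFIX_LEN = 8   # assumption, inferred from the example `nz_abc12345`
SECRET_LEN = 32  # assumption; the real length is defined by the gateway


def generate_key() -> tuple[str, str]:
    """Return (full_key, lookup_prefix). The full key is meant to be shown once."""
    prefix = "".join(secrets.choice(ALPHABET) for _ in range(PREFIX_LEN))
    secret = "".join(secrets.choice(ALPHABET) for _ in range(SECRET_LEN))
    return f"nz_{prefix}{secret}", f"nz_{prefix}"


def split_key(full_key: str) -> tuple[str, str]:
    """Split a presented key into (lookup_prefix, secret); reject malformed input."""
    expected_len = len("nz_") + PREFIX_LEN + SECRET_LEN
    if not full_key.startswith("nz_") or len(full_key) != expected_len:
        raise ValueError("malformed key")
    cut = len("nz_") + PREFIX_LEN
    return full_key[:cut], full_key[cut:]
```

The non-secret prefix is what makes commands such as `revoke-key --prefix` possible: a key can be located by prefix, while only a hash of the full key is stored.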
## Development
```bash
just dev # run the dev stack
just test # pytest + coverage
just lint # ruff
just typecheck # mypy --strict
just migrate # alembic upgrade head
```
Tooling: Python 3.12, `uv`, FastAPI + uvicorn, SQLAlchemy 2.0 (async) + asyncpg, Redis,
httpx, structlog, Pydantic. Lint/type/security gates: ruff, mypy `--strict`, bandit,
pip-audit.
## License
Apache 2.0 — see [`LICENSE`](LICENSE). Owner: Stephan Berbig / Neuronetz.