Initial project structure for neuronetz-gateway per scope-docs/SPEC.md:

- Python 3.12 / FastAPI / SQLAlchemy 2.0 (async) / Redis / Postgres stack managed by uv. Multi-stage non-root Dockerfile, prod + dev compose files (ollama service is NEVER published in either), Caddyfile + systemd unit, justfile, GitHub Actions CI (ruff, mypy --strict, pytest, bandit, pip-audit).
- Pydantic-Settings config covering every env var from SPEC §7, including the MODEL_DISCOVERY_* keys for the dynamic-discovery feature (§4.6).
- Alembic 0001_initial creates the full gateway schema (8 tables, 3 enums, notify_key_revoked() trigger), incl. allow_all_models on tenant_limits and key_limits for the per-tenant auto-grant toggle.
- Working /healthz, /readyz (fail-closed when deps unreachable), and a Prometheus /metrics stub. Sanitizing error handlers that attach X-Request-ID to every response and never leak upstream internals.
- SPEC + AGENT_PROMPT included under scope-docs/ (source of truth).

# neuronetz-gateway

A secure, multi-tenant API gateway in front of an [Ollama](https://github.com/ollama/ollama)
instance. It is the hot path of the Neuronetz API: every request to the models flows
through here, authenticated, rate-limited, budgeted, and audited.

**The Ollama backend is never reachable from the public internet.** It is bound to an
internal Docker network with no published ports. All access is via this gateway, behind
TLS terminated by Caddy.

> Status: **v0.1.0, in development.** See [`scope-docs/SPEC.md`](scope-docs/SPEC.md) for
> the full specification and [`scope-docs/AGENT_PROMPT.md`](scope-docs/AGENT_PROMPT.md) for
> the phased build plan. `SPEC.md` is the source of truth.

## What it does

- **Auth**: API keys as Bearer tokens, stored as Argon2id hashes, verified in constant time.
- **Multi-tenant**: tenants own keys; limits and budgets inherit tenant → key.
- **Rate limiting**: per-key and per-tenant RPM / TPM / concurrent connections.
- **Budgets**: daily / monthly / total token budgets, enforced fail-closed.
- **Dual API surface**: native Ollama (`/api/*`) and OpenAI-compatible (`/v1/*`), both streaming.
- **Hard-blocked mutations**: `/api/pull`, `/api/push`, `/api/create`, `/api/copy`,
  `/api/delete`, `/api/blobs/*` always return 403. Not configurable.
- **Audit log**: always-on request metadata; opt-in, TTL'd prompt logging per key.

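As a rough illustration of the hard-blocked mutation rule in the list above, the sketch below shows one way a path check could decide on a 403 before anything is forwarded upstream. The names `BLOCKED_EXACT`, `BLOCKED_PREFIXES`, `is_blocked`, and `status_for` are hypothetical, and the matching rules (exact match plus a `/api/blobs/` prefix) are an assumption; `SPEC.md` remains the source of truth for the actual behavior.

```python
from http import HTTPStatus

# Paths copied from the feature list; how they are matched is an assumption.
BLOCKED_EXACT = frozenset(
    {"/api/pull", "/api/push", "/api/create", "/api/copy", "/api/delete"}
)
BLOCKED_PREFIXES = ("/api/blobs/",)


def is_blocked(path: str) -> bool:
    """Return True if the path targets a model-mutating Ollama endpoint."""
    # Treat "/api/pull/" the same as "/api/pull".
    normalized = path.rstrip("/") or "/"
    if normalized in BLOCKED_EXACT:
        return True
    return any(path.startswith(prefix) for prefix in BLOCKED_PREFIXES)


def status_for(path: str) -> int:
    """Status a request would get from this check alone (no auth, no limits)."""
    return int(HTTPStatus.FORBIDDEN if is_blocked(path) else HTTPStatus.OK)
```

Keeping the blocked set as module-level constants, with no settings lookup, mirrors the "not configurable" wording above.
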
Administration (dashboards, tenant self-service) lives in a separate service,
`neuronetz-console`; it is **not** part of this repository.

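The tenant → key inheritance and the fail-closed budget rule from the feature list can be sketched as follows. This is a minimal illustration, not the gateway's implementation: the `Limits` class, `effective`, and `allow_request` are hypothetical names, and three semantics are assumptions of the sketch: a key-level value overrides the tenant-level value, `None` means "not set at this level", and a limit that is unset at both levels means unlimited.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Limits:
    # None means "not set at this level" (assumption of this sketch).
    daily_tokens: Optional[int] = None
    monthly_tokens: Optional[int] = None


def _pick(key_value: Optional[int], tenant_value: Optional[int]) -> Optional[int]:
    """Prefer the key-level value; fall back to the tenant-level value."""
    return key_value if key_value is not None else tenant_value


def effective(tenant: Limits, key: Limits) -> Limits:
    """Resolve the limits that apply to a key, inheriting from its tenant."""
    return Limits(
        daily_tokens=_pick(key.daily_tokens, tenant.daily_tokens),
        monthly_tokens=_pick(key.monthly_tokens, tenant.monthly_tokens),
    )


def allow_request(limit: Optional[int], used: Optional[int], requested: int) -> bool:
    """Fail closed: if a limit exists but usage is unknown, deny the request."""
    if limit is None:
        return True  # no budget configured at any level (assumption: unlimited)
    if used is None:
        return False  # usage counter unavailable, so refuse rather than guess
    return used + requested <= limit
```

For example, with a tenant daily budget of 1000 tokens and a key override of 200, `effective(...)` yields 200 for that key, and a request is refused whenever the usage counter cannot be read.
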
## Architecture

```
Internet ──TLS──> Caddy ──HTTP──> gateway ──┬──> Postgres (keys, budgets, audit)
                                            ├──> Redis    (key cache, rate limits)
                                            └──> Ollama   (internal network only)
```

## Quickstart (dev)

Requires Docker + Docker Compose. The dev stack runs Postgres, Redis, and the gateway, but
**no Caddy and no Ollama** (so `/readyz` reports 503 until a real Ollama backend is wired
in; that is expected).

```bash
git clone <repo> neuronetz-gateway && cd neuronetz-gateway
cp .env.example .env   # adjust if you like; defaults work for local dev
docker compose -f docker-compose.dev.yml up --build
```

The gateway runs `alembic upgrade head` on startup, then serves on `http://localhost:8080`.

```bash
curl -i http://localhost:8080/healthz   # -> 200 {"status":"ok"}
curl -i http://localhost:8080/readyz    # -> 503 (no Ollama backend in the dev stack)
```

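The 503 from `/readyz` follows from a fail-closed readiness rule: the gateway only reports ready when every dependency answers. The sketch below shows one way such an aggregation could look. It is not taken from the gateway's code; the `readiness` function, the `Check` alias, and the one-second timeout are assumptions for illustration.

```python
import asyncio
from typing import Awaitable, Callable, Dict, Tuple

# A check is any zero-argument coroutine function that returns True when healthy.
Check = Callable[[], Awaitable[bool]]


async def readiness(
    checks: Dict[str, Check], timeout: float = 1.0
) -> Tuple[int, Dict[str, bool]]:
    """Return (HTTP status, per-dependency result), failing closed."""
    results: Dict[str, bool] = {}
    for name, check in checks.items():
        try:
            results[name] = bool(await asyncio.wait_for(check(), timeout))
        except Exception:
            # Timeouts, connection errors, and anything unexpected count as "down".
            results[name] = False
    # No checks registered also counts as not ready.
    status = 200 if results and all(results.values()) else 503
    return status, results
```

In the dev stack this would mean the Postgres and Redis checks pass while the Ollama check fails, so the overall answer stays 503.
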
## Production

`docker-compose.yml` brings up the full stack: Caddy (TLS via Let's Encrypt for
`api.neuronetz.ai`), the gateway, Postgres, Redis, and Ollama. The `ollama` service has
**no `ports:` mapping** and is reachable only on the internal Docker network. See
[`docs/DEPLOYMENT.md`](docs/DEPLOYMENT.md) (added in a later phase) and
[`ops/caddy/Caddyfile.example`](ops/caddy/Caddyfile.example).

## Managing tenants and keys

Use the bootstrap CLI (Typer). Keys have the form `nz_<prefix><secret>`; the full key is
printed exactly once at creation and only its Argon2id hash is stored.

```bash
neuronetz-gateway create-tenant --name acme
neuronetz-gateway create-key --tenant acme --name prod-server-1
neuronetz-gateway list-keys --tenant acme
neuronetz-gateway revoke-key --prefix nz_abc12345
```

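To make the `nz_<prefix><secret>` layout concrete, here is a small sketch of generating a key and splitting it back into the lookup prefix (the part passed to `revoke-key --prefix`) and the secret. The function names, the alphabet, and the 32-character secret length are assumptions; the 8-character prefix length is only inferred from the `nz_abc12345` example above. Hashing the key with Argon2id needs a third-party library and is left out here.

```python
import secrets
import string
from typing import Tuple

ALPHABET = string.ascii_lowercase + string.digits  # assumed character set
PREFIX_LEN = 8   # inferred from the `abc12345` part of the example above
SECRET_LEN = 32  # hypothetical; the real length is defined by the gateway
TAG = "nz_"


def generate_key() -> str:
    """Create a random key of the form nz_<prefix><secret>."""
    body = "".join(secrets.choice(ALPHABET) for _ in range(PREFIX_LEN + SECRET_LEN))
    return TAG + body


def split_key(key: str) -> Tuple[str, str]:
    """Split a full key into (lookup prefix including 'nz_', secret)."""
    if not key.startswith(TAG) or len(key) != len(TAG) + PREFIX_LEN + SECRET_LEN:
        raise ValueError("malformed key")
    cut = len(TAG) + PREFIX_LEN
    return key[:cut], key[cut:]
```

The non-secret prefix lets a key be found and revoked without anyone having to retain the full key, which fits the rule that the full key is shown only once.
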
## Development

```bash
just dev         # run the dev stack
just test        # pytest + coverage
just lint        # ruff
just typecheck   # mypy --strict
just migrate     # alembic upgrade head
```

Tooling: Python 3.12, `uv`, FastAPI + uvicorn, SQLAlchemy 2.0 (async) + asyncpg, Redis,
httpx, structlog, Pydantic. Lint/type/security gates: ruff, mypy `--strict`, bandit,
pip-audit.

## License

Apache 2.0; see [`LICENSE`](LICENSE). Owner: Stephan Berbig / Neuronetz.