m17hr1l/neuronetz-gateway

Stephan Berbig c9e11c3486


cli: add add-backend / remove-backend / list-backends commands

So nobody ever has to hand-write the OLLAMA_BACKENDS JSON again.

  # add a backend, probe it, print the resulting .env line:
  neuronetz-gateway add-backend embedded     http://ollama:11434
  neuronetz-gateway add-backend neuro-ollama http://neuro-ollama:11434 --token ABC

  # update one (e.g. rotate token):
  neuronetz-gateway add-backend neuro-ollama http://neuro-ollama:11434 --token XYZ --replace

  # remove:
  neuronetz-gateway remove-backend neuro-ollama

  # peek (tokens redacted):
  neuronetz-gateway list-backends

  # write directly to a .env file (atomic temp-file + rename):
  neuronetz-gateway add-backend foo http://foo:11434 --token T --write-env /app/.env

  # show what would change without doing it:
  neuronetz-gateway add-backend foo http://foo:11434 --token T --dry-run

What each command does:

- `add-backend NAME URL` (+ optional --token / --header / --scheme / --replace
  / --no-validate / --write-env / --dry-run): builds a new backend list (current
  list parsed from OLLAMA_BACKENDS env, or synthesized from the single-backend
  fallback if unset), validates the new backend by probing /api/tags with the
  same headers the gateway will use at runtime (`build_backend_headers`), then
  prints the resulting OLLAMA_BACKENDS=... line ready to paste — or writes it
  in place if --write-env is given. Refuses to overwrite an existing name
  unless --replace is passed.
- `remove-backend NAME` (+ --write-env / --dry-run): mirror of add-backend for
  removal.
- `list-backends`: shows the configured backends with tokens redacted to
  "***" via `redacted_dump`. Useful sanity check after editing .env.

All the JSON manipulation is in a new pure-helpers module
`cli/backends.py` (parse / serialize / add_or_replace / remove /
update_env_file). The Typer commands in `cli/manage.py` are thin shells
on top — the logic is unit-tested directly without spinning up Typer or
the network. The token is unwrapped from SecretStr exactly once at the
serialization boundary (`to_dict`) and never logged.

New tests (16): full coverage of the helpers — round-trip serialize/parse,
duplicate-name rejection, replace-in-place order preservation, remove
on unknown name, redaction, atomic env-file rewrite (insert / replace /
idempotent re-apply / create-when-missing).
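For readers unfamiliar with the temp-file + rename pattern, the env-file rewrite can be sketched as below. This is an illustration under assumptions, not the project's `update_env_file`: the function name `update_env_line`, its signature, and the rule of collapsing duplicate keys into one line are hypothetical.

```python
import os
import tempfile
from pathlib import Path


def update_env_line(path: Path, key: str, value: str) -> None:
    """Set KEY=value in an env file, creating the file if it is missing.

    The new content is written to a temp file in the same directory and then
    moved over the target with os.replace, so readers never see a partial file.
    """
    new_line = f"{key}={value}"
    lines = path.read_text().splitlines() if path.exists() else []

    out: list[str] = []
    replaced = False
    for line in lines:
        if line.startswith(f"{key}="):
            if not replaced:  # keep the first occurrence's position
                out.append(new_line)
                replaced = True
            continue  # assumption: later duplicates of the key are dropped
        out.append(line)
    if not replaced:
        out.append(new_line)

    # The temp file must live on the same filesystem for the rename to be atomic.
    fd, tmp = tempfile.mkstemp(dir=str(path.parent), prefix=path.name + ".", suffix=".tmp")
    try:
        with os.fdopen(fd, "w") as fh:
            fh.write("\n".join(out) + "\n")
        os.replace(tmp, path)
    except BaseException:
        os.unlink(tmp)
        raise
```

Applying the same key and value twice leaves the file byte-for-byte unchanged, which is the idempotent re-apply case the tests mention. Note that `mkstemp` creates the file with owner-only permissions, so the rewritten file ends up mode 0600 in this sketch.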

ruff and mypy --strict (68 source files) are clean. The ruff config gains a
per-file ignore for S105/S106 in tests: placeholder strings such as "tok123"
are test inputs, not credentials. pytest: 76 passed, 39 skipped (the 16 new
tests plus the existing 60, no regressions).

2026-05-27 22:59:53 +02:00

.github/workflows        2026-05-26 20:50:35 +02:00  scaffold: project skeleton, schema, healthz/readyz, CI
alembic                  2026-05-26 20:50:35 +02:00  scaffold: project skeleton, schema, healthz/readyz, CI
demo/mock-ollama         2026-05-26 20:52:33 +02:00  demo + playground + docs
docs                     2026-05-26 20:55:20 +02:00  deploy: target jwilder-proxy production stack
ops                      2026-05-26 20:50:35 +02:00  scaffold: project skeleton, schema, healthz/readyz, CI
playground               2026-05-26 20:52:33 +02:00  demo + playground + docs
scope-docs               2026-05-26 20:50:35 +02:00  scaffold: project skeleton, schema, healthz/readyz, CI
src/neuronetz_gateway    2026-05-27 22:59:53 +02:00  cli: add add-backend / remove-backend / list-backends commands
tests                    2026-05-27 22:59:53 +02:00  cli: add add-backend / remove-backend / list-backends commands
.dockerignore            2026-05-26 20:50:35 +02:00  scaffold: project skeleton, schema, healthz/readyz, CI
.env.example             2026-05-27 22:30:26 +02:00  proxy: multi-backend Ollama aggregation with per-model routing + failover
.gitignore               2026-05-26 20:50:35 +02:00  scaffold: project skeleton, schema, healthz/readyz, CI
alembic.ini              2026-05-26 20:50:35 +02:00  scaffold: project skeleton, schema, healthz/readyz, CI
demo.sh                  2026-05-26 20:52:33 +02:00  demo + playground + docs
docker-compose.demo.yml  2026-05-26 20:52:33 +02:00  demo + playground + docs
docker-compose.dev.yml   2026-05-26 20:50:35 +02:00  scaffold: project skeleton, schema, healthz/readyz, CI
docker-compose.yml       2026-05-27 22:34:45 +02:00  compose: declare ollama_data as external to silence adoption warning
Dockerfile               2026-05-26 20:50:35 +02:00  scaffold: project skeleton, schema, healthz/readyz, CI
justfile                 2026-05-26 20:50:35 +02:00  scaffold: project skeleton, schema, healthz/readyz, CI
LICENSE                  2026-05-26 20:50:35 +02:00  scaffold: project skeleton, schema, healthz/readyz, CI
mkdocs.yml               2026-05-26 20:52:33 +02:00  demo + playground + docs
pyproject.toml           2026-05-27 22:59:53 +02:00  cli: add add-backend / remove-backend / list-backends commands
README.md                2026-05-26 20:50:35 +02:00  scaffold: project skeleton, schema, healthz/readyz, CI

README.md

neuronetz-gateway

A secure, multi-tenant API gateway in front of an Ollama instance. It is the hot path of the Neuronetz API: every request to the models flows through here, authenticated, rate-limited, budgeted, and audited.

The Ollama backend is never reachable from the public internet. It is bound to an internal Docker network with no published ports. All access is via this gateway, behind TLS terminated by Caddy.

Status: v0.1.0 — in development. See scope-docs/SPEC.md for the full specification and scope-docs/AGENT_PROMPT.md for the phased build plan. SPEC.md is the source of truth.

What it does

Auth — API keys as Bearer tokens, stored as Argon2id hashes, verified in constant time.
Multi-tenant — tenants own keys; limits and budgets inherit tenant → key.
Rate limiting — per-key and per-tenant RPM / TPM / concurrent connections.
Budgets — daily / monthly / total token budgets, enforced fail-closed.
Dual API surface — native Ollama (/api/*) and OpenAI-compatible (/v1/*), both streaming.
Hard-blocked mutations — /api/pull, /api/push, /api/create, /api/copy, /api/delete, /api/blobs/* always return 403. Not configurable.
Audit log — always-on request metadata; opt-in, TTL'd prompt logging per key.

Administration (dashboards, tenant self-service) lives in a separate service, neuronetz-console; it is not part of this repository.

Architecture

Internet ──TLS──> Caddy ──HTTP──> gateway ──┬──> Postgres   (keys, budgets, audit)
                                            ├──> Redis      (key cache, rate limits)
                                            └──> Ollama     (internal network only)

Quickstart (dev)

Requires Docker + Docker Compose. The dev stack runs Postgres, Redis, and the gateway — no Caddy and no Ollama (so /readyz reports 503 until a real Ollama backend is wired in; that is expected).

git clone <repo> neuronetz-gateway && cd neuronetz-gateway
cp .env.example .env          # adjust if you like; defaults work for local dev
docker compose -f docker-compose.dev.yml up --build

The gateway runs alembic upgrade head on startup, then serves on http://localhost:8080.

curl -i http://localhost:8080/healthz   # -> 200  {"status":"ok"}
curl -i http://localhost:8080/readyz    # -> 503  (no Ollama backend in the dev stack)

Production

docker-compose.yml brings up the full stack — Caddy (TLS via Let's Encrypt for api.neuronetz.ai), the gateway, Postgres, Redis, and Ollama. The ollama service has no ports: mapping and is reachable only on the internal Docker network. See docs/DEPLOYMENT.md (added in a later phase) and ops/caddy/Caddyfile.example.

Managing tenants and keys

Use the bootstrap CLI (Typer). Keys have the form nz_<prefix><secret>; the full key is printed exactly once at creation and only its Argon2id hash is stored.

neuronetz-gateway create-tenant --name acme
neuronetz-gateway create-key   --tenant acme --name prod-server-1
neuronetz-gateway list-keys    --tenant acme
neuronetz-gateway revoke-key   --prefix nz_abc12345
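The nz_<prefix><secret> layout can be illustrated with the sketch below. The lengths are assumptions (an 8-character prefix is inferred from the nz_abc12345 example; the secret length is arbitrary), the function names are hypothetical, and the Argon2id hashing step is deliberately left out because it needs a third-party library.

```python
import secrets

KEY_TAG = "nz_"
PREFIX_CHARS = 8   # assumption, inferred from the nz_abc12345 example
SECRET_BYTES = 32  # assumption, not taken from the source


def generate_key() -> tuple[str, str]:
    """Return (lookup_prefix, full_key); the full key is meant to be shown once."""
    prefix = KEY_TAG + secrets.token_hex(PREFIX_CHARS // 2)
    secret = secrets.token_urlsafe(SECRET_BYTES)
    return prefix, prefix + secret


def lookup_prefix(full_key: str) -> str:
    """Extract the non-secret prefix used to find the stored hash for a key."""
    cut = len(KEY_TAG) + PREFIX_CHARS
    if not full_key.startswith(KEY_TAG) or len(full_key) <= cut:
        raise ValueError("malformed API key")
    return full_key[:cut]
```

Splitting the key this way lets the server find the right row by prefix and then verify only that row's hash, and it explains why revoke-key can take a prefix rather than the full key.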

Development

just dev          # run the dev stack
just test         # pytest + coverage
just lint         # ruff
just typecheck    # mypy --strict
just migrate      # alembic upgrade head

Tooling: Python 3.12, uv, FastAPI + uvicorn, SQLAlchemy 2.0 (async) + asyncpg, Redis, httpx, structlog, Pydantic. Lint/type/security gates: ruff, mypy --strict, bandit, pip-audit.

License

Apache 2.0 — see LICENSE. Owner: Stephan Berbig / Neuronetz.

Languages

Python 86.2%, HTML 8.1%, Shell 4.4%, Dockerfile 0.9%, Just 0.4%