2 Commits

Author SHA1 Message Date
Stephan Berbig
c9e11c3486 cli: add add-backend / remove-backend / list-backends commands
Some checks failed
CI / ruff (push) Has been cancelled
CI / mypy --strict (push) Has been cancelled
CI / pytest (push) Has been cancelled
CI / bandit (push) Has been cancelled
CI / pip-audit (push) Has been cancelled
So nobody ever has to hand-write the OLLAMA_BACKENDS JSON again.

  # add a backend, probe it, print the resulting .env line:
  neuronetz-gateway add-backend embedded     http://ollama:11434
  neuronetz-gateway add-backend neuro-ollama http://neuro-ollama:11434 --token ABC

  # update one (e.g. rotate token):
  neuronetz-gateway add-backend neuro-ollama http://neuro-ollama:11434 --token XYZ --replace

  # remove:
  neuronetz-gateway remove-backend neuro-ollama

  # peek (tokens redacted):
  neuronetz-gateway list-backends

  # write directly to a .env file (atomic temp-file + rename):
  neuronetz-gateway add-backend foo http://foo:11434 --token T --write-env /app/.env

  # show what would change without doing it:
  neuronetz-gateway add-backend foo http://foo:11434 --token T --dry-run

What each command does:

- `add-backend NAME URL` (+ optional --token / --header / --scheme / --replace
  / --no-validate / --write-env / --dry-run): builds a new backend list (current
  list parsed from OLLAMA_BACKENDS env, or synthesized from the single-backend
  fallback if unset), validates the new backend by probing /api/tags with the
  same headers the gateway will use at runtime (`build_backend_headers`), then
  prints the resulting OLLAMA_BACKENDS=... line ready to paste — or writes it
  in place if --write-env is given. Refuses to overwrite an existing name
  unless --replace is passed.
- `remove-backend NAME` (+ --write-env / --dry-run): mirror of add-backend for
  removal.
- `list-backends`: shows the configured backends with tokens redacted to
  "***" via `redacted_dump`. Useful sanity check after editing .env.

All the JSON manipulation is in a new pure-helpers module
`cli/backends.py` (parse / serialize / add_or_replace / remove /
update_env_file). The Typer commands in `cli/manage.py` are thin shells
on top — the logic is unit-tested directly without spinning up Typer or
the network. The token is unwrapped from SecretStr exactly once at the
serialization boundary (`to_dict`) and never logged.

New tests (16): full coverage of the helpers — round-trip serialize/parse,
duplicate-name rejection, replace-in-place order preservation, remove
on unknown name, redaction, atomic env-file rewrite (insert / replace /
idempotent re-apply / create-when-missing).

ruff (incl. the per-file ignore add for tests' S105/S106 — placeholder
"tok123"-style strings are inputs, not credentials) + mypy --strict (68
source files) clean. pytest: 76 passed + 39 skipped (the 16 new tests +
no regressions on the existing 60).
2026-05-27 22:59:53 +02:00
Stephan Berbig
844b02aade tests: unit + integration suite (99 tests; ruff + mypy --strict clean)
Real test bodies (not stubs), driven against an in-process httpx.ASGITransport
override of the gateway's get_ollama_client dependency pointing at
tests/integration/mock_ollama.py.

Unit (target 100% on auth/, ratelimit/, budget/):
- argon2id roundtrip, wrong-key, garbage encoding, needs_rehash on param change
- key format/uniqueness/prefix extraction
- token counter (prompt_eval_count + eval_count, embeddings, missing-counts)
- translate (OpenAI <-> Ollama for chat/completion/embeddings, streaming chunks,
  /v1/models list shape)
- allowlist (hard-blocks, effective-set semantics across allow_all/inheritance/
  empty-discovered)
- discovery (parse, cache roundtrip with TTL, fail-closed, tolerates redis=None)
- sliding window (allow/block/reset/per-key vs per-tenant/cost-weighted)

Integration (testcontainers postgres + redis + in-process mock Ollama):
- auth flow (no/malformed/wrong key all return identical sanitized 401)
- proxy stream (NDJSON roundtrip, audit row's token counts match, hard-blocked
  endpoints uniformly 403)
- openai_compat (SSE chunks, data: [DONE], non-stream shape, /v1/models)
- model_discovery (allow_all sees all, default-deny sees allowed ∩ discovered,
  /v1/models filtered, unpermitted-but-installed = nonexistent = 403,
  empty cache denies even allow_all)
- rate_limit (429 + Retry-After + headers; Redis down ⇒ 503, never 200)
- budget (decrement + headers; pre-burned counter blocks next request)
- revocation (INSERT into gateway.revocations → NOTIFY → cache evicted → 401 ≤ 1s)

Includes a known-issue xfail flagging a bug in ratelimit/sliding_window.py:
the per-hit ZSET member uses id(object()) which returns the same id on
consecutive calls, causing same-millisecond hits to overwrite instead of
stacking. To be fixed in a follow-up commit.
2026-05-26 20:52:33 +02:00