deploy: upstream Ollama auth token + adoptable data volumes

Two production-hardening changes triggered by real issues found on the
first prod attempt against neuronetz-ai-01.

1. Upstream auth (the production Ollama is fronted by an auth proxy):
   - New config: OLLAMA_AUTH_TOKEN (pydantic SecretStr; never appears in
     repr/logs/errors), plus OLLAMA_AUTH_HEADER (default "Authorization")
     and OLLAMA_AUTH_SCHEME (default "Bearer") for stacks that expect a
     non-standard header like X-API-Key.
   - lifespan._build_upstream_headers() injects the configured header into
     the single shared httpx client used by both the proxy hot path AND the
     discovery poller, so /api/tags + /api/chat both authenticate against
     the upstream automatically.
   - New CLI: `neuronetz-gateway probe-ollama` uses the same client config
     to GET /api/version and /api/tags, reports
     success/transport-error/HTTP-status, lists the first few discovered
     models, and exits 1 on any failure. The token itself is never printed
     (only whether one was attached). Lets ops verify upstream reachability
     before letting real traffic through.
   - docker-compose.yml passes OLLAMA_AUTH_TOKEN/HEADER/SCHEME through;
     .env.example documents them with a leave-blank-for-internal-Ollama
     default.

2. Volume adoption (don't lose existing model data on re-deploy):
   - docker-compose.yml now pins absolute Docker volume NAMES for both
     postgres_data and ollama_data, configurable via POSTGRES_DATA_VOLUME
     and OLLAMA_DATA_VOLUME. Defaults preserve the previous per-project
     names so existing deployments aren't disturbed.
   - This addresses the scenario where deploying this compose under a new
     project directory created fresh, empty volumes alongside an existing
     `neuro-ollama_ollama-data` volume containing pre-pulled models (incl.
     deepseek-r1:14b, qwen2.5:14b, gemma3:12b, ...). Setting
     OLLAMA_DATA_VOLUME=neuro-ollama_ollama-data in .env tells the new
     stack to mount the existing volume in place: no copy, no downtime.
   - .env.example documents the override with the exact host's volume name
     as an example.

Both changes are ruff + mypy --strict clean.
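
For reviewers, the header rule documented in .env.example (scheme prefix for Authorization, raw token for any other header, nothing at all when the token is blank) could look roughly like the sketch below. This is an illustration only, not the code in lifespan._build_upstream_headers(): the function name build_upstream_headers, its plain-string arguments, and the case-insensitive match on the header name are assumptions made here. The real config wraps the token in pydantic SecretStr, which this stdlib-only sketch leaves out.

```python
from __future__ import annotations


def build_upstream_headers(
    token: str | None,
    header: str = "Authorization",
    scheme: str = "Bearer",
) -> dict[str, str]:
    """Return the extra headers to attach to every upstream request.

    Hypothetical helper: mirrors the behaviour described in .env.example,
    not the gateway's actual implementation.
    """
    # Blank or missing token: send no auth header (in-stack Ollama case).
    if not token:
        return {}

    # Assumption: the header name is compared case-insensitively.
    if header.lower() == "authorization":
        # Standard header: prefix the token with the scheme, e.g. "Bearer".
        return {header: f"{scheme} {token}"}

    # Non-standard header such as X-API-Key: send the raw token.
    return {header: token}
```

A dict built this way can be handed to a shared HTTP client once at startup, so every request that goes through that client carries the same header.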
2026-05-27 18:59:09 +02:00
parent b2ec32c852
commit 662fbfb442
5 changed files with 162 additions and 3 deletions
--- a/.env.example
+++ b/.env.example
@@ -18,11 +18,33 @@ GATEWAY_TRUSTED_PROXIES=127.0.0.1,nginx-proxy  # for X-Forwarded-For
 GATEWAY_VIRTUAL_HOST=api.neuronetz.ai
 LETSENCRYPT_EMAIL=admin@neuronetz.ai

+# ──────────────────────── Volume adoption ────────────────────────
+# Override the Docker volume names if an EXISTING volume on the host holds
+# data this stack should adopt (e.g. models pulled by a previous Ollama
+# deployment). Leave unset to use the default per-project names.
+#
+# Example (matches the neuronetz-ai-01 host):
+#   OLLAMA_DATA_VOLUME=neuro-ollama_ollama-data
+#   POSTGRES_DATA_VOLUME=neuro-gateway_postgres_data
+OLLAMA_DATA_VOLUME=
+POSTGRES_DATA_VOLUME=
+
 # ──────────────────────────── Upstream ───────────────────────────
 OLLAMA_BASE_URL=http://ollama:11434
 OLLAMA_CONNECT_TIMEOUT_S=5
 OLLAMA_READ_TIMEOUT_S=600
 OLLAMA_MAX_CONNECTIONS=64
+# If you front Ollama with an auth proxy (e.g. an external host like
+# https://ollama.neuronetz.ai requiring a Bearer token), set the token here.
+# The value never appears in logs/errors — it's wrapped in pydantic SecretStr.
+# Leave empty to send no Authorization header (the default for an in-stack
+# ollama service on the private Docker network).
+OLLAMA_AUTH_TOKEN=
+# Override only if your auth proxy expects a non-standard header. For
+# Authorization the scheme prefix (default: Bearer) is included; for any other
+# header name the raw token is sent.
+OLLAMA_AUTH_HEADER=Authorization
+OLLAMA_AUTH_SCHEME=Bearer

 # ──────────────────────── Model discovery (§4.6) ─────────────────
 MODEL_DISCOVERY_REFRESH_S=60