deploy: upstream Ollama auth token + adoptable data volumes

Two production-hardening changes triggered by real issues found on the
first prod attempt against neuronetz-ai-01.

1. Upstream auth (the production Ollama is fronted by an auth proxy):
   - New config: OLLAMA_AUTH_TOKEN (pydantic SecretStr, so it never
     appears in repr/logs/errors), plus OLLAMA_AUTH_HEADER (default
     "Authorization") and OLLAMA_AUTH_SCHEME (default "Bearer") for
     stacks that expect a non-standard header like X-API-Key.
   - lifespan._build_upstream_headers() injects the configured header
     into the single shared httpx client used by both the proxy hot path
     AND the discovery poller, so /api/tags + /api/chat both
     authenticate against the upstream automatically.
   - New CLI: `neuronetz-gateway probe-ollama` uses the same client
     config to GET /api/version and /api/tags, reports
     success/transport-error/HTTP-status, lists the first few discovered
     models, and exits 1 on any failure. The token itself is never
     printed (only whether one was attached). Lets ops verify upstream
     reachability before letting real traffic through.
   - docker-compose.yml passes OLLAMA_AUTH_TOKEN/HEADER/SCHEME through;
     .env.example documents them with a leave-blank-for-internal-Ollama
     default.

2. Volume adoption (don't lose existing model data on re-deploy):
   - docker-compose.yml now pins absolute Docker volume NAMES for both
     postgres_data and ollama_data, configurable via POSTGRES_DATA_VOLUME
     and OLLAMA_DATA_VOLUME. Defaults preserve the previous per-project
     names so existing deployments aren't disturbed.
   - This addresses the scenario where deploying this compose under a new
     project directory created fresh, empty volumes alongside an existing
     `neuro-ollama_ollama-data` volume containing pre-pulled models
     (incl. deepseek-r1:14b, qwen2.5:14b, gemma3:12b, ...). Setting
     OLLAMA_DATA_VOLUME=neuro-ollama_ollama-data in .env tells the new
     stack to mount the existing volume in place: no copy, no downtime.
   - .env.example documents the override with the exact host's volume
     name as an example.

Both changes are ruff + mypy --strict clean.
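
The header-building step described in item 1 could look roughly like the sketch below. It is not the code from this commit: `build_upstream_headers` and `describe_auth` are hypothetical stand-ins that take plain strings instead of the gateway's settings object and its SecretStr, and the handling of an empty scheme (send the bare token) is an assumption about how an X-API-Key style setup would be configured.

```python
from __future__ import annotations


def build_upstream_headers(
    token: str | None,
    header: str = "Authorization",
    scheme: str = "Bearer",
) -> dict[str, str]:
    """Return the auth header to attach to every upstream request.

    Hypothetical stand-in for lifespan._build_upstream_headers(); the real
    function presumably reads these values from the gateway settings.
    """
    # No token configured: the in-stack Ollama needs no auth, so send nothing.
    if not token:
        return {}
    # Assumption: an empty scheme means "send the bare token", which is what
    # an X-API-Key style header would expect.
    value = f"{scheme} {token}" if scheme else token
    return {header: value}


def describe_auth(headers: dict[str, str]) -> str:
    """Report whether a token is attached without echoing its value."""
    return "auth header attached" if headers else "no auth header"
```

A dict built this way can be passed as the default `headers=` of a shared HTTP client, which matches the commit's point that the proxy path and the discovery poller authenticate through the same client.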
2026-05-27 18:59:09 +02:00
parent b2ec32c852
commit 662fbfb442
5 changed files with 162 additions and 3 deletions
--- a/docker-compose.yml
+++ b/docker-compose.yml
@@ -62,10 +62,15 @@ services:
      DATABASE_POOL_OVERFLOW: ${DATABASE_POOL_OVERFLOW:-20}
      REDIS_URL: redis://redis:6379/0
      REDIS_KEY_CACHE_TTL_S: ${REDIS_KEY_CACHE_TTL_S:-60}
-      OLLAMA_BASE_URL: http://ollama:11434
+      OLLAMA_BASE_URL: ${OLLAMA_BASE_URL:-http://ollama:11434}
      OLLAMA_CONNECT_TIMEOUT_S: ${OLLAMA_CONNECT_TIMEOUT_S:-5}
      OLLAMA_READ_TIMEOUT_S: ${OLLAMA_READ_TIMEOUT_S:-600}
      OLLAMA_MAX_CONNECTIONS: ${OLLAMA_MAX_CONNECTIONS:-64}
+      # Optional Bearer token for an externally-fronted Ollama (default empty:
+      # the in-stack ollama service needs no auth on the private network).
+      OLLAMA_AUTH_TOKEN: ${OLLAMA_AUTH_TOKEN:-}
+      OLLAMA_AUTH_HEADER: ${OLLAMA_AUTH_HEADER:-Authorization}
+      OLLAMA_AUTH_SCHEME: ${OLLAMA_AUTH_SCHEME:-Bearer}
      MODEL_DISCOVERY_REFRESH_S: ${MODEL_DISCOVERY_REFRESH_S:-60}
      MODEL_DISCOVERY_CACHE_TTL_S: ${MODEL_DISCOVERY_CACHE_TTL_S:-120}
      DEFAULT_RPM: ${DEFAULT_RPM:-60}
@@ -159,5 +164,18 @@ networks:
    driver: bridge

 volumes:
+  # Pin absolute volume NAMES so the stack can ADOPT an existing volume that was
+  # created by a previous deployment under a different compose project. Without
+  # an explicit `name:`, compose namespaces volumes by project (directory) name,
+  # so a rename or re-clone silently creates fresh, empty volumes alongside the
+  # old data. We hit that the first time this stack was deployed: the original
+  # models lived in `neuro-ollama_ollama-data`, and a fresh
+  # `neuro-gateway_ollama_data` was created next to them, orphaning the models.
+  #
+  # Override via .env if your existing volumes are named differently:
+  #   POSTGRES_DATA_VOLUME=neuro-api_postgres-data
+  #   OLLAMA_DATA_VOLUME=neuro-ollama_ollama-data
  postgres_data:
+    name: ${POSTGRES_DATA_VOLUME:-neuro-gateway_postgres_data}
  ollama_data:
+    name: ${OLLAMA_DATA_VOLUME:-neuro-gateway_ollama_data}
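
To make the `name:` interpolation above concrete, here is a small Python sketch of how compose resolves `${VAR:-default}`: the default applies when the variable is unset or empty. `resolve_volume_name` is a hypothetical helper written for illustration only; it is not part of the gateway or of Docker Compose.

```python
from collections.abc import Mapping


def resolve_volume_name(env: Mapping[str, str], var: str, default: str) -> str:
    """Mimic compose's ${VAR:-default}: fall back when VAR is unset OR empty."""
    value = env.get(var, "")
    return value if value else default


# No override in .env: the stack keeps using its own volume name.
fresh = resolve_volume_name({}, "OLLAMA_DATA_VOLUME", "neuro-gateway_ollama_data")

# Override set: the stack mounts the volume that already holds the models.
adopted = resolve_volume_name(
    {"OLLAMA_DATA_VOLUME": "neuro-ollama_ollama-data"},
    "OLLAMA_DATA_VOLUME",
    "neuro-gateway_ollama_data",
)
```

Because an empty value also falls back to the default, leaving `OLLAMA_DATA_VOLUME=` blank in .env behaves the same as omitting it.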