Two production-hardening changes triggered by real issues found on the
first prod attempt against neuronetz-ai-01.

1. Upstream auth (the production Ollama is fronted by an auth proxy):
   - New config: OLLAMA_AUTH_TOKEN (pydantic SecretStr, so it never
     appears in repr/logs/errors), plus OLLAMA_AUTH_HEADER (default
     "Authorization") and OLLAMA_AUTH_SCHEME (default "Bearer") for
     stacks that expect a non-standard header like X-API-Key.
   - lifespan._build_upstream_headers() injects the configured header
     into the single shared httpx client used by both the proxy hot
     path AND the discovery poller, so /api/tags + /api/chat both
     authenticate against the upstream automatically.
   - New CLI: `neuronetz-gateway probe-ollama` uses the same client
     config to GET /api/version and /api/tags, reports
     success/transport-error/HTTP-status, lists the first few
     discovered models, and exits 1 on any failure. The token itself
     is never printed (only whether one was attached). Lets ops verify
     upstream reachability before letting real traffic through.
   - docker-compose.yml passes OLLAMA_AUTH_TOKEN/HEADER/SCHEME through;
     .env.example documents them with a leave-blank-for-internal-Ollama
     default.

2. Volume adoption (don't lose existing model data on re-deploy):
   - docker-compose.yml now pins absolute Docker volume NAMES for both
     postgres_data and ollama_data, configurable via
     POSTGRES_DATA_VOLUME and OLLAMA_DATA_VOLUME. Defaults preserve the
     previous per-project names so existing deployments aren't
     disturbed.
   - This addresses the scenario where deploying this compose under a
     new project directory created fresh, empty volumes alongside an
     existing `neuro-ollama_ollama-data` volume containing pre-pulled
     models (incl. deepseek-r1:14b, qwen2.5:14b, gemma3:12b, ...).
     Setting OLLAMA_DATA_VOLUME=neuro-ollama_ollama-data in .env tells
     the new stack to mount the existing volume in place: no copy, no
     downtime.
   - .env.example documents the override with the exact host's volume
     name as an example.

Both changes are ruff + mypy --strict clean.
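
The header rule from change 1 can be sketched roughly as follows. This is
an illustration only, not the actual lifespan._build_upstream_headers():
the function names, the plain-str token (the real config uses pydantic
SecretStr), and the case-insensitive header comparison are all assumptions.

```python
from __future__ import annotations


def build_upstream_headers(
    token: str | None,
    header: str = "Authorization",
    scheme: str = "Bearer",
) -> dict[str, str]:
    """Hypothetical helper: extra headers to attach to upstream requests."""
    if not token:
        # Blank token: in-stack Ollama on the private network, send nothing.
        return {}
    if header.lower() == "authorization":  # assumption: case-insensitive match
        # Standard header: prefix the token with the scheme.
        return {header: f"{scheme} {token}"}
    # Non-standard header such as X-API-Key: send the raw token.
    return {header: token}


def describe(headers: dict[str, str]) -> str:
    """Log-safe summary: names the attached header, never its value."""
    if not headers:
        return "no auth header attached"
    return f"auth header attached: {', '.join(headers)}"
```

A dict built this way could be passed as `headers=` when constructing the
shared httpx client, so every request made through that client carries it.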
93 lines
4.6 KiB
Plaintext
# neuronetz-gateway: environment configuration (SPEC §7).
#
# Copy to `.env` and adjust. `.env` is gitignored and MUST NOT be committed.
# All values here are SAFE EXAMPLES; change every secret before any real deploy.

# ──────────────────────────── Service ────────────────────────────
GATEWAY_BIND_HOST=0.0.0.0
GATEWAY_BIND_PORT=8080
GATEWAY_LOG_LEVEL=INFO
GATEWAY_LOG_FORMAT=json # json|console
GATEWAY_REQUEST_ID_HEADER=X-Request-ID
GATEWAY_TRUSTED_PROXIES=127.0.0.1,nginx-proxy # for X-Forwarded-For

# ──────────── Public hostname (jwilder-proxy / acme-companion) ───────
# These are consumed by docker-compose.yml's gateway service so that the
# host's nginx-proxy stack routes TLS-terminated traffic for your domain.
# Mirrors the pattern used by neuro-landing.
GATEWAY_VIRTUAL_HOST=api.neuronetz.ai
LETSENCRYPT_EMAIL=admin@neuronetz.ai

# ──────────────────────── Volume adoption ────────────────────────
# Override the Docker volume names if an EXISTING volume on the host holds
# data this stack should adopt (e.g. models pulled by a previous Ollama
# deployment). Leave unset to use the default per-project names.
#
# Example (matches the neuronetz-ai-01 host):
# OLLAMA_DATA_VOLUME=neuro-ollama_ollama-data
# POSTGRES_DATA_VOLUME=neuro-gateway_postgres_data
OLLAMA_DATA_VOLUME=
POSTGRES_DATA_VOLUME=

# ──────────────────────────── Upstream ───────────────────────────
OLLAMA_BASE_URL=http://ollama:11434
OLLAMA_CONNECT_TIMEOUT_S=5
OLLAMA_READ_TIMEOUT_S=600
OLLAMA_MAX_CONNECTIONS=64
# If you front Ollama with an auth proxy (e.g. an external host like
# https://ollama.neuronetz.ai requiring a Bearer token), set the token here.
# The value never appears in logs/errors: it's wrapped in pydantic SecretStr.
# Leave empty to send no Authorization header (the default for an in-stack
# ollama service on the private Docker network).
OLLAMA_AUTH_TOKEN=
# Override only if your auth proxy expects a non-standard header. For
# Authorization the scheme prefix (default: Bearer) is included; for any other
# header name the raw token is sent.
OLLAMA_AUTH_HEADER=Authorization
OLLAMA_AUTH_SCHEME=Bearer

# ──────────────────────── Model discovery (§4.6) ─────────────────
MODEL_DISCOVERY_REFRESH_S=60
MODEL_DISCOVERY_CACHE_TTL_S=120

# ──────────────────────────── Database ───────────────────────────
# Compose builds DATABASE_URL from the POSTGRES_* parts below, but the gateway
# also accepts a full DATABASE_URL directly.
DATABASE_URL=postgresql+asyncpg://gateway:changeme@postgres:5432/neuronetz
DATABASE_POOL_SIZE=10
DATABASE_POOL_OVERFLOW=20

# Postgres container credentials (consumed by docker-compose).
POSTGRES_USER=gateway
POSTGRES_PASSWORD=changeme
POSTGRES_DB=neuronetz

# ──────────────────────────── Redis ──────────────────────────────
REDIS_URL=redis://redis:6379/0
REDIS_KEY_CACHE_TTL_S=60

# ────────────────── Limits (defaults; DB overrides) ──────────────
DEFAULT_RPM=60
DEFAULT_TPM=100000
DEFAULT_CONCURRENT=8
MAX_REQUEST_BODY_BYTES=262144
MAX_NUM_PREDICT=4096

# ──────────────────────────── Security ───────────────────────────
ARGON2_TIME_COST=3
ARGON2_MEMORY_COST_KIB=65536
ARGON2_PARALLELISM=4
AUTH_FAILURE_RATE_LIMIT_PER_IP_PER_MIN=20

# ──────────────────────────── Audit ──────────────────────────────
AUDIT_BUFFER_SIZE=1000
PROMPT_LOG_DEFAULT_RETENTION_DAYS=30
AUDIT_LOG_DEFAULT_RETENTION_DAYS=365

# ──────────────── Playground / API docs (prod-safe: OFF) ─────────
# Serve the playground HTML (owned by the docs agent) at /playground.
PLAYGROUND_ENABLED=false
PLAYGROUND_FILE=/app/playground/index.html
# Enable FastAPI's /docs + /openapi.json (default off in production).
DOCS_ENABLED=false