Data Plane
- Configure `INFERENCE_GATEWAY_BASE_URL` to your OpenAI-compatible GPU gateway.
- Set `INFERENCE_GATEWAY_API_KEY` from secret manager; never hardcode credentials.
- Tune `INFERENCE_TIMEOUT_MS` per provider latency profile.
- Define model fallback policy (`timesfm -> chronos -> ttm`) for partial outages.