Forecasting API

The canonical request and response contract for `/v1/forecast`, including covariates, timestamps, quantiles, batch jobs, and ensemble routing.

Canonical request

Standardize on the canonical shape `{ model, inputs, parameters }` for all new integrations. Compatibility aliases remain accepted for migration, but the playground, SDK examples, CLI inspection output, and reference docs are all written against the canonical form.

forecast.request.json
{
  "model": "google/timesfm-2.0-500m",
  "inputs": [{
    "item_id": "sku_42",
    "start": "2026-03-01T00:00:00Z",
    "timestamps": [
      "2026-03-01T00:00:00Z",
      "2026-03-02T00:00:00Z",
      "2026-03-03T00:00:00Z",
      "2026-03-04T00:00:00Z",
      "2026-03-05T00:00:00Z",
      "2026-03-06T00:00:00Z"
    ],
    "target": [110, 113, 108, 116, 119, 121],
    "past_covariates": {
      "promo": [0, 1, 0, 1, 1, 0]
    },
    "future_covariates": {
      "promo": [1, 0, 0, 1, 0, 0, 1, 0, 0, 0, 1, 0]
    },
    "static_covariates": {
      "store_cluster": 2,
      "shelf_capacity": 1
    }
  }],
  "parameters": {
    "prediction_length": 12,
    "freq": "D",
    "quantiles": [0.1, 0.5, 0.9],
    "context_length": 256
  },
  "metadata": {
    "surface": "docs.forecast"
  }
}
forecast.response.json
{
  "id": "6a6e8f62-7f6e-43e5-a33a-64279440cb90",
  "object": "forecast",
  "created": 1773369600,
  "model": "google/timesfm-2.0-500m",
  "provider": "harpoon",
  "horizon": 12,
  "prediction_length": 12,
  "quantile_levels": [0.1, 0.5, 0.9],
  "input_points": 6,
  "predictions": [{
    "item_id": "sku_42",
    "mean": [122.0, 123.3, 124.1, 124.8, 125.4, 126.0, 126.8, 127.4, 128.0, 128.9, 129.4, 129.9],
    "quantiles": {
      "0.1": [119.4, 119.8, 120.6, 121.0, 121.7, 122.1, 122.6, 123.0, 123.6, 124.0, 124.5, 124.8],
      "0.5": [122.0, 123.3, 124.1, 124.8, 125.4, 126.0, 126.8, 127.4, 128.0, 128.9, 129.4, 129.9],
      "0.9": [125.2, 126.9, 127.7, 128.8, 129.6, 130.2, 131.3, 132.0, 132.9, 133.6, 134.2, 134.8]
    },
    "timestamps": [
      "2026-03-07T00:00:00Z",
      "2026-03-08T00:00:00Z",
      "2026-03-09T00:00:00Z",
      "2026-03-10T00:00:00Z",
      "2026-03-11T00:00:00Z",
      "2026-03-12T00:00:00Z",
      "2026-03-13T00:00:00Z",
      "2026-03-14T00:00:00Z",
      "2026-03-15T00:00:00Z",
      "2026-03-16T00:00:00Z",
      "2026-03-17T00:00:00Z",
      "2026-03-18T00:00:00Z"
    ]
  }],
  "usage": {
    "input_tokens": 188,
    "output_tokens": 36,
    "total_tokens": 224
  },
  "latency_ms": 51,
  "metadata": {
    "surface": "docs.forecast"
  }
}

Request semantics

Be explicit about timeline and feature alignment

Most integration bugs are not caused by the top-level JSON shape. They happen when clients disagree with the service about cadence, timestamp generation, or how covariates line up against the target history and future horizon.

Timeline rules

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| `start` + `target` + `parameters.freq` | regular cadence | Conditional | Use this when observations arrive on a clean interval such as daily or hourly. The API can derive future timestamps from the last observed point and freq. |
| `timestamps` + `target` | explicit alignment | Conditional | Use explicit timestamps when history is irregular or when you need to preserve exact observed dates instead of relying on inferred cadence. |
| `parameters.prediction_length` | integer >= 1 | Yes | Defines the horizon length. It also becomes the minimum number of forward-looking values you should plan for when you send future covariates. |
| `parameters.freq` | string | Conditional | Strongly recommended whenever you expect returned forecast timestamps. Without it, clients should not assume a calendar interval from the response alone. |
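
The regular-cadence rule above can be sketched client-side. This is a minimal illustration, not the service's implementation: the `FREQ_STEPS` table and the helper names are hypothetical, and only fixed-width frequencies are handled. It reproduces the forecast timestamps shown in `forecast.response.json` from the `start`, `target` length, and `freq` of the canonical request.

```python
from datetime import datetime, timedelta, timezone

# Assumption: an illustrative subset of fixed-width frequency aliases.
# The API may accept more aliases than the ones listed here.
FREQ_STEPS = {
    "D": timedelta(days=1),
    "H": timedelta(hours=1),
    "15min": timedelta(minutes=15),
}

_FORMAT = "%Y-%m-%dT%H:%M:%SZ"


def parse_utc(ts: str) -> datetime:
    """Parse an ISO-8601 timestamp with a trailing 'Z' as UTC."""
    return datetime.strptime(ts, _FORMAT).replace(tzinfo=timezone.utc)


def future_timestamps(start: str, n_observed: int, freq: str,
                      prediction_length: int) -> list[str]:
    """Derive forecast timestamps from the last observed point and freq."""
    if n_observed < 1 or prediction_length < 1:
        raise ValueError("need at least one observation and one forecast step")
    step = FREQ_STEPS[freq]
    # The last observed point sits (n_observed - 1) steps after `start`.
    last_observed = parse_utc(start) + step * (n_observed - 1)
    return [
        (last_observed + step * (i + 1)).strftime(_FORMAT)
        for i in range(prediction_length)
    ]
```

Comparing this output with the `timestamps` returned by the API is a cheap way to catch cadence disagreements before they reach charts or joins.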

Covariate rules

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| `past_covariates` | dict<string, number[]> | No | Historical covariate arrays should align to the observed target history. Treat them like parallel feature columns for the same past window. |
| `future_covariates` | dict<string, number[]> | No | Forward-looking covariates should cover the forecast horizon. Keep the feature names stable across series so downstream evaluation stays interpretable. |
| `static_covariates` | dict<string, number> | No | Series-level scalar attributes such as store cluster or asset class. Use them when the model family supports static context. |
| `item_id` | string | No | Optional in the request, but strongly recommended. The response uses it as the clean join key for result storage, charts, and retries. |
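
A client-side pre-flight check can catch misaligned arrays before the request is sent. The sketch below is hypothetical (the function name and message wording are not part of the API) and rests on two assumptions drawn from the rules above: past covariates and explicit timestamps must match the length of `target` exactly, and future covariates must contain at least `prediction_length` values.

```python
def alignment_problems(series: dict, prediction_length: int) -> list[str]:
    """Return human-readable alignment problems for one series object."""
    problems = []
    history = len(series["target"])

    # Explicit timestamps describe each historical observation.
    timestamps = series.get("timestamps")
    if timestamps is not None and len(timestamps) != history:
        problems.append(
            f"timestamps: expected {history} values, got {len(timestamps)}"
        )

    # Assumption: past covariates are parallel columns of the target history.
    for name, values in series.get("past_covariates", {}).items():
        if len(values) != history:
            problems.append(
                f"past_covariates.{name}: expected {history} values, "
                f"got {len(values)}"
            )

    # Assumption: future covariates must cover at least the horizon.
    for name, values in series.get("future_covariates", {}).items():
        if len(values) < prediction_length:
            problems.append(
                f"future_covariates.{name}: expected at least "
                f"{prediction_length} values, got {len(values)}"
            )
    return problems
```

An empty list means the series passes these local checks; the service remains the authority on what it accepts.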
forecast-minimal.request.json
{
  "model": "amazon/chronos-bolt-small",
  "inputs": [{
    "item_id": "store_017",
    "start": "2026-02-01T00:00:00Z",
    "target": [428, 435, 441, 438, 446, 452, 460, 458, 466, 472]
  }],
  "parameters": {
    "prediction_length": 7,
    "freq": "D",
    "quantiles": [0.1, 0.5, 0.9]
  }
}

Request fields

Build the payload from three layers

Top-level request fields

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| `model` | string | Yes | Catalog model id, e.g. amazon/chronos-bolt-small. See /api/models for the full list. |
| `inputs` | array<object> | Yes | One or more series payloads. Preferred canonical shape for TSFM requests. |
| `parameters` | object | No | Forecast controls (prediction_length, freq, quantiles) and model knobs. |
| `metadata` | object | No | Opaque passthrough metadata for client traceability. |

Series objects in `inputs[]`

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| `item_id` | string | No | Series identifier echoed in response for joins and merges. |
| `start` | string (ISO date/time) | No | Anchor timestamp for generated forecast timeline. |
| `timestamps` | string[] | No | Explicit timestamps for each historical observation when the series is irregular or not inferable from start + freq. |
| `target` | number[] | Yes | Historical observations used as model context. |
| `past_covariates` | record<string, number[]> | No | Observed exogenous signals aligned with target history. |
| `future_covariates` | record<string, number[]> | No | Known future exogenous signals aligned with horizon. |
| `static_covariates` | record<string, number> | No | Per-series scalar features such as store cluster, region, or product family. |

Forecast parameters

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| `prediction_length` | integer | No | Number of future points to generate. The backend request model defaults to 24, but the timeline rules in this reference treat the field as required, so send it explicitly on every request. |
| `freq` | string | No | Sampling frequency (D, H, 15min, etc.) used to derive the forecast timeline. |
| `quantiles` | number[] | No | Requested quantiles in ascending order, typically [0.1, 0.5, 0.9]. |
| `context_length` | integer | No | Optional cap on the number of historical observations passed to the model. |
| `sensitivity` | number | Conditional | Anomaly threshold for /v1/detect-anomalies (higher means fewer anomalies). |
| `temperature` | number | No | Sampling temperature for generative models (e.g. Chronos). Higher values increase forecast diversity. Default varies by model. |
| `top_p` | number | No | Nucleus sampling parameter for generative models. Controls the cumulative probability threshold for token selection. |
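
The parameter rules above lend themselves to a small local check. This is a hedged sketch: `parameter_problems` is a hypothetical helper, and the requirement that each quantile lies strictly between 0 and 1 is an assumption added here, not something the table states.

```python
def parameter_problems(parameters: dict) -> list[str]:
    """Return problems found in a `parameters` object before sending it."""
    problems = []

    horizon = parameters.get("prediction_length")
    # bool is a subclass of int in Python, so exclude it explicitly.
    if not isinstance(horizon, int) or isinstance(horizon, bool) or horizon < 1:
        problems.append("prediction_length must be an integer >= 1")

    quantiles = parameters.get("quantiles", [])
    # Assumption: quantile levels are probabilities in the open interval (0, 1).
    if any(not 0 < q < 1 for q in quantiles):
        problems.append("quantiles must lie strictly between 0 and 1")
    # The table asks for ascending order; compare each adjacent pair.
    if any(a >= b for a, b in zip(quantiles, quantiles[1:])):
        problems.append("quantiles must be in ascending order")

    return problems
```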

Implementation notes

  • Use `inputs[]` even for a single series. It keeps your payload shape stable when you later add batching or multiple item ids.
  • Always send `freq` when timestamps matter. The API uses it to derive forecast timestamps, and downstream charts depend on those timestamps being correct.
  • Request `quantiles` explicitly if your downstream consumers need uncertainty bands or interval-based business logic.
  • Treat `usage` and `latency_ms` as first-class telemetry and persist them next to request ids in production.

Response fields

Read the result like a production client

The response includes forecast outputs and the operational telemetry you should log. Downstream bugs often appear when teams read the forecast line but ignore timing, usage, or item ids.

Response shape

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| `id` / `object` / `created` | string / string / integer | Yes | Stable envelope fields for logging, tracing, and replay. |
| `horizon` / `prediction_length` | integer | Yes | Requested forecast length echoed in normalized response form. |
| `quantile_levels` | number[] | Yes | The quantile levels returned in predictions[].quantiles. |
| `predictions` | array<object> | Yes | Forecast outputs per series with mean, quantiles, and optional timestamps. |
| `usage` | object | Yes | Token accounting used by usage/billing dashboards. |
| `latency_ms` | number | Yes | Observed server latency for the request. |
| `metadata` | object | No | Opaque client metadata echoed back for request correlation. |
| `truncated_from` | integer | No | Original number of input points before truncation. Present only when the input exceeded the model's maximum context length and was automatically truncated (most recent values kept). |
| `max_context_length` | integer | No | Maximum context length for this model on its GPU tier. Present only when truncation occurred. |
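
One way to read the response shape above is to flatten it into rows keyed by `item_id` and to extract the telemetry separately. The sketch below is illustrative only: the function names and the row column names (`request_id`, `step`, `q0.1`, and so on) are hypothetical choices, not part of the API.

```python
def forecast_rows(response: dict) -> list[dict]:
    """Flatten a forecast response into one row per series and horizon step."""
    rows = []
    for prediction in response["predictions"]:
        timestamps = prediction.get("timestamps")  # optional per the table
        quantiles = prediction.get("quantiles", {})
        for step, mean in enumerate(prediction["mean"]):
            row = {
                "request_id": response["id"],
                "item_id": prediction.get("item_id"),
                "step": step + 1,
                "timestamp": timestamps[step] if timestamps else None,
                "mean": mean,
            }
            # One column per returned quantile level, e.g. "q0.1".
            for level, values in quantiles.items():
                row[f"q{level}"] = values[step]
            rows.append(row)
    return rows


def telemetry_record(response: dict) -> dict:
    """Collect the operational fields worth persisting next to the request id."""
    usage = response.get("usage", {})
    return {
        "request_id": response["id"],
        "model": response.get("model"),
        "latency_ms": response.get("latency_ms"),
        "total_tokens": usage.get("total_tokens"),
        # Present only when the input was truncated; None otherwise.
        "truncated_from": response.get("truncated_from"),
    }
```

Keeping the telemetry in its own record makes it easy to log even when the forecast rows themselves are discarded.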

Validation failures

The mistakes that usually produce 4xx responses

When a request fails, verify the contract before you retry. The most common issues are missing horizon parameters, timeline ambiguity, misaligned covariates, and ensemble requests that are too short to score.

Missing or malformed `prediction_length`

A forecast call is not actionable without a horizon. Keep `parameters.prediction_length` present and greater than zero on every request, even in local smoke tests.

Timeline assumptions without `freq`

If downstream code expects returned timestamps, send `freq` explicitly. Otherwise a numerically correct forecast can still be unusable in charts, joins, or schedule outputs.

Covariate arrays that do not align to history or horizon

Treat covariates as synchronized feature columns. Past covariates should align to observed history; future covariates should cover the forecast horizon you asked for.

Ensemble requests with series shorter than the held-out horizon

Ensemble selection performs an internal backtest. At least one input series must be longer than `prediction_length`, otherwise the route returns a validation error.
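
The ensemble length rule can be checked before calling the route. A minimal sketch, with `can_backtest` as a hypothetical helper name:

```python
def can_backtest(inputs: list[dict], prediction_length: int) -> bool:
    """True if at least one series is longer than the held-out horizon."""
    return any(len(series["target"]) > prediction_length for series in inputs)
```

For example, a series with 10 observations passes for `prediction_length` 7, while a series with 5 observations does not pass for `prediction_length` 6.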

validation-error.example.json
{
  "detail": [
    {
      "loc": ["body", "parameters", "prediction_length"],
      "msg": "Field required",
      "type": "missing"
    }
  ]
}
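
For logs and error messages, the `detail` array in the example above can be condensed into one line per problem. This sketch assumes every validation failure follows the same `loc` / `msg` shape as the example; `describe_validation_errors` is a hypothetical helper.

```python
def describe_validation_errors(body: dict) -> list[str]:
    """Turn a validation error body into 'path: message' strings."""
    messages = []
    for item in body.get("detail", []):
        # `loc` is a path such as ["body", "parameters", "prediction_length"].
        path = ".".join(str(part) for part in item.get("loc", []))
        messages.append(f"{path}: {item.get('msg', 'unknown error')}")
    return messages
```

Applied to `validation-error.example.json`, this yields `body.parameters.prediction_length: Field required`.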

Common patterns

Expand from a single forecast without changing mental models

Once the base contract is stable, you can add richer patterns such as batch execution or ensemble selection. Those patterns still build on the same canonical series objects and parameter names.

forecast-batch.request.json
{
  "requests": [
    {
      "model": "amazon/chronos-bolt-small",
      "inputs": [{ "item_id": "north", "target": [10, 11, 12, 13, 14] }],
      "parameters": { "prediction_length": 6, "freq": "D" }
    },
    {
      "model": "google/timesfm-2.5-200m-pytorch",
      "inputs": [{ "item_id": "south", "target": [6, 7, 8, 9, 10] }],
      "parameters": { "prediction_length": 6, "freq": "D" }
    }
  ]
}
forecast-batch.response.json
{
  "object": "list",
  "data": [
    {
      "ok": true,
      "result": {
        "object": "forecast",
        "model": "amazon/chronos-bolt-small",
        "prediction_length": 6
      }
    },
    {
      "ok": false,
      "error": {
        "error": "Model access denied",
        "code": "PERMISSION_DENIED",
        "endpoint": "/v1/forecast"
      }
    }
  ]
}
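
A batch response mixes successes and failures, so clients should branch on `ok` for each entry rather than on the overall status. The sketch below uses a hypothetical `split_batch` helper. It keeps each entry's position in `data`; mapping that position back to the original `requests` array assumes the service preserves request order, which the example suggests but does not state.

```python
def split_batch(response: dict) -> tuple[list[tuple[int, dict]], list[tuple[int, dict]]]:
    """Partition batch entries into (index, result) and (index, error) pairs."""
    succeeded, failed = [], []
    for index, entry in enumerate(response["data"]):
        if entry.get("ok"):
            succeeded.append((index, entry["result"]))
        else:
            failed.append((index, entry["error"]))
    return succeeded, failed
```

With `forecast-batch.response.json`, the first entry lands in `succeeded` and the second in `failed` with code `PERMISSION_DENIED`, so only the second request needs attention.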