Models and discovery

Discover models, inspect capabilities, and decide when to route by latency, cost, context length, or task support.

Catalog endpoints

The fastest discovery loop is usually: list models, shortlist by capability or family, inspect one or two candidates in detail, then test the top picks on real data. The hosted catalog UI at `/models` is a friendly wrapper around the same catalog endpoints.

Discovery endpoints

| Endpoint | Operation | Required | Description |
| --- | --- | --- | --- |
| `GET /api/models` | `listModels` | No | Browse the hosted catalog with filters for provider, capability, status, or free-text search. |
| `GET /api/models/{modelId}` | `getModelById` | No | Inspect one model in detail, including capabilities, latency signals, hosting metadata, and fit profile guidance. |
| `GET /v1/models` | `listOpenAiModels` | No | OpenAI-compatible model list for clients that expect the `/v1/models` surface. |
| `GET /v1/models/{modelId}` | `getOpenAiModelById` | No | OpenAI-compatible detail endpoint for one hosted model id. |

Query patterns

List first, then inspect detail

The list endpoint is for narrowing the field. The detail endpoint is where you confirm operational metadata such as capabilities, latency, pricing, and fit profile before you promote a model into routing policy.

List endpoint filters

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| `q` | string | No | Free-text search over model ids and descriptive metadata. Best starting point when you know the family name, such as `chronos` or `moirai`. |
| `capability` | string | No | Filter for hosted capabilities such as `covariates`, `quantile-forecasting`, or `long-context`. |
| `status` | string | No | Operational filter for states such as `online` so your selection flow ignores inactive entries. |
| `provider` | string | No | Narrow to one provider namespace when you are comparing variants within a family or region. |

list-models.sh

```sh
curl -s "https://api.tsfm.ai/api/models?q=chronos" | jq '.models | map({
  id,
  supported_tasks,
  avg_latency_ms
})'
```
list-models-filtered.sh

```sh
curl -s "https://api.tsfm.ai/api/models?capability=covariates&status=online&q=chronos" \
  | jq '.models[] | {
      id,
      supported_tasks,
      capabilities,
      avg_latency_ms,
      context_length
    }'
```
models.response.json

```json
{
  "models": [
    {
      "id": "amazon/chronos-bolt-base",
      "provider": "tsfm (us)",
      "supported_tasks": ["forecast"],
      "capabilities": ["forecasting", "quantile-forecasting", "multivariate", "covariates"],
      "avg_latency_ms": 240,
      "context_length": 8192,
      "availability_pct": 99.95
    },
    {
      "id": "amazon/chronos-bolt-small",
      "provider": "tsfm (us)",
      "supported_tasks": ["forecast"],
      "capabilities": ["forecasting", "quantile-forecasting", "high-throughput"],
      "avg_latency_ms": 120,
      "context_length": 4096,
      "availability_pct": 99.97
    }
  ]
}
```
inspect-model.sh

```sh
curl -s https://api.tsfm.ai/api/models/amazon%2Fchronos-bolt-base \
  | jq '.model | {
      id,
      provider,
      supported_tasks,
      capabilities,
      avg_latency_ms,
      availability_pct,
      context_length,
      input_cost_per_1m,
      output_cost_per_1m
    }'
```
model.response.json

```json
{
  "model": {
    "id": "amazon/chronos-bolt-base",
    "provider": "tsfm (us)",
    "status": "online",
    "supported_tasks": ["forecast"],
    "capabilities": ["forecasting", "quantile-forecasting", "multivariate", "covariates"],
    "context_length": 8192,
    "avg_latency_ms": 240,
    "availability_pct": 99.95,
    "input_cost_per_1m": 0.5,
    "output_cost_per_1m": 1.5,
    "fit_profile": {
      "recommended_use_cases": ["long-context demand planning", "covariate-aware retail forecasting"],
      "limitations": ["higher latency than bolt-class models"]
    }
  }
}
```

Selection signals

What to read on a model record

You do not need every metadata field for every decision. Focus on task support, context length, latency, cost, and whether the model family has the capabilities your data needs, such as covariates, quantiles, or long-context forecasting.

High-signal fields

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| `supported_tasks` | string[] | No | The first filter to apply. It tells you whether the model supports forecast, classify, impute, or anomaly flows. |
| `context_length` | integer | No | How much history the model can consider. This matters for long seasonal cycles, dense telemetry streams, and covariate-heavy payloads. |
| `avg_latency_ms` | number | No | A routing signal, not a hard SLA. Compare it with your p95 target and expected concurrency before you pick a default. |
| `input_cost_per_1m` / `output_cost_per_1m` | number | No | Public catalog pricing is unified across models, so this is mainly useful for displaying the current published rate and verifying that every surface agrees on it. |
| `capabilities` | string[] | No | Highlights strengths such as covariates, long-context forecasting, quantiles, or multi-task support. |
| `status` / `availability_pct` | string / number | No | Operational hints for production routing and incident response planning. |
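These fields compose into a shortlist pass. A minimal sketch, assuming the list response shape shown earlier; the context and latency thresholds are illustrative, and in practice you would pipe the `curl` output from list-models.sh into this filter instead of the inline sample:

```sh
# Keep models that fit a context and latency budget, fastest first.
# The inline JSON mirrors the sample catalog response above.
jq '
  .models
  | map(select(.context_length >= 8192 and .avg_latency_ms <= 300))
  | sort_by(.avg_latency_ms)
  | map({id, avg_latency_ms, context_length})
' <<'JSON'
{"models":[
  {"id":"amazon/chronos-bolt-base","avg_latency_ms":240,"context_length":8192},
  {"id":"amazon/chronos-bolt-small","avg_latency_ms":120,"context_length":4096}
]}
JSON
```

With these thresholds only `amazon/chronos-bolt-base` survives, since the small variant's 4096-token context falls short of the budget.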

Routing guidance

Turn discovery into a routing policy

The catalog should not just answer “what exists?” It should help you define a default model, a fallback model, and the thresholds that trigger an upgrade for specific workloads.

Filter by task first

Do not compare a forecast-only checkpoint against a multi-task model until you know your task needs. Supported task coverage narrows the field quickly and avoids false comparisons.
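Because `supported_tasks` is an array, filter by membership rather than equality. A sketch assuming the list response shape above; `acme/ts-classify` is a made-up id included only for contrast, and in practice you would pipe the live `curl` output in place of the heredoc:

```sh
# Keep only models whose supported_tasks include the task you need.
jq --arg task "forecast" '
  .models
  | map(select(.supported_tasks | index($task)))
  | map(.id)
' <<'JSON'
{"models":[
  {"id":"amazon/chronos-bolt-small","supported_tasks":["forecast"]},
  {"id":"acme/ts-classify","supported_tasks":["classify"]}
]}
JSON
```

`index($task)` returns `null` when the task is absent, so `select` drops those entries before any latency or cost comparison happens.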

Pick one default, keep one fallback

Use catalog latency and cost signals to choose a default model, then keep a cheaper or more robust fallback ready for outages, quota pressure, or burst traffic.
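The default-plus-fallback pattern can be sketched as a small wrapper. This is a hypothetical helper, not part of the API: `forecast_with_fallback` takes any command that accepts a model id and exits non-zero on failure, such as a `curl` wrapper around your forecast endpoint:

```sh
# Try the default model once, then retry the same request on the fallback.
DEFAULT_MODEL="amazon/chronos-bolt-small"
FALLBACK_MODEL="ibm/ttm-r2"

forecast_with_fallback() {
  call="$1"  # a command that takes a model id and exits non-zero on failure
  if "$call" "$DEFAULT_MODEL"; then
    return 0
  fi
  echo "default model failed, retrying with $FALLBACK_MODEL" >&2
  "$call" "$FALLBACK_MODEL"
}
```

Invoke it as `forecast_with_fallback my_forecast_call`; the wrapper only absorbs single failures, so sustained outages still need alerting on the fallback path.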

Benchmark on your own distribution

Leaderboard results are directional. Real routing decisions should be validated on your historical series, horizon lengths, and business metrics before rollout.

routing-policy.example.json

```json
{
  "default_model": "amazon/chronos-bolt-small",
  "fallback_model": "ibm/ttm-r2",
  "promotion_rules": [
    "upgrade to chronos-bolt-base when covariates are required",
    "upgrade to google/timesfm-2.0-500m when context_length exceeds 4096"
  ],
  "review_inputs": [
    "p95 latency",
    "forecast error on holdout set",
    "cost per 10k requests"
  ]
}
```
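
The promotion rules above can be sketched as a small selection helper. `pick_model` is a hypothetical function, and the rule ordering (covariates checked before context length) is one reading of the policy rather than something the catalog mandates:

```sh
# Map workload traits to a model id, mirroring routing-policy.example.json.
pick_model() {
  needs_covariates="$1"  # "yes" or "no"
  context_len="$2"       # history length in timesteps
  if [ "$needs_covariates" = "yes" ]; then
    echo "amazon/chronos-bolt-base"
  elif [ "$context_len" -gt 4096 ]; then
    echo "google/timesfm-2.0-500m"
  else
    echo "amazon/chronos-bolt-small"
  fi
}

pick_model no 2048   # -> amazon/chronos-bolt-small
pick_model yes 2048  # -> amazon/chronos-bolt-base
pick_model no 8192   # -> google/timesfm-2.0-500m
```

Keeping this logic in one helper makes the thresholds easy to revisit after each benchmark cycle on your own data.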