Models and discovery
Discover models, inspect capabilities, and decide when to route by latency, cost, context length, or task support.
Catalog endpoints
The fastest discovery loop is usually: list models, shortlist by capability or family, inspect one or two details, then test the top candidates on real data. The hosted catalog UI at `/models` is a friendly wrapper around these same catalog surfaces.
Discovery endpoints
| Endpoint | Operation | Auth required | Description |
|---|---|---|---|
| GET /api/models | listModels | No | Browse the hosted catalog with filters for provider, capability, status, or free-text search. |
| GET /api/models/{modelId} | getModelById | No | Inspect one model in detail, including capabilities, latency signals, hosting metadata, and fit profile guidance. |
| GET /v1/models | listOpenAiModels | No | OpenAI-compatible model list for clients that expect the /v1/models surface. |
| GET /v1/models/{modelId} | getOpenAiModelById | No | OpenAI-compatible detail endpoint for one hosted model id. |
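Model ids contain a slash (the detail examples below request `amazon%2Fchronos-bolt-base`), so the id must be percent-encoded before it goes into the detail path. A minimal Python sketch of that URL construction, assuming the base URL shown in the examples:

```python
from urllib.parse import quote

BASE = "https://api.tsfm.ai/api/models"

def detail_url(model_id: str) -> str:
    """Build the detail-endpoint URL, percent-encoding the slash in the model id.

    quote(..., safe="") encodes "/" as %2F instead of leaving it as a path separator.
    """
    return f"{BASE}/{quote(model_id, safe='')}"
```

For example, `detail_url("amazon/chronos-bolt-base")` yields the same encoded path used in the curl examples below.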
Query patterns
List first, then inspect detail
The list endpoint is for narrowing the field. The detail endpoint is where you confirm operational metadata such as capabilities, latency, pricing, and fit profile before you promote a model into routing policy.
List endpoint filters
| Field | Type | Required | Description |
|---|---|---|---|
| q | string | No | Free-text search over model ids and descriptive metadata. Best starting point when you know the family name, such as `chronos` or `moirai`. |
| capability | string | No | Filter for hosted capabilities such as `covariates`, `quantile-forecasting`, or `long-context`. |
| status | string | No | Operational filter for states such as `online` so your selection flow ignores inactive entries. |
| provider | string | No | Narrow to one provider namespace when you are comparing variants within a family or region. |
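The filters combine as ordinary query-string parameters, as the second curl example below shows. A small sketch of assembling that query string in Python (the base URL comes from the examples; everything else is standard library):

```python
from urllib.parse import urlencode

def list_url(base: str = "https://api.tsfm.ai/api/models", **filters) -> str:
    """Combine the list-endpoint filters (q, capability, status, provider)
    into a query string, dropping any filter left as None."""
    params = {k: v for k, v in filters.items() if v is not None}
    return f"{base}?{urlencode(params)}" if params else base
```

`list_url(capability="covariates", status="online", q="chronos")` reproduces the filtered query used in the examples below.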
List models matching a family query:

```shell
curl -s "https://api.tsfm.ai/api/models?q=chronos" | jq '.models | map({
  id,
  supported_tasks,
  avg_latency_ms
})'
```

Combine filters to narrow the field further:

```shell
curl -s "https://api.tsfm.ai/api/models?capability=covariates&status=online&q=chronos" \
  | jq '.models[] | {
      id,
      supported_tasks,
      capabilities,
      avg_latency_ms,
      context_length
    }'
```

Example list response:

```json
{
  "models": [
    {
      "id": "amazon/chronos-bolt-base",
      "provider": "tsfm (us)",
      "supported_tasks": ["forecast"],
      "capabilities": ["forecasting", "quantile-forecasting", "multivariate", "covariates"],
      "avg_latency_ms": 240,
      "context_length": 8192,
      "availability_pct": 99.95
    },
    {
      "id": "amazon/chronos-bolt-small",
      "provider": "tsfm (us)",
      "supported_tasks": ["forecast"],
      "capabilities": ["forecasting", "quantile-forecasting", "high-throughput"],
      "avg_latency_ms": 120,
      "context_length": 4096,
      "availability_pct": 99.97
    }
  ]
}
```

Then inspect one candidate in detail (note the percent-encoded model id):

```shell
curl -s https://api.tsfm.ai/api/models/amazon%2Fchronos-bolt-base \
  | jq '.model | {
      id,
      provider,
      supported_tasks,
      capabilities,
      avg_latency_ms,
      availability_pct,
      context_length,
      input_cost_per_1m,
      output_cost_per_1m
    }'
```

Example detail response:

```json
{
  "model": {
    "id": "amazon/chronos-bolt-base",
    "provider": "tsfm (us)",
    "status": "online",
    "supported_tasks": ["forecast"],
    "capabilities": ["forecasting", "quantile-forecasting", "multivariate", "covariates"],
    "context_length": 8192,
    "avg_latency_ms": 240,
    "availability_pct": 99.95,
    "input_cost_per_1m": 0.5,
    "output_cost_per_1m": 1.5,
    "fit_profile": {
      "recommended_use_cases": ["long-context demand planning", "covariate-aware retail forecasting"],
      "limitations": ["higher latency than bolt-class models"]
    }
  }
}
```

Selection signals
What to read on a model record
You do not need every metadata field for every decision. Focus on task support, context length, latency, cost, and whether the model family has the capabilities your data needs, such as covariates, quantiles, or long-context forecasting.
High-signal fields
| Field | Type | Required | Description |
|---|---|---|---|
| supported_tasks | string[] | No | The first filter to apply. It tells you whether the model supports forecast, classify, impute, or anomaly flows. |
| context_length | integer | No | How much history the model can consider. This matters for long seasonal cycles, dense telemetry streams, and covariate-heavy payloads. |
| avg_latency_ms | number | No | A routing signal, not a hard SLA. Compare it with your p95 target and expected concurrency before you pick a default. |
| input_cost_per_1m / output_cost_per_1m | number | No | Public catalog pricing is unified across models, so this is mainly useful for displaying the current published rate and verifying that every surface agrees on it. |
| capabilities | string[] | No | Highlights strengths such as covariates, long-context forecasting, quantiles, or multi-task support. |
| status / availability_pct | string / number | No | Operational hints for production routing and incident response planning. |
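Reading those fields together amounts to a simple shortlist: drop anything that is not online, does not support the task, or lacks a required capability, then rank by latency. A sketch of that filter in Python, using two trimmed records from the example catalog response above (field names follow the catalog; the ranking-by-latency choice is an assumption):

```python
# Two records from the example list response, trimmed to high-signal fields.
CATALOG = [
    {"id": "amazon/chronos-bolt-base", "status": "online",
     "supported_tasks": ["forecast"],
     "capabilities": ["forecasting", "quantile-forecasting", "multivariate", "covariates"],
     "avg_latency_ms": 240},
    {"id": "amazon/chronos-bolt-small", "status": "online",
     "supported_tasks": ["forecast"],
     "capabilities": ["forecasting", "quantile-forecasting", "high-throughput"],
     "avg_latency_ms": 120},
]

def shortlist(models, task, required_caps=(), max_latency_ms=None):
    """Keep online models that support the task, carry every required
    capability, and (optionally) fit a latency budget; fastest first."""
    picks = []
    for m in models:
        if m.get("status", "online") != "online":
            continue
        if task not in m.get("supported_tasks", []):
            continue
        if not set(required_caps) <= set(m.get("capabilities", [])):
            continue
        if max_latency_ms is not None and m.get("avg_latency_ms", float("inf")) > max_latency_ms:
            continue
        picks.append(m)
    return sorted(picks, key=lambda m: m.get("avg_latency_ms", float("inf")))
```

With no extra constraints both models survive and the faster `bolt-small` ranks first; requiring `covariates` leaves only `bolt-base`.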
Routing guidance
Turn discovery into a routing policy
The catalog should not just answer “what exists?” It should help you define a default model, a fallback model, and the thresholds that trigger an upgrade for specific workloads.
Filter by task first
Do not compare a forecast-only checkpoint against a multi-task model until you know your task needs. Supported task coverage narrows the field quickly and avoids false comparisons.
Pick one default, keep one fallback
Use catalog latency and cost signals to choose a default model, then keep a cheaper or more robust fallback ready for outages, quota pressure, or burst traffic.
Benchmark on your own distribution
Leaderboard results are directional. Real routing decisions should be validated on your historical series, horizon lengths, and business metrics before rollout.
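The promotion rules in the policy example below are easy to express as a tiny selector. A sketch under the stated assumptions: the precedence when both upgrade conditions hold (long context wins over covariates here) is a choice this example makes, not something the catalog prescribes.

```python
def pick_model(needs_covariates: bool, context_length: int) -> str:
    """Apply the example promotion rules: start from the small default,
    upgrade for long context or covariate-aware workloads."""
    if context_length > 4096:
        # Assumed precedence: context length checked before covariates.
        return "google/timesfm-2.0-500m"
    if needs_covariates:
        return "amazon/chronos-bolt-base"
    return "amazon/chronos-bolt-small"
```

A plain workload stays on the default; covariates or history beyond 4096 points trigger the upgrades named in the policy.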
```json
{
  "default_model": "amazon/chronos-bolt-small",
  "fallback_model": "ibm/ttm-r2",
  "promotion_rules": [
    "upgrade to chronos-bolt-base when covariates are required",
    "upgrade to google/timesfm-2.0-500m when context_length exceeds 4096"
  ],
  "review_inputs": [
    "p95 latency",
    "forecast error on holdout set",
    "cost per 10k requests"
  ]
}
```

Browse the live catalog
Use the hosted catalog when you want filters, cards, and richer metadata than the raw JSON surface.
Read the selection guide
Move from catalog facts into practical decision frameworks for production routing.
Compare benchmark pages
Use benchmark results as a directional input, then confirm choices on your own data.