Model selection

Choose the right model family based on latency, cost, context length, and task requirements.

If you are not sure where to start

Start with amazon/chronos-bolt-base for most workloads. It balances accuracy, latency (130ms), and cost ($0.07 per 1M tokens), and it supports probabilistic output. From there, move to google/timesfm-2.0-500m-pytorch if you need longer served context, to the Salesforce/moirai-1.1-R models for multivariate series, or to ibm-research/granite-timeseries-ttm-r2 if you need to minimize cost.

Decision factors

Key dimensions to consider when selecting a model.

| Factor | Low | Mid | High |
|---|---|---|---|
| Latency requirement | < 150ms | 150-400ms | > 400ms |
| Budget per 1M tokens | < $0.10 | $0.10-0.25 | > $0.25 |
| History length | < 512 pts | 512-2,048 pts | > 2,048 pts |
| Task complexity | Point forecast | Probabilistic | Multi-task |
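
As a rough illustration, the numeric rows of the table above can be turned into a small classifier. This is a minimal sketch, not part of any SDK: the names `THRESHOLDS`, `tier`, and `classify` are hypothetical, and placing the boundary values (for example, exactly 150ms) in the Mid tier is an assumption based on how the ranges are written.

```python
# Tier boundaries copied from the decision-factor table as (lower, upper).
# Assumption: values on a boundary (e.g. exactly 150 ms) count as "Mid".
THRESHOLDS = {
    "latency_ms": (150, 400),
    "budget_usd_per_1m_tokens": (0.10, 0.25),
    "history_points": (512, 2048),
}


def tier(value, lower, upper):
    """Return "Low", "Mid", or "High" for one numeric factor."""
    if value < lower:
        return "Low"
    if value > upper:
        return "High"
    return "Mid"


def classify(requirements):
    """Map each numeric requirement to its tier from the table."""
    return {
        name: tier(value, *THRESHOLDS[name])
        for name, value in requirements.items()
    }


profile = classify({
    "latency_ms": 120,
    "budget_usd_per_1m_tokens": 0.15,
    "history_points": 3000,
})
print(profile)
```

Task complexity is left out because it is a qualitative choice rather than a numeric range.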

Recommendations by scenario

Lowest latency

You need sub-150ms responses for real-time dashboards, alerting, or streaming applications.

Recommended

amazon/chronos-bolt-mini (60ms), amazon/chronos-bolt-small (88ms), ibm-research/granite-timeseries-ttm-r2 (95ms)

These models use direct multi-step prediction (rather than step-by-step autoregressive decoding) or lightweight architectures, both of which keep inference time low.

Lowest cost

You are processing millions of series in batch and need to minimize per-request cost.

Recommended

amazon/chronos-bolt-mini ($0.02), ibm-research/granite-timeseries-ttm-r2 ($0.03), amazon/chronos-bolt-small ($0.04)

Smaller parameter counts mean lower GPU utilization per request. Combined with high rate limits, these are ideal for batch workloads.
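
To get a feel for what these prices mean at batch scale, the arithmetic can be sketched as below. The prices are the per-1M-token figures listed above; the function name `batch_cost_usd`, the batch size, and the `tokens_per_series` value are illustrative assumptions, since how a series is converted to billable tokens is not specified here.

```python
# Prices in USD per 1M tokens, taken from the recommendations above.
PRICE_PER_1M_TOKENS = {
    "amazon/chronos-bolt-mini": 0.02,
    "ibm-research/granite-timeseries-ttm-r2": 0.03,
    "amazon/chronos-bolt-small": 0.04,
}


def batch_cost_usd(model, num_series, tokens_per_series):
    """Estimate the cost of one batch run for a given model.

    tokens_per_series is an assumption supplied by the caller; check how
    your requests are actually metered before relying on the result.
    """
    total_tokens = num_series * tokens_per_series
    return total_tokens / 1_000_000 * PRICE_PER_1M_TOKENS[model]


# Hypothetical workload: 2 million series, 512 tokens each.
for model in PRICE_PER_1M_TOKENS:
    cost = batch_cost_usd(model, num_series=2_000_000, tokens_per_series=512)
    print(f"{model}: ${cost:,.2f}")
```

Under these assumptions the batch totals about 1.02 billion tokens, so the gap between the cheapest and most expensive option in this list is a factor of two.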

Best forecast quality

Accuracy is the primary concern and you can tolerate higher latency and cost.

Recommended

Salesforce/moirai-1.1-R-large, google/timesfm-2.0-500m-pytorch, amazon/chronos-bolt-base

If you have latency budget to spend, larger models with stronger zero-shot behavior on real workloads are the safest place to spend it. Moirai 1.1-R Large excels at multivariate forecasting, TimesFM 2.0 remains strong on longer served histories, and Chronos-Bolt Base is a pragmatic all-around baseline.

Multivariate series

You have multiple correlated variables that should be modeled jointly.

Recommended

Salesforce/moirai-1.1-R-small, Salesforce/moirai-1.1-R-base, Salesforce/moirai-1.1-R-large

Moirai's Any-Variate Attention captures cross-variate dependencies natively. Choose the size that fits your latency and cost budget.

Limited history

You have fewer than 50 historical observations, for example for new products or newly installed sensors.

Recommended

amazon/chronos-bolt-base, Salesforce/moirai-1.1-R-small, google/timesfm-2.0-500m-pytorch

These models have strong zero-shot transfer from pre-training and produce reasonable forecasts even with minimal context.

Full model comparison

All hosted models sorted by latency. Exact served context limits come from the live catalog and can change as deployment configs change.

| Model | Params | Latency | Input cost (per 1M tokens) | Best for |
|---|---|---|---|---|
| amazon/chronos-bolt-mini | 9M | 60ms | $0.02 | Ultra-low latency and edge deployment |
| amazon/chronos-bolt-small | 48M | 88ms | $0.04 | Real-time and batch applications at lowest cost |
| ibm-research/granite-timeseries-ttm-r2 | ~1M | 95ms | $0.03 | Ultra-low-cost batch forecasting and edge deployment |
| amazon/chronos-bolt-base | 205M | 130ms | $0.07 | Fast inference with strong accuracy |
| Salesforce/moirai-1.1-R-small | 14M | 210ms | $0.09 | Low-cost multivariate forecasting |
| thuml/timer-base-84m | 84M | 260ms | $0.15 | Patch-based zero-shot point forecasting once you have enough history |
| Salesforce/moirai-1.1-R-base | 91M | 330ms | $0.17 | Balanced multivariate quality and cost |
| google/timesfm-2.0-500m-pytorch | 500M | 480ms | $0.38 | Maximum forecast quality for longer served histories |
| Salesforce/moirai-1.1-R-large | 311M | 520ms | $0.29 | Maximum multivariate forecast quality |
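
One way to use this comparison programmatically is to encode it as data and filter by your latency and cost ceilings. The snippet below is a sketch under that assumption: the `Model` class and `candidates` function are hypothetical helpers, and the figures are a static copy of the table above rather than values fetched from the live catalog, so they may drift.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Model:
    name: str
    latency_ms: int
    cost_usd_per_1m: float


# Static snapshot of the comparison table, already sorted by latency.
CATALOG = [
    Model("amazon/chronos-bolt-mini", 60, 0.02),
    Model("amazon/chronos-bolt-small", 88, 0.04),
    Model("ibm-research/granite-timeseries-ttm-r2", 95, 0.03),
    Model("amazon/chronos-bolt-base", 130, 0.07),
    Model("Salesforce/moirai-1.1-R-small", 210, 0.09),
    Model("thuml/timer-base-84m", 260, 0.15),
    Model("Salesforce/moirai-1.1-R-base", 330, 0.17),
    Model("google/timesfm-2.0-500m-pytorch", 480, 0.38),
    Model("Salesforce/moirai-1.1-R-large", 520, 0.29),
]


def candidates(max_latency_ms, max_cost_usd_per_1m):
    """Return names of models within both ceilings, fastest first."""
    matches = [
        m for m in CATALOG
        if m.latency_ms <= max_latency_ms
        and m.cost_usd_per_1m <= max_cost_usd_per_1m
    ]
    return [m.name for m in sorted(matches, key=lambda m: m.latency_ms)]


# Example: a real-time dashboard with a tight budget.
print(candidates(max_latency_ms=150, max_cost_usd_per_1m=0.05))
```

A filter like this only narrows the field on latency and cost; whether a model suits multivariate input or short histories still depends on the scenario guidance earlier on this page.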

Next steps

Browse models: See all models with live status, pricing, and detailed specifications.

Quickstart: Make your first API call and see forecasts in under 5 minutes.

Playground: Test models interactively with your own data before committing to an integration.