Best time series foundation models in 2026
Six foundation models now cover most production forecasting needs. Each takes a different architectural bet — encoder-decoder, patched-decoder, universal transformer, masked pre-training, or LLM-style autoregression. This page maps the landscape so you can shortlist before you benchmark.
curl https://api.tsfm.ai/v1/models \
  -H "Authorization: Bearer $TSFM_API_KEY" | jq .
# Returns: chronos-bolt-*, timesfm-2.0,
# moirai-1.1-r-*, moment-1.2-*,
# lag-llama, granite-ttm-*

List all available models with metadata, context limits, and pricing.
Models covered
Chronos, TimesFM, Moirai, MOMENT, Lag-Llama, Granite TTM
Key dimensions
Architecture, context length, covariate support, speed
Evaluation
Compare any subset through one API with the same request shape
2026 foundation model comparison
A side-by-side view of the six models that matter most for production forecasting today.
How to choose a foundation model
No single model dominates every dimension. Match model properties to your workload constraints.
If covariates matter, start with Moirai
Moirai is the only model in this group that natively handles past and future covariates. If your forecast accuracy depends on external signals — promotions, weather, holidays — it has a structural advantage over univariate-only models.
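As a sketch, a covariate-aware Moirai request might look like the following. The field names (`past_covariates`, `future_covariates`, `horizon`) are illustrative assumptions, not the documented TSFM.ai schema — check the API reference for the real shape:

```python
import json

# Hypothetical request payload -- field names are assumptions for
# illustration, not the documented TSFM.ai schema.
payload = {
    "model": "moirai-1.1-r-base",
    "target": [112.0, 118.0, 132.0, 129.0, 121.0, 135.0],
    "past_covariates": {           # observed alongside the history
        "promo": [0, 0, 1, 1, 0, 0],
    },
    "future_covariates": {         # known over the forecast horizon
        "promo": [1, 1, 0],
        "holiday": [0, 1, 0],
    },
    "horizon": 3,
}

# Sanity check: every future covariate must cover the full horizon,
# or the model has nothing to condition the later steps on.
for name, values in payload["future_covariates"].items():
    assert len(values) == payload["horizon"], name

body = json.dumps(payload)
```

The key structural point is the split between past covariates (aligned with the history) and future covariates (known ahead of time, like a planned promotion); univariate-only models have no slot for either.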
If latency and size flexibility matter, start with Chronos
Chronos offers mini through large variants so you can trade off accuracy for speed at the model level. This matters when you serve forecasts in a latency-sensitive path or need to control GPU cost per request.
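One way to exploit the size range is to pick the largest variant that fits your latency budget. The latencies below are placeholders, not published Chronos benchmarks — measure on your own hardware before committing:

```python
# Measured per-request latencies (placeholder numbers -- benchmark on
# your own hardware; real values depend on GPU, batch size, and horizon).
measured_ms = {
    "chronos-bolt-mini": 8.0,
    "chronos-bolt-small": 15.0,
    "chronos-bolt-base": 40.0,
    "chronos-bolt-large": 120.0,
}

# Preference order: largest (typically most accurate) first.
preference = ["chronos-bolt-large", "chronos-bolt-base",
              "chronos-bolt-small", "chronos-bolt-mini"]

def pick_variant(budget_ms: float) -> str:
    """Return the largest variant whose measured latency fits the budget."""
    for model in preference:
        if measured_ms[model] <= budget_ms:
            return model
    return preference[-1]  # nothing fits: fall back to the smallest

print(pick_variant(50.0))  # -> chronos-bolt-base with these numbers
```

The same selection logic works for a GPU-cost budget: swap the latency table for cost-per-request and keep the preference order.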
If simplicity matters, start with TimesFM
TimesFM ships a single 200M-parameter model. There is no size selection decision. It performs consistently across horizons and is a strong default when you want one model that works well without tuning.
If you need more than forecasting, look at MOMENT
MOMENT supports classification, anomaly detection, and imputation alongside forecasting. If your workflow touches multiple time series tasks, a single multi-task model can simplify your architecture.
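A multi-task model lets several task endpoints share one request builder. The `task` field and request shape below are assumptions for illustration, not the documented MOMENT or TSFM.ai interface:

```python
# Sketch: one MOMENT-style model serving several time series tasks
# through a shared request shape. The "task" field is an assumption,
# not the documented API.
SUPPORTED_TASKS = {"forecasting", "classification",
                   "anomaly_detection", "imputation"}

def build_request(task: str, values: list[float]) -> dict:
    if task not in SUPPORTED_TASKS:
        raise ValueError(f"unsupported task: {task}")
    return {"model": "moment-1.2-base", "task": task, "values": values}

series = [0.9, 1.1, 1.0, 5.7, 1.0, 0.8]  # 5.7 is a deliberate outlier

# Same series, same model, two different tasks -- no second service needed.
requests = [build_request(t, series)
            for t in ("forecasting", "anomaly_detection")]
```

The architectural win is that forecasting and anomaly detection differ only in a request field, so you deploy and monitor one model instead of two.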
If you need minimal footprint, consider Granite TTM
IBM Granite TTM is under 1M parameters. It is the lightest model in this group and can run on CPU or edge hardware. Use it when inference cost or deployment size is the binding constraint.
If you want LLM-style generation, try Lag-Llama
Lag-Llama uses autoregressive generation with lag-based tokenization. It is lightweight and produces good probabilistic forecasts on short-to-medium horizons, especially when you want the sampling flexibility of an LLM decoder.
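Because decoder-style models emit sample paths rather than a single point forecast, prediction intervals reduce to taking quantiles over the samples. A dependency-free sketch, with synthetic samples standing in for actual model output:

```python
# Synthetic stand-in for sample paths from an autoregressive model:
# 100 draws for a single forecast step, spanning 100.0 to 149.5.
samples = sorted(100.0 + 0.5 * i for i in range(100))

def quantile(sorted_samples: list[float], q: float) -> float:
    """Nearest-rank quantile over pre-sorted samples: simple and
    dependency-free, good enough for eyeballing intervals."""
    idx = min(int(q * len(sorted_samples)), len(sorted_samples) - 1)
    return sorted_samples[idx]

# An 80% prediction interval plus the median, straight from the samples.
p10, p50, p90 = (quantile(samples, q) for q in (0.1, 0.5, 0.9))
```

This is the sampling flexibility the section refers to: the same set of draws yields any quantile, interval, or exceedance probability you need, without re-querying the model.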
Evaluation workflow
Most teams converge on a production model in a few days by following this pattern.
1. Pick 2–3 representative series from your domain
   Choose series that reflect your typical data: frequency, length, noise level, and any covariates. Avoid cherry-picking your cleanest data; include a difficult case.
2. Run a 3-model shortlist through the API
   Use the TSFM.ai playground or API to forecast the same series with your top 3 candidates. Compare point accuracy, prediction intervals, and latency side by side.
3. Score on held-out windows
   Backtest each model on 2–3 held-out forecast windows from your historical data. Measure MASE, CRPS, or whatever accuracy metric your team uses for production decisions.
4. Promote the winner to production
   Lock in the model ID in your application config. The API request shape stays the same, so swapping models later is a one-line config change, not an integration rewrite.
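The scoring step above can be sketched end to end. The forecasts below are made-up numbers standing in for API responses; MASE itself is standard (mean absolute error scaled by the in-sample naive error):

```python
def mase(actual, forecast, history, m=1):
    """Mean Absolute Scaled Error: MAE of the forecast divided by the
    MAE of the in-sample lag-m naive forecast. Values below 1.0 beat
    the naive baseline."""
    mae = sum(abs(a - f) for a, f in zip(actual, forecast)) / len(actual)
    naive = sum(abs(history[i] - history[i - m])
                for i in range(m, len(history))) / (len(history) - m)
    return mae / naive

history = [10.0, 12.0, 11.0, 13.0, 12.0, 14.0]   # training window
actual = [13.0, 15.0]                             # held-out window

# Hypothetical model outputs -- in practice these come from the API.
candidates = {
    "chronos-bolt-base": [13.5, 14.5],
    "timesfm-2.0": [12.0, 16.0],
    "moirai-1.1-r-base": [13.0, 15.5],
}

scores = {name: mase(actual, fcst, history)
          for name, fcst in candidates.items()}
winner = min(scores, key=scores.get)  # lowest MASE wins
```

In a real backtest you would repeat this over 2–3 held-out windows and average the scores before promoting the winner, since a single window can flatter any model.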
Evaluate the models on your data
Pick 2–3 candidate models, send your series to the API, and compare accuracy before committing. The request shape is the same for all models.