Model Guide

Best time series foundation models in 2026

Six foundation models now cover most production forecasting needs. Each takes a different architectural bet — encoder-decoder, patched-decoder, universal transformer, masked pre-training, or LLM-style autoregression. This page maps the landscape so you can shortlist before you benchmark.

6 models compared · Architecture breakdown · Evaluation workflow
Model catalog: GET /v1/models
curl https://api.tsfm.ai/v1/models \
  -H "Authorization: Bearer $TSFM_API_KEY" | jq .

# Returns: chronos-bolt-*, timesfm-2.0,
#   moirai-1.1-r-*, moment-1.2-*,
#   lag-llama, granite-ttm-*

List all available models with metadata, context limits, and pricing.

Models covered

Chronos, TimesFM, Moirai, MOMENT, Lag-Llama, Granite TTM

Key dimensions

Architecture, context length, covariate support, speed

Evaluation

Compare any subset through one API with the same request shape

2026 foundation model comparison

A side-by-side view of the six models that matter most for production forecasting today.

How to choose a foundation model

No single model dominates every dimension. Match model properties to your workload constraints.

If covariates matter, start with Moirai

Moirai is the only model in this group that natively handles past and future covariates. If your forecast accuracy depends on external signals — promotions, weather, holidays — it has a structural advantage over univariate-only models.
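As a sketch, a covariate-aware forecast request carries the external signals alongside the target series. The field names below (`past_covariates`, `future_covariates`) are illustrative assumptions, not the documented TSFM.ai schema:

```python
# Illustrative request payload for a covariate-aware model such as Moirai.
# Field names ("past_covariates", "future_covariates") are assumptions,
# not the documented TSFM.ai schema.
def build_covariate_request(target, promo_past, promo_future, horizon):
    return {
        "model": "moirai-1.1-r-base",
        "series": target,                              # historical target values
        "past_covariates": {"promo": promo_past},      # aligned with history
        "future_covariates": {"promo": promo_future},  # must cover the horizon
        "horizon": horizon,
    }

req = build_covariate_request(
    target=[102, 98, 120, 115],
    promo_past=[0, 0, 1, 0],   # was a promotion running?
    promo_future=[1, 1, 0],    # known future promotion schedule
    horizon=3,
)
# Future covariates must be known for every step you ask the model to predict.
assert len(req["future_covariates"]["promo"]) == req["horizon"]
```

The key structural point: future covariates must be known ahead of time (promotion calendars, holidays), while past covariates only need to align with the history.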

If latency and size flexibility matter, start with Chronos

Chronos offers mini through large variants so you can trade off accuracy for speed at the model level. This matters when you serve forecasts in a latency-sensitive path or need to control GPU cost per request.
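One way to operationalize that trade-off is to pick the largest variant that fits your latency budget. The per-request latencies below are made-up placeholders; measure your own on your hardware:

```python
# Pick the largest Chronos variant that fits a latency budget.
# Latency numbers are placeholders, not benchmarks.
CHRONOS_VARIANTS = [
    ("chronos-bolt-mini",  10),   # (model id, assumed p50 latency in ms)
    ("chronos-bolt-small", 25),
    ("chronos-bolt-base",  60),
    ("chronos-bolt-large", 150),
]

def pick_variant(budget_ms):
    # Variants are ordered smallest to largest; take the biggest that fits.
    fitting = [m for m, lat in CHRONOS_VARIANTS if lat <= budget_ms]
    return fitting[-1] if fitting else None

assert pick_variant(80) == "chronos-bolt-base"
```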

If simplicity matters, start with TimesFM

TimesFM ships as a single 200M-parameter model, so there is no size decision to make. It performs consistently across horizons and is a strong default when you want one model that works well without tuning.

If you need more than forecasting, look at MOMENT

MOMENT supports classification, anomaly detection, and imputation alongside forecasting. If your workflow touches multiple time series tasks, a single multi-task model can simplify your architecture.
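In a multi-task setup, the same series fans out to different tasks by changing one field. The `task` values and payload shape here are illustrative, not the documented TSFM.ai schema:

```python
# One series, several tasks, one multi-task model.
# The "task" field and model id suffix are illustrative assumptions.
series = [5.0, 5.1, 9.8, 5.2, 5.0]

def moment_request(task, **extra):
    return {"model": "moment-1.2-base", "series": series, "task": task, **extra}

requests = [
    moment_request("forecast", horizon=12),
    moment_request("anomaly_detection"),
    moment_request("imputation"),
]
assert {r["task"] for r in requests} == {"forecast", "anomaly_detection", "imputation"}
```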

If you need minimal footprint, consider Granite TTM

IBM Granite TTM is under 1M parameters. It is the lightest model in this group and can run on CPU or edge hardware. Use it when inference cost or deployment size is the binding constraint.

If you want LLM-style generation, try Lag-Llama

Lag-Llama uses autoregressive generation with lag-based tokenization. It is lightweight and produces good probabilistic forecasts on short-to-medium horizons, especially when you want the sampling flexibility of an LLM decoder.
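The practical payoff of sampling-based generation is that prediction intervals fall out of the sample paths directly. A minimal sketch with toy numbers, using empirical deciles (this works for any sampling-based model, not just Lag-Llama):

```python
# Turn forecast sample paths into an empirical prediction interval.
# The sample values are toy data, not real model output.
import statistics

def interval(samples_at_t):
    """Approximate 10th/90th percentile interval from samples at one step."""
    qs = statistics.quantiles(samples_at_t, n=10)  # decile cut points
    return qs[0], qs[-1]

samples = [8, 9, 10, 10, 11, 11, 12, 12, 13, 15]
lo, hi = interval(samples)
assert lo < statistics.median(samples) < hi
```

Drawing more sample paths tightens the quantile estimates; that sampling flexibility is the main reason to reach for an LLM-style decoder.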

Evaluation workflow

Most teams converge on a production model in a few days by following this pattern.

  1. Pick 2–3 representative series from your domain

    Choose series that reflect your typical data — frequency, length, noise level, and any covariates. Avoid cherry-picking your cleanest data; include a difficult case.

  2. Run a 3-model shortlist through the API

    Use the TSFM.ai playground or API to forecast the same series with your top 3 candidates. Compare point accuracy, prediction intervals, and latency side by side.

  3. Score on held-out windows

    Backtest each model on 2–3 held-out forecast windows from your historical data. Measure MASE, CRPS, or whatever accuracy metric your team uses for production decisions.

  4. Promote the winner to production

    Lock in the model ID in your application config. The API request shape stays the same, so swapping models later is a one-line config change — not an integration rewrite.
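The scoring in step 3 reduces to a few lines of code. A minimal MASE sketch over one held-out window, with stub values standing in for the forecasts an API call would return:

```python
# Minimal MASE backtest over a held-out window. In practice the
# "forecast" values come from the API; here they are stubs.
def mase(actual, forecast, history, season=1):
    """Mean absolute scaled error: forecast MAE scaled by the MAE of a
    naive seasonal forecast on the in-sample history."""
    mae = sum(abs(a - f) for a, f in zip(actual, forecast)) / len(actual)
    naive = sum(abs(history[i] - history[i - season])
                for i in range(season, len(history))) / (len(history) - season)
    return mae / naive

history = [10, 12, 11, 13, 12, 14]   # in-sample data
actual = [13, 15]                    # held-out window
forecast = [12.5, 14.0]              # stub model output
score = mase(actual, forecast, history)
assert score < 1.0   # below 1.0 means it beats the naive forecast
```

A MASE below 1.0 means the model beats a naive repeat-last-value forecast, which is a useful sanity floor before comparing candidates against each other.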


Evaluate the models on your data

Pick 2–3 candidate models, send your series to the API, and compare accuracy before committing. The request shape is the same for all models.
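Because the request shape is identical across models, the comparison is just a loop over model IDs. A sketch with illustrative payload fields (the field names are assumptions, not the documented schema):

```python
# Build identical requests for each candidate; only the model ID differs.
# Payload field names are illustrative, not the documented TSFM.ai schema.
CANDIDATES = ["chronos-bolt-base", "timesfm-2.0", "moirai-1.1-r-base"]
series = [112, 118, 132, 129, 141, 154]

payloads = [
    {"model": m, "series": series, "horizon": 8, "quantiles": [0.1, 0.5, 0.9]}
    for m in CANDIDATES
]

# Everything except the model ID is the same in every request.
assert all(
    {k: v for k, v in p.items() if k != "model"} ==
    {k: v for k, v in payloads[0].items() if k != "model"}
    for p in payloads
)
```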