TTM-R3

online

ibm-research/ttm-r3

~1.4M (Lite) to ~35M params | 512 context | $0.00025 per forecast | CC-BY-NC-SA-4.0

TTM-R3 is IBM Research's March 31, 2026 refresh of the TinyTimeMixer family and the most capable rung of the R1 → R2 → R3 ladder. Although it is still classified as `tinytimemixer` under the hood, it moves the line from a point forecaster toward a probabilistic, structure-aware one without abandoning the compact footprint that defines TTM. IBM reports a 15–50x inference speedup over state-of-the-art forecasters, alongside a meaningful accuracy improvement over TTM-R2.

Architecturally, R3 adds trend-residual decomposition, a multi-quantile probabilistic forecasting head, gated attention, FFT-based frequency embeddings, and learnable sequence-level register tokens, while preserving the compact ~1.4M (Lite) to ~35M parameter footprint that keeps TTM practical on CPUs and lightweight hosted inference. It is trained on IBM's GiftEvalPretrain subset plus KernelSynth-style synthetic augmentation, a corpus aligned with the broader TTM family rather than narrowed to a single benchmark. One caveat to plan around: the current upstream checkpoint and toolkit path still emit uninitialized-weight warnings during direct load, so validate forecast quality on short deterministic trend continuations against your own holdout before relying on the model in production.
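To make "trend-residual decomposition" concrete, here is a minimal sketch that splits a series into a smooth trend and a residual using a centered moving average. This is an illustrative stand-in, not R3's internal mechanism, which the description above does not detail; the function name `decompose_trend_residual` and the `window` parameter are assumptions chosen for this example.

```python
import numpy as np


def decompose_trend_residual(
    series: np.ndarray, window: int = 25
) -> tuple[np.ndarray, np.ndarray]:
    """Split a 1-D series into (trend, residual) with a centered moving average.

    Illustrative only: a classical decomposition, not TTM-R3's learned one.
    """
    if window < 1 or window % 2 == 0:
        raise ValueError("window must be a positive odd integer")
    pad = window // 2
    # Edge-pad so the smoothed trend has the same length as the input.
    padded = np.pad(series, pad, mode="edge")
    kernel = np.ones(window) / window
    trend = np.convolve(padded, kernel, mode="valid")
    residual = series - trend
    return trend, residual


# Synthetic example: a linear trend plus a 24-step seasonal cycle,
# sized to the 512-point context length.
t = np.arange(512)
series = 0.05 * t + np.sin(2 * np.pi * t / 24)
trend, residual = decompose_trend_residual(series, window=25)
```

By construction `trend + residual` reproduces the input, so the two components can be modeled separately and recombined without loss.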

On TSFM.ai, reach for R3 when you need quantile and probabilistic outputs from the tiny TTM design, or want the decomposition-aware accuracy gains over R2, but mind the license. R3 is hosted under a pass-through compute posture and ships under CC-BY-NC-SA-4.0 (research / non-commercial use only), so you are responsible for ensuring your intended use falls within the upstream license; see section 7 of our Terms of Service. For commercial point forecasting at the same tiny footprint, stay on the Apache-2.0-licensed TTM-R2 instead.
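If you do adopt R3 for its quantile outputs, you will want a way to score them. The sketch below shows two standard checks: pinball (quantile) loss for a single quantile level, and empirical coverage of a lower/upper band. The arrays are made-up illustrative values, not model output, and the helper names are assumptions for this example.

```python
import numpy as np


def pinball_loss(y_true: np.ndarray, y_pred: np.ndarray, q: float) -> float:
    """Mean pinball loss of forecast y_pred for quantile level q (0 < q < 1)."""
    diff = y_true - y_pred
    # Under-forecasts are weighted by q, over-forecasts by (1 - q).
    return float(np.mean(np.maximum(q * diff, (q - 1) * diff)))


def interval_coverage(
    y_true: np.ndarray, lower: np.ndarray, upper: np.ndarray
) -> float:
    """Fraction of actual values that fall inside [lower, upper]."""
    return float(np.mean((y_true >= lower) & (y_true <= upper)))


# Hypothetical actuals and quantile forecasts (illustrative values only).
y_true = np.array([10.0, 12.0, 11.0, 13.0])
q10 = np.array([9.0, 10.0, 10.0, 11.0])
q50 = np.array([10.0, 11.0, 11.0, 12.0])
q90 = np.array([11.0, 13.0, 12.5, 12.5])

median_loss = pinball_loss(y_true, q50, 0.5)    # 0.25
coverage = interval_coverage(y_true, q10, q90)  # 0.75
```

A well-calibrated 10%–90% band should cover roughly 80% of holdout points over a large enough sample; four points, as here, are far too few to judge calibration.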

Model Classification

Family

TinyTimeMixer

Type

time series foundation model

Pretrained time-series model exposed on TSFM.ai for zero-shot or few-shot forecasting workloads.

Resources

HuggingFace · Paper

Training Data

IBM's GiftEvalPretrain subset plus KernelSynth-style synthetic augmentation; corpus aligned with the TTM family rather than narrowed to a single benchmark.

Recommended For

• CPU-friendly or latency-sensitive forecasting baselines
• Fast zero-shot checks before escalating to larger TSFMs

Strengths

• Very small checkpoints with efficient deployment characteristics
• Useful lightweight baseline for standard public forecasting workloads

Limitations

• Lower ceiling than larger modern TSFM families on broad zero-shot leaderboards
• Checkpoint families are tuned around specific context and prediction settings
• The hosted TTM-R3 path still needs careful validation on short deterministic trend extrapolation before using it as a default production forecaster
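One way to run the validation suggested above is to compare the candidate forecaster with a plain linear extrapolation on a held-out tail of a trending series. The sketch below is model-agnostic: `Forecaster` is a hypothetical callable signature, `last_value` is only a stand-in where you would plug in your own call to the hosted model, and the 1.1 tolerance is an arbitrary assumption.

```python
from typing import Callable

import numpy as np

# Hypothetical signature: (history, horizon) -> point forecast of length horizon.
Forecaster = Callable[[np.ndarray, int], np.ndarray]


def linear_extrapolation(history: np.ndarray, horizon: int) -> np.ndarray:
    """Baseline: fit a straight line to the history and extend it."""
    x = np.arange(len(history))
    slope, intercept = np.polyfit(x, history, deg=1)
    future_x = np.arange(len(history), len(history) + horizon)
    return slope * future_x + intercept


def last_value(history: np.ndarray, horizon: int) -> np.ndarray:
    """Stand-in candidate: repeat the last observation (replace with your model)."""
    return np.full(horizon, history[-1])


def holdout_mae(series: np.ndarray, horizon: int, forecaster: Forecaster) -> float:
    """Hold out the last `horizon` points and return the forecaster's MAE on them."""
    history, actual = series[:-horizon], series[-horizon:]
    return float(np.mean(np.abs(forecaster(history, horizon) - actual)))


# Short, smooth, deterministic trend: the case flagged in the limitation above.
series = 2.0 + 0.5 * np.arange(64)
horizon = 8

baseline_mae = holdout_mae(series, horizon, linear_extrapolation)
candidate_mae = holdout_mae(series, horizon, last_value)

# Assumed acceptance rule: candidate must be within 10% of the baseline error
# (plus a small absolute floor so a near-zero baseline is not impossible to match).
passes = candidate_mae <= 1.1 * baseline_mae + 1e-6
```

Here the stand-in candidate fails the check because it ignores the trend, which is the kind of behavior the holdout is meant to catch before a model becomes a default route.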

Not Ideal For

• Default routing for short smooth trend-continuation workloads without a holdout check
• Commercial production use unless the upstream CC-BY-NC-SA-4.0 license fits your use case

Capabilities

forecasting · quantile-forecasting · probabilistic-forecasting · multivariate · zero-shot · high-throughput

Specifications

Parameters: ~1.4M (Lite) to ~35M
Architecture: TinyTimeMixer with trend-residual decomposition, gated attention, multi-quantile head, and FFT embeddings
Context length: 512
Max context: 512
Minimum history: n/a
Recommended history: 512
Input step: n/a
Required target series: 1
Temperature: Ignored
Top P: Ignored
Max output: 1,024
Avg latency: n/a
Uptime: n/a
Plan limits: 1,000 rpm free · 1,000,000 rpm with billing
Accelerator: T4
Regions: Virginia, US
License: CC-BY-NC-SA-4.0
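The context and output limits above suggest some client-side preparation before a request. The following sketch keeps the most recent 512 observations and rejects horizons above 1,024. It assumes "Max output" counts forecast steps, and `prepare_context` is a hypothetical helper, not part of any TSFM.ai SDK.

```python
import numpy as np

MAX_CONTEXT = 512   # "Max context" from the specifications
MAX_OUTPUT = 1024   # "Max output"; assumed here to mean forecast steps


def prepare_context(history, horizon: int) -> np.ndarray:
    """Validate the horizon and keep only the most recent MAX_CONTEXT points."""
    if not 1 <= horizon <= MAX_OUTPUT:
        raise ValueError(f"horizon must be between 1 and {MAX_OUTPUT}")
    values = np.asarray(history, dtype=float)
    if values.ndim != 1 or values.size == 0:
        raise ValueError("history must be a non-empty 1-D series")
    # Negative slicing leaves shorter histories untouched.
    return values[-MAX_CONTEXT:]


long_history = np.arange(2000)
context = prepare_context(long_history, horizon=96)   # last 512 points
short_context = prepare_context(np.arange(100), horizon=96)  # unchanged length
```

Trimming on the client keeps request payloads small and makes explicit which observations the model can see.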

Pricing

Per forecast: $0.00025
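For budgeting, the per-forecast price and the plan limits listed on this page translate into simple arithmetic. The sketch below assumes each forecast is billed as one unit and sent as one request; how batched or multi-series calls are metered is not stated here, so treat both helpers as rough planning aids.

```python
from decimal import Decimal

PRICE_PER_FORECAST = Decimal("0.00025")  # USD, from the pricing line above
FREE_RPM = 1_000                         # free-plan limit, requests per minute


def estimate_cost(n_forecasts: int) -> Decimal:
    """Total USD cost, assuming one billed unit per forecast."""
    if n_forecasts < 0:
        raise ValueError("n_forecasts must be non-negative")
    return PRICE_PER_FORECAST * n_forecasts


def min_minutes_at_limit(n_requests: int, rpm: int = FREE_RPM) -> int:
    """Lower bound on wall-clock minutes if requests are capped at `rpm`."""
    if n_requests < 0 or rpm <= 0:
        raise ValueError("n_requests must be >= 0 and rpm must be > 0")
    return -(-n_requests // rpm)  # integer ceiling division


daily_cost = estimate_cost(10_000)            # Decimal('2.50000'), i.e. $2.50
million_cost = estimate_cost(1_000_000)       # $250
daily_minutes = min_minutes_at_limit(10_000)  # 10 minutes at 1,000 rpm
```

`Decimal` is used instead of `float` so that small per-unit prices accumulate without binary rounding error.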

Performance

Average latency: n/a
Availability: n/a
Plan limits: 1,000 rpm free · 1,000,000 rpm with billing