TTM-R3
Online | ibm-research/ttm-r3 | ~1.4M (Lite) to ~35M params | 512 context | $0.00025 per forecast | CC-BY-NC-SA-4.0
TTM-R3 is IBM Research's March 31, 2026 refresh of the TinyTimeMixer family. It is still classified as `tinytimemixer` under the hood, but R3 adds trend-residual decomposition, a multi-quantile probabilistic forecasting head, gated attention, FFT-based frequency embeddings, and learnable sequence-level register tokens. It keeps the compact ~1.4M–35M parameter footprint that makes TTM practical on CPUs and lightweight hosted inference.
IBM reports a 15–50x inference speedup over state-of-the-art forecasters and a meaningful accuracy improvement over TTM-R2. However, the current upstream checkpoint and toolkit path still emit uninitialized-weight warnings during direct load, so validate short deterministic continuation quality on your own holdout before relying on the model in production.
TTM-R3 is hosted on TSFM.ai under a pass-through compute posture. It ships under the CC-BY-NC-SA-4.0 license (research / non-commercial use only), so you are responsible for ensuring that your intended use falls within the upstream license; see section 7 of our Terms of Service.
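Forecasts from a multi-quantile head like the one described above are commonly scored with pinball (quantile) loss. The sketch below is a generic scoring helper, not IBM's training objective or a TSFM.ai API; the function names and the dictionary layout (quantile level mapped to a list of predictions) are assumptions made for illustration.

```python
from typing import Dict, List, Sequence


def pinball_loss(actual: float, predicted: float, q: float) -> float:
    """Pinball loss for one point at quantile level q (0 < q < 1)."""
    if not 0.0 < q < 1.0:
        raise ValueError("q must be strictly between 0 and 1")
    diff = actual - predicted
    # Under-prediction is weighted by q, over-prediction by (1 - q).
    return q * diff if diff >= 0 else (q - 1.0) * diff


def mean_pinball_loss(
    actuals: Sequence[float],
    quantile_forecasts: Dict[float, List[float]],
) -> float:
    """Average pinball loss over all quantile levels and time steps.

    quantile_forecasts maps a quantile level (e.g. 0.1, 0.5, 0.9) to a
    list of predictions aligned with actuals. This layout is hypothetical.
    """
    if not quantile_forecasts:
        raise ValueError("at least one quantile forecast is required")
    total, count = 0.0, 0
    for q, preds in quantile_forecasts.items():
        if len(preds) != len(actuals):
            raise ValueError("forecast length must match actuals")
        for a, p in zip(actuals, preds):
            total += pinball_loss(a, p, q)
            count += 1
    return total / count
```

A lower mean pinball loss on your own holdout indicates better-calibrated quantiles. Comparing it across candidate models is one way to check the probabilistic output before relying on it.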
Model Classification
Family: TinyTimeMixer
Type: time series foundation model
Pretrained time-series model exposed on TSFM.ai for zero-shot or few-shot forecasting workloads.
Resources
Training Data
IBM's GiftEvalPretrain subset plus KernelSynth-style synthetic augmentation; corpus aligned with the TTM family rather than narrowed to a single benchmark.
Recommended For
- CPU-friendly or latency-sensitive forecasting baselines
- Fast zero-shot checks before escalating to larger TSFMs
Strengths
- Very small checkpoints with efficient deployment characteristics
- Useful lightweight baseline for standard public forecasting workloads
Limitations
- Lower ceiling than larger modern TSFM families on broad zero-shot leaderboards
- Checkpoint families are tuned around specific context and prediction settings
- The hosted TTM-R3 path still needs careful validation on short deterministic trend extrapolation before using it as a default production forecaster
Not Ideal For
- Default routing for short smooth trend-continuation workloads without a holdout check
- Commercial production use unless the upstream CC-BY-NC-SA-4.0 license fits your use case
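The holdout check recommended above can be sketched as a comparison against two trivial baselines. This is a minimal illustration, not a TSFM.ai client: `candidate` stands for any hypothetical callable that takes a history and a horizon and returns point forecasts, and the baseline choices (last value, straight-line drift) are assumptions.

```python
from typing import Callable, Dict, List, Sequence

# Hypothetical forecaster signature: (history, horizon) -> point forecasts.
Forecaster = Callable[[Sequence[float], int], List[float]]


def mae(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Mean absolute error between two equally long sequences."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def naive_last_value(history: Sequence[float], horizon: int) -> List[float]:
    """Repeat the last observed value for every future step."""
    return [history[-1]] * horizon


def linear_drift(history: Sequence[float], horizon: int) -> List[float]:
    """Extend the straight line through the first and last observations."""
    slope = (history[-1] - history[0]) / (len(history) - 1)
    return [history[-1] + slope * (h + 1) for h in range(horizon)]


def holdout_report(
    series: Sequence[float], horizon: int, candidate: Forecaster
) -> Dict[str, float]:
    """Hold out the last `horizon` points and score candidate vs baselines."""
    if horizon < 1 or len(series) < horizon + 2:
        raise ValueError("series must leave at least 2 points of history")
    history, actual = series[:-horizon], series[-horizon:]
    return {
        "candidate": mae(actual, candidate(history, horizon)),
        "naive": mae(actual, naive_last_value(history, horizon)),
        "drift": mae(actual, linear_drift(history, horizon)),
    }
```

If the candidate's error on a smooth trending series is not clearly below the drift baseline, that is a signal to route such workloads elsewhere or investigate further.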
Specifications
- Parameters: ~1.4M (Lite) to ~35M
- Architecture: TinyTimeMixer with trend-residual decomposition, gated attention, multi-quantile head, and FFT embeddings
- Context length: 512
- Max context: 512
- Minimum history: n/a
- Recommended history: 512
- Input step: n/a
- Required target series: 1
- Temperature: Ignored
- Top P: Ignored
- Max output: 1,024
- Avg latency: n/a
- Uptime: n/a
- Plan limits: 1,000 rpm free · 1,000,000 rpm with billing
- Accelerator: T4
- Regions: Virginia, US
- License: CC-BY-NC-SA-4.0
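Given the 512-point context and 1,024 maximum output listed above, a client can trim inputs before sending them. The helper below is a sketch under assumptions: `prepare_request` is a hypothetical name, "Max output" is read as forecast steps, and whether the hosted service truncates or rejects oversized inputs on its own is not stated here.

```python
from typing import List, Sequence, Tuple

CONTEXT_LENGTH = 512  # "Context length" / "Max context" from the spec list
MAX_OUTPUT = 1024     # "Max output", assumed to be forecast steps


def prepare_request(
    history: Sequence[float], horizon: int
) -> Tuple[List[float], int]:
    """Keep the most recent CONTEXT_LENGTH points and validate the horizon."""
    if not history:
        raise ValueError("history must not be empty")
    if horizon < 1:
        raise ValueError("horizon must be at least 1")
    if horizon > MAX_OUTPUT:
        raise ValueError(f"horizon {horizon} exceeds max output {MAX_OUTPUT}")
    # Older points beyond the context window are dropped on the client side.
    window = list(history[-CONTEXT_LENGTH:])
    return window, horizon
```

Trimming on the client makes it explicit which observations the model can see, which helps when comparing results against a local baseline.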
Pricing
- Per forecast: $0.00025
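As a rough budgeting aid, the per-forecast price above can be turned into a cost estimate. This sketch assumes each request for one series counts as one billed forecast, which this page does not define; `forecast_cost` is a hypothetical helper name.

```python
from decimal import Decimal

PRICE_PER_FORECAST = Decimal("0.00025")  # USD, from the pricing list


def forecast_cost(n_forecasts: int) -> Decimal:
    """Estimated cost in USD for a number of billed forecasts."""
    if n_forecasts < 0:
        raise ValueError("n_forecasts must be non-negative")
    # Decimal avoids binary floating-point rounding in currency math.
    return PRICE_PER_FORECAST * n_forecasts


# Example: 10,000 series forecast once a day for 30 days.
monthly = forecast_cost(10_000 * 30)
```

Under that assumption, 300,000 forecasts come to $75 and one million forecasts to $250.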
Performance
- Average latency: n/a
- Availability: n/a
- Plan limits: 1,000 rpm free · 1,000,000 rpm with billing
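To stay under the plan limits above, a client can space requests evenly. The pacer below is a minimal sketch: it assumes the limit can be respected by keeping a fixed minimum gap between calls (60 / rpm seconds), while the service's actual enforcement window is not documented here. The `Pacer` class and its injectable `clock` and `sleep` parameters are hypothetical.

```python
import time
from typing import Callable, Optional


class Pacer:
    """Enforce a minimum gap between calls for a requests-per-minute cap."""

    def __init__(
        self,
        requests_per_minute: float,
        clock: Callable[[], float] = time.monotonic,
        sleep: Callable[[float], None] = time.sleep,
    ) -> None:
        if requests_per_minute <= 0:
            raise ValueError("requests_per_minute must be positive")
        self.interval = 60.0 / requests_per_minute  # seconds between calls
        self._clock = clock
        self._sleep = sleep
        self._next_allowed: Optional[float] = None

    def wait(self) -> None:
        """Block until the next request is allowed, then reserve a slot."""
        now = self._clock()
        if self._next_allowed is not None and now < self._next_allowed:
            self._sleep(self._next_allowed - now)
            now = self._next_allowed
        self._next_allowed = now + self.interval
```

Calling `Pacer(1000).wait()` before each request keeps a single client at or below 1,000 rpm, with a gap of 0.06 seconds between calls.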