MOMENT-Base
online | AutonLab/MOMENT-1-base | ~125M params | 512 context | $0.00025 per forecast | MIT
MOMENT-Base is the mid-size checkpoint in AutonLab's MOMENT family, providing a balance between the lightweight Small variant and the higher-capacity Large model. It shares the same masked time-series modeling architecture and multi-task transfer capabilities as the rest of the family. The upstream momentfm package warns that only the reconstruction head is pretrained and that forecasting heads must be fine-tuned, so zero-shot forecast quality should be treated as secondary to backbone reuse.
Model Classification
Family: MOMENT
Type: time-series foundation model
Pretrained time-series model exposed on TSFM.ai for zero-shot or few-shot forecasting workloads.
Resources
Training Data
Timeseries-PILE, built from public forecasting, classification, and anomaly-detection corpora including Informer datasets, Monash, UCR/UEA, and TSB-UAD.
Recommended For
- Shared backbones across forecasting, anomaly detection, classification, and imputation
- Teams that want one general-purpose time-series representation model
Strengths
- Broadest multi-task scope in the hosted catalog
- Useful when the same deployment needs to cover several downstream tasks
Limitations
- Not optimized purely around one forecasting leaderboard objective
- May be heavier than needed if you only need straightforward zero-shot forecasting
- The upstream momentfm package warns that only the reconstruction head is pretrained and that forecasting heads must be fine-tuned
- Hosted forecasting quality can lag specialist zero-shot forecasters because MOMENT is primarily framed as a transferable representation backbone
- Zero-shot forecasting tends to be trend-blind: predictions may flatten regardless of clear trends in the input
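The trend-blindness limitation can be checked on your own data before relying on a zero-shot forecast. The sketch below is a hypothetical heuristic, not part of MOMENT or TSFM.ai: it compares the least-squares slope of the returned forecast with the slope of the input history. The function names and the `min_ratio` threshold are assumptions chosen for illustration.

```python
from statistics import fmean


def ols_slope(values):
    """Least-squares slope of values against their index (0, 1, 2, ...)."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two points to estimate a slope")
    x_mean = (n - 1) / 2
    y_mean = fmean(values)
    num = sum((i - x_mean) * (y - y_mean) for i, y in enumerate(values))
    den = sum((i - x_mean) ** 2 for i in range(n))
    return num / den


def looks_flattened(history, forecast, min_ratio=0.25):
    """Hypothetical check: True when the forecast slope is much smaller than
    the history slope, or points in the opposite direction.

    min_ratio is an illustrative threshold, not a documented value.
    """
    hist_slope = ols_slope(history)
    if hist_slope == 0:
        return False  # no trend in the input, so nothing to flatten
    return ols_slope(forecast) / hist_slope < min_ratio
```

A check like this is only a coarse screen. It assumes the history has a roughly linear trend, so seasonal or noisy series may need detrending or a longer window before the slopes are comparable.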
Not Ideal For
- Choosing a default forecast-only model when you mostly care about zero-shot continuation quality
- Short benchmark-style trend extrapolation where specialist forecasting families are stronger
Capabilities
Specifications
- Parameters: ~125M
- Architecture: patch-based encoder-only transformer trained with masked time-series modeling
- Context length: 512
- Max context: 512
- Minimum history: n/a
- Recommended history: 512
- Input step: n/a
- Required target series: 1
- Temperature: Ignored
- Top P: Ignored
- Max output: 1,024
- Avg latency: n/a
- Uptime: n/a
- Plan limits: 1,000 rpm free · 1,000,000 rpm with billing
- Accelerator: T4
- Regions: Virginia, US
- License: MIT
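The context and output limits above suggest a small client-side guard before sending a series. The sketch below is a minimal illustration under stated assumptions: the listing does not say how the hosted endpoint treats inputs longer than 512 points, so keeping only the most recent 512 values is an assumption, and `prepare_window` is a hypothetical helper, not a TSFM.ai or momentfm function.

```python
MAX_CONTEXT = 512    # "Max context" from the specifications
MAX_HORIZON = 1024   # "Max output" from the specifications


def prepare_window(series, horizon):
    """Return the most recent MAX_CONTEXT points of a single target series.

    Assumption: truncating to the latest points is acceptable; the listing
    does not document the service's own handling of longer inputs.
    """
    if not 1 <= horizon <= MAX_HORIZON:
        raise ValueError(f"horizon must be between 1 and {MAX_HORIZON}")
    if len(series) == 0:
        raise ValueError("series must contain at least one observation")
    return list(series[-MAX_CONTEXT:])
```

Because the recommended history equals the maximum context, series shorter than 512 points pass through unchanged here; whether they should be padded is left open, since the listing gives no minimum history.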
Pricing
- Per forecast: $0.00025
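For budgeting, the per-forecast price and the plan limits listed on this page can be combined into a rough estimate. The sketch below assumes that one forecast corresponds to one request against the rpm limit, which the listing does not state; the function names are illustrative.

```python
from decimal import Decimal

PRICE_PER_FORECAST = Decimal("0.00025")  # USD per forecast, from the pricing entry
FREE_RPM = 1_000                          # free-plan requests per minute


def forecast_cost(n_forecasts):
    """Total cost in USD for n_forecasts at the listed per-forecast price."""
    return PRICE_PER_FORECAST * n_forecasts


def minutes_at_limit(n_forecasts, rpm=FREE_RPM):
    """Minimum whole minutes needed to issue n_forecasts at a given rpm.

    Assumes one forecast counts as one request (not confirmed by the listing).
    """
    return -(-n_forecasts // rpm)  # ceiling division
```

Under these assumptions, one million forecasts cost $250 and need at least 1,000 minutes at the 1,000 rpm limit.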
Performance
- Average latency: n/a
- Availability: n/a
- Plan limits: 1,000 rpm free · 1,000,000 rpm with billing