MOMENT-Base

online

AutonLab/MOMENT-1-base

~125M params | 512 context | $0.00025 per forecast | MIT

MOMENT-Base is the mid-size checkpoint in AutonLab's MOMENT family, providing a balance between the lightweight Small variant and the higher-capacity Large model. It is the natural default when MOMENT-Small leaves quality on the table but MOMENT-Large is more backbone than a task needs.

It shares the architecture and multi-task transfer capabilities of the rest of the family: a patch-based encoder-only transformer trained with masked time-series modeling, producing representations that transfer across forecasting, classification, anomaly detection, imputation, and embedding extraction. It was trained on Timeseries-PILE, built from public forecasting, classification, and anomaly-detection corpora including Informer datasets, Monash, UCR/UEA, and TSB-UAD. The upstream momentfm package warns that only the reconstruction head is pretrained and that forecasting heads must be fine-tuned, so zero-shot forecast quality should be treated as secondary to backbone reuse.
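
As a rough illustration of the masked time-series modeling objective described above, the sketch below splits a 512-point context into fixed-length patches and hides a random subset of them, which is the kind of input a reconstruction head learns to fill in. The patch length of 8 and the 30% mask ratio are assumptions chosen for illustration, not values stated on this page, and the function names are hypothetical. This is plain Python, not the momentfm API.

```python
import random

CONTEXT_LENGTH = 512   # context length from the specifications on this page
PATCH_LENGTH = 8       # assumption for illustration
MASK_RATIO = 0.3       # assumption for illustration


def to_patches(series, patch_length=PATCH_LENGTH):
    """Split a series into non-overlapping, equal-length patches."""
    if len(series) % patch_length != 0:
        raise ValueError("series length must be a multiple of patch_length")
    return [series[i:i + patch_length] for i in range(0, len(series), patch_length)]


def mask_patches(patches, mask_ratio=MASK_RATIO, seed=0):
    """Return (masked_patches, mask); mask[i] is True where patch i is hidden.

    Hidden patches are replaced by zeros here. A real model would typically
    substitute a learned mask embedding instead.
    """
    rng = random.Random(seed)
    n_masked = int(round(len(patches) * mask_ratio))
    hidden = set(rng.sample(range(len(patches)), n_masked))
    mask = [i in hidden for i in range(len(patches))]
    masked = [[0.0] * len(p) if m else list(p) for p, m in zip(patches, mask)]
    return masked, mask


series = [float(t % 24) for t in range(CONTEXT_LENGTH)]  # toy daily-cycle signal
patches = to_patches(series)
masked, mask = mask_patches(patches)
```

During pretraining of this kind, the reconstruction loss would be computed only on the hidden patches, which is why the reconstruction head is the one that arrives pretrained.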

On TSFM.ai, reach for MOMENT-Base as the balanced MOMENT backbone for multi-task work where you want more capacity than the Small variant without jumping to the Large model. Drop to MOMENT-Small when footprint dominates, or step up to MOMENT-Large when a downstream task rewards the extra capacity.

Model Classification

Family

MOMENT

Type

time series foundation model

Pretrained time-series model exposed on TSFM.ai for zero-shot or few-shot forecasting workloads.

Resources

HuggingFace | Paper

Training Data

Timeseries-PILE, built from public forecasting, classification, and anomaly-detection corpora including Informer datasets, Monash, UCR/UEA, and TSB-UAD.

Recommended For

• Shared backbones across forecasting, anomaly detection, classification, and imputation
• Teams that want one general-purpose time-series representation model

Strengths

• Broadest multi-task scope in the hosted catalog
• Useful when the same deployment needs to cover several downstream tasks

Limitations

• Not optimized purely around one forecasting leaderboard objective
• May be heavier than needed if you only need straightforward zero-shot forecasting
• The upstream momentfm package warns that only the reconstruction head is pretrained and that forecasting heads must be fine-tuned
• Hosted forecasting quality can lag specialist zero-shot forecasters because MOMENT is primarily framed as a transferable representation backbone
• Zero-shot forecasting tends to be trend-blind: predictions may flatten even when the input shows a clear trend
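
One way to check for the trend flattening described in the last limitation is to compare the least-squares slope of the recent history with the slope of the returned forecast. The sketch below is a generic diagnostic in plain Python. The function names, the 0.25 threshold, and the toy forecast values are illustrative assumptions, not part of TSFM.ai or momentfm.

```python
def ols_slope(values):
    """Least-squares slope of values against their index 0..n-1."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two points")
    mean_x = (n - 1) / 2
    mean_y = sum(values) / n
    cov = sum((i - mean_x) * (v - mean_y) for i, v in enumerate(values))
    var = sum((i - mean_x) ** 2 for i in range(n))
    return cov / var


def looks_flattened(history, forecast, ratio_threshold=0.25):
    """Flag forecasts whose slope is much smaller than the history's slope.

    ratio_threshold is an arbitrary illustrative cutoff, not a tuned value.
    """
    hist_slope = ols_slope(history)
    if hist_slope == 0:
        return False  # no trend in the history, so nothing to flatten
    return abs(ols_slope(forecast)) < ratio_threshold * abs(hist_slope)


history = [10.0 + 0.5 * t for t in range(512)]        # clear upward trend
flat_forecast = [history[-1]] * 24                     # hypothetical flat output
trending_forecast = [history[-1] + 0.5 * (h + 1) for h in range(24)]

flat_flagged = looks_flattened(history, flat_forecast)
trend_flagged = looks_flattened(history, trending_forecast)
```

A flagged forecast is a hint to compare against a specialist forecaster or to fine-tune a forecasting head, not proof that the output is wrong.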

Not Ideal For

• Choosing a default forecast-only model when you mostly care about zero-shot continuation quality
• Short benchmark-style trend extrapolation where specialist forecasting families are stronger

Capabilities

forecasting | classification | anomaly-detection | imputation | retrieval

Specifications

Parameters: ~125M
Architecture: patch-based encoder-only transformer trained with masked time-series modeling
Context length: 512
Max context: 512
Minimum history: n/a
Recommended history: 512
Input step: n/a
Required target series: 1
Temperature: Ignored
Top P: Ignored
Max output: 1,024
Avg latency: n/a
Uptime: n/a
Plan limits: 1,000 rpm free · 1,000,000 rpm with billing
Accelerator: T4
Regions: Virginia, US
License: MIT
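
The limits listed above can be turned into a small client-side pre-flight check. The sketch below trims history to the 512-point context, rejects horizons beyond the 1,024-point maximum output, and estimates spend from the $0.00025 per-forecast price at the top of this page. The function names are hypothetical and nothing here calls a TSFM.ai endpoint. The cost estimate also assumes one billed forecast per request, which this page does not state.

```python
from decimal import Decimal

MAX_CONTEXT = 512                         # "Max context" from the specifications
MAX_OUTPUT = 1024                         # "Max output" from the specifications
PRICE_PER_FORECAST = Decimal("0.00025")   # USD, from the page header


def prepare_history(history, max_context=MAX_CONTEXT):
    """Keep only the most recent max_context points of a single target series."""
    if not history:
        raise ValueError("history must contain at least one point")
    return list(history[-max_context:])


def check_horizon(horizon, max_output=MAX_OUTPUT):
    """Validate that the requested forecast length fits the output limit."""
    if not 1 <= horizon <= max_output:
        raise ValueError(f"horizon must be between 1 and {max_output}")
    return horizon


def estimate_cost(n_forecasts):
    """Estimated spend in USD, assuming one billed forecast per request."""
    return PRICE_PER_FORECAST * n_forecasts


trimmed = prepare_history(list(range(600)))
horizon = check_horizon(96)
cost_10k = estimate_cost(10_000)
```

Decimal is used for the price so that repeated multiplication does not pick up binary floating-point rounding.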