
Time-MoE-50M

Maple728/TimeMoE-50M

0.1B total / 50M active params | 4K context | $0.00025 per forecast | Apache-2.0

Time-MoE-50M is the smaller public checkpoint in Xiaohongshu's Time-MoE family. The name refers to its roughly 50M activated parameters; the stored checkpoint totals about 0.1B because the inactive experts ship with the release. It is the lightest public way to evaluate the sparse Time-MoE design.
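
A minimal zero-shot forecasting sketch, following the usage pattern shown in the Time-MoE repository (the series lengths and prediction horizon here are illustrative; the per-series normalization is the preprocessing the authors recommend, not part of the checkpoint itself):

```python
import torch
from transformers import AutoModelForCausalLM

# trust_remote_code pulls in the Time-MoE modeling code shipped with the weights
model = AutoModelForCausalLM.from_pretrained(
    "Maple728/TimeMoE-50M",
    device_map="cpu",  # or "cuda" if a GPU is available
    trust_remote_code=True,
)

# Two toy series of length 128; anything up to the 4,096-point context works.
seqs = torch.randn(2, 128)

# Normalize each series before forecasting, invert afterwards.
mean = seqs.mean(dim=-1, keepdim=True)
std = seqs.std(dim=-1, keepdim=True)
normed = (seqs - mean) / std

prediction_length = 32
output = model.generate(normed, max_new_tokens=prediction_length)
forecast = output[:, -prediction_length:] * std + mean  # de-normalized forecast
```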

Model Classification

Family

Time-MoE

Type

time series foundation model

Pretrained time-series model exposed on TSFM.ai for zero-shot or few-shot forecasting workloads.

Training Data

Time-300B, a corpus spanning more than nine domains and over 300 billion time points, per the official paper and repository.

Recommended For

  • Long-context forecasting with sparse-expert scaling
  • Teams exploring MoE behavior in time-series foundation models

Strengths

  • Sparse experts offer large-capacity behavior without a fully dense footprint
  • Well aligned to long-context autoregressive forecasting

Limitations

  • MoE operational behavior can be less familiar than dense baselines
  • Not the best first pick if you just need a simple compact deployment

Capabilities

forecasting · zero-shot · high-throughput

Tags

time-moe · moe · autoregressive · cost-efficient

Specifications

Parameters
0.1B total / 50M active
Architecture
decoder-only transformer with sparse Mixture-of-Experts routing (see the sketch after this table)
Context length (max)
4,096
Minimum history
n/a
Recommended history
n/a
Input step
n/a
Required target series
1
Temperature
Ignored
Top P
Ignored
Max output
1,024
Accelerator
T4
Regions
Virginia, US
License
Apache-2.0
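
The active-versus-total parameter split comes from the router activating only a few experts per token. Below is a toy sketch of top-k expert routing; the layer sizes, expert count, and k are hypothetical, and Time-MoE's actual gating is defined in the released modeling code:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ToyMoE(nn.Module):
    """Top-k sparse MoE layer: only k of n_experts run per token, so the
    parameters touched per token (active) are a fraction of those stored."""

    def __init__(self, d_model: int = 64, n_experts: int = 8, k: int = 2):
        super().__init__()
        self.k = k
        self.gate = nn.Linear(d_model, n_experts)
        self.experts = nn.ModuleList(
            nn.Linear(d_model, d_model) for _ in range(n_experts)
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:  # x: (tokens, d_model)
        weights, idx = self.gate(x).topk(self.k, dim=-1)  # route to top-k experts
        weights = F.softmax(weights, dim=-1)              # renormalize the k gates
        out = torch.zeros_like(x)
        for slot in range(self.k):
            for e, expert in enumerate(self.experts):
                mask = idx[:, slot] == e                  # tokens whose slot-th pick is e
                if mask.any():
                    out[mask] += weights[mask, slot, None] * expert(x[mask])
        return out

layer = ToyMoE()
y = layer(torch.randn(10, 64))  # 10 tokens, each processed by 2 of 8 experts
```

Only k of n_experts weight matrices touch any given token, which is why the stored checkpoint (~0.1B parameters) activates only about 50M per token.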

Pricing

Per forecast
$0.00025
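
At $0.00025 per forecast, cost scales linearly with call volume. A quick back-of-envelope check, with hypothetical monthly volumes:

```python
PRICE_PER_FORECAST = 0.00025  # USD, from the pricing above

for n in (1_000, 100_000, 10_000_000):  # hypothetical monthly call volumes
    print(f"{n:>10,} forecasts -> ${n * PRICE_PER_FORECAST:,.2f}")
# 1,000 -> $0.25; 100,000 -> $25.00; 10,000,000 -> $2,500.00
```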

Performance

Average latency
n/a
Availability
n/a
Plan limits
1,000 rpm free · 1,000,000 rpm with billing