TSFM.ai developer documentation.
Multiple pages, one contract. API, MCP, and CLI are aligned on the same schema so teams can move from manual calls to production automation with zero drift.
Benchmarks: data sources and ranking method
The public /benchmarks page is intentionally data-driven. We do not manually curate scores. Instead we fetch official upstream leaderboard data and map model names to our hosted catalog.
Why these two benchmarks
We prioritize FEV Bench and GIFT-Eval because both are open, active, and include strong foundation-model coverage. Together they provide a practical external view of forecast quality and robustness across varied domains.
Sources
| Benchmark | Scope | Ranking metric | Links |
|---|---|---|---|
| FEV Bench | Broad realistic forecasting benchmark with open leaderboard tables. | Primary ranking on Skill Score (higher is better). | Hugging Face Space: autogluon/fev-bench Data file: |
| GIFT-Eval | General time series benchmark covering diverse datasets and frequencies. | Primary ranking on aggregated average rank (lower is better). | Hugging Face Space: Salesforce/GIFT-Eval Data file: |
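The two boards rank in opposite directions: FEV Bench's skill score is higher-is-better, while GIFT-Eval's aggregated average rank is lower-is-better. A minimal sketch of direction-aware sorting; the row shapes and column names here are illustrative, not the upstream CSV schema:

```python
# Illustrative rows only; real upstream CSVs have different columns.
FEV_ROWS = [
    {"model": "model-a", "skill_score": 0.62},
    {"model": "model-b", "skill_score": 0.71},
]
GIFT_ROWS = [
    {"model": "model-a", "avg_rank": 4.2},
    {"model": "model-b", "avg_rank": 3.1},
]

def rank_rows(rows, metric, higher_is_better):
    """Return rows sorted best-first according to the metric's direction."""
    return sorted(rows, key=lambda r: r[metric], reverse=higher_is_better)

fev_board = rank_rows(FEV_ROWS, "skill_score", higher_is_better=True)
gift_board = rank_rows(GIFT_ROWS, "avg_rank", higher_is_better=False)
```

Keeping the direction explicit per benchmark avoids accidentally inverting a board when a new source is added.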
How ranking is computed on TSFM.ai
1. Fetch the upstream benchmark CSV from each official source.
2. Parse the score columns and build the top rows for each benchmark board.
3. Resolve leaderboard model names to hosted catalog IDs using normalized name matching and aliases.
4. Render rows with a model deep-link only when a hosted model match exists.
Operational notes
- The /benchmarks page pulls upstream CSVs server-side and revalidates on a fixed cache window.
- Rows are normalized against TSFM.ai hosted model IDs so each supported model can deep-link into /models/{id}.
- Not every leaderboard model is hosted; those rows intentionally show `Not hosted`.
- Benchmark providers can change schemas over time. Keep parsing logic versioned and monitored.
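Because upstream schemas can change without notice, one cheap monitoring hook is to validate the expected columns before parsing and fail loudly on drift. A sketch under that assumption; the column names are placeholders, not the real upstream headers:

```python
import csv
import io

# Placeholder column set; the real upstream headers differ per benchmark.
EXPECTED_COLUMNS = {"model", "skill_score"}

class SchemaDriftError(ValueError):
    """Raised when an upstream CSV no longer matches the columns we parse."""

def parse_board(csv_text: str) -> list[dict]:
    reader = csv.DictReader(io.StringIO(csv_text))
    missing = EXPECTED_COLUMNS - set(reader.fieldnames or [])
    if missing:
        # Surface drift immediately instead of rendering a silently broken board.
        raise SchemaDriftError(f"upstream schema drift, missing: {sorted(missing)}")
    return list(reader)
```

Raising a dedicated error type makes schema drift easy to alert on separately from ordinary fetch failures.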