Introducing TSFM.ai: The Unified API for Time Series
We're launching TSFM.ai — a single API that gives you access to every major time series foundation model with automatic routing and optimization.
The time series foundation model landscape has exploded over the past eighteen months. Amazon released Chronos, Google shipped TimesFM, Salesforce published Moirai, and Carnegie Mellon introduced MOMENT. Each model brings genuine breakthroughs in zero-shot forecasting, anomaly detection, and temporal representation learning.
But if you have actually tried to use these models in production, you know the reality is far messier than the papers suggest.
The Fragmentation Problem
Every major TSFM ships with its own input format, preprocessing expectations, and inference stack. Chronos expects tokenized time series fed through a T5-derived architecture and outputs quantile predictions via binned distributions. TimesFM takes raw float arrays with explicit frequency tokens and returns point forecasts with optional prediction intervals. Moirai requires variable-length multi-series batches with attention masks. MOMENT uses a patched embedding scheme with its own normalization conventions.
If you want to evaluate which model works best for your data, you need to build and maintain separate inference pipelines for each one. You need separate GPU deployments, separate preprocessing code, and separate postprocessing logic to normalize the outputs into a common format. For most teams, this means picking one model and hoping it works, rather than systematically choosing the best tool for each forecasting task.
We built TSFM.ai to solve this.
One API, Every Model
TSFM.ai is a unified inference API that gives you access to every major time series foundation model through a single endpoint. You send your time series data in one consistent format and get back forecasts, prediction intervals, and anomaly scores in one consistent response schema, regardless of which model is running underneath.
A forecast request looks like this: you POST a JSON body with your historical values, specify a prediction horizon, and optionally indicate which model you want. The response returns point forecasts, lower and upper prediction intervals at your chosen confidence level, and metadata about the model that served the request.
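To make that concrete, here is a sketch of what such a request body might look like. The field names (`series`, `frequency`, `horizon`, `confidence_level`, `model`) are our illustration of the schema described above, not the authoritative API reference:

```python
import json

# Illustrative forecast request body. Field names are assumptions
# for illustration; consult the API reference for the real schema.
request_body = {
    "series": [112.0, 118.0, 132.0, 129.0, 121.0, 135.0, 148.0, 148.0],
    "frequency": "M",            # monthly observations
    "horizon": 4,                # forecast the next 4 points
    "confidence_level": 0.9,     # width of returned prediction intervals
    "model": "chronos-small",    # optional; omit to use automatic routing
}

payload = json.dumps(request_body)
print(payload)
```

The response mirrors this shape: point forecasts, interval bounds at the requested confidence level, and metadata identifying the serving model.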
If you do not specify a model, our automatic routing selects one for you. The router considers your series length, frequency, the presence of trend and seasonality, and the prediction horizon to match your data to the model with the strongest expected performance for those characteristics. This is informed by our internal benchmark suite run across thousands of series from Monash, ETT, Weather, and proprietary evaluation datasets.
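The routing logic itself is internal and benchmark-driven, but as a rough mental model, a rule-based version of such a router might look like the following sketch. The thresholds and model choices here are invented for illustration:

```python
def route_model(series_length: int, horizon: int, has_seasonality: bool) -> str:
    """Toy rule-based router. The real router is benchmark-driven;
    these thresholds and model assignments are illustrative only."""
    if series_length < 64:
        # Very short histories: favor a model with strong zero-shot priors.
        return "chronos-small"
    if has_seasonality and horizon > series_length // 4:
        # Long horizon relative to history, with seasonal structure.
        return "moirai-base"
    if series_length > 2048:
        # Long contexts: favor a model with a large supported context length.
        return "timesfm-2.0"
    return "chronos-bolt"

print(route_model(series_length=48, horizon=12, has_seasonality=False))
```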
What Ships Today
The initial release includes several capabilities we think matter for production forecasting.
Model Catalog. The full catalog includes Chronos (Small through Large), Chronos-Bolt, TimesFM 1.0 and 2.0, Moirai (Small, Base, Large), and MOMENT-1. Each model card shows supported context lengths, optimal use cases, and latency characteristics.
Prediction Intervals. Every model returns calibrated prediction intervals. For models that produce distributional outputs natively (like Chronos), we extract quantiles directly. For point-forecast models, we use conformal prediction to generate distribution-free intervals with finite-sample coverage guarantees.
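For readers unfamiliar with conformal prediction, the split-conformal recipe for a point forecaster is simple enough to sketch in a few lines. This is the textbook method, not TSFM.ai's implementation:

```python
import math

def split_conformal_interval(residuals, point_forecast, coverage=0.9):
    """Distribution-free prediction interval from calibration residuals.

    residuals: absolute errors |y - y_hat| on a held-out calibration set.
    Returns (lower, upper) such that, under exchangeability, the true
    value falls inside with probability >= coverage.
    """
    n = len(residuals)
    # Conformal quantile: the ceil((n + 1) * coverage)-th smallest residual.
    rank = math.ceil((n + 1) * coverage)
    q = sorted(residuals)[min(rank, n) - 1]
    return point_forecast - q, point_forecast + q

# Residuals from a held-out window, then an interval around a new forecast.
calib = [1.2, 0.4, 2.1, 0.9, 1.7, 0.6, 1.1, 1.5, 0.8, 2.4]
lo, hi = split_conformal_interval(calib, point_forecast=100.0, coverage=0.9)
print(lo, hi)  # 97.6 102.4
```

Note the guarantee is marginal and assumes exchangeable residuals; serial correlation in time series weakens it, which is why calibration set construction matters.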
Anomaly Detection. A dedicated /anomaly endpoint scores each point in your series against the model's learned distribution. This works particularly well with Chronos and Moirai, where the model's predictive uncertainty provides a natural anomaly signal.
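As a simplified picture of how predictive uncertainty yields an anomaly signal, consider scoring each point by its distance from the model's one-step-ahead prediction in units of predictive standard deviation. The real endpoint scores against the full learned distribution; this sketch uses a Gaussian stand-in:

```python
def anomaly_scores(values, forecasts, forecast_stds):
    """Score each observed point by |y - mu| / sigma against the model's
    one-step-ahead predictive distribution. A Gaussian simplification of
    scoring against the full learned distribution."""
    return [abs(y - mu) / sigma for y, mu, sigma in zip(values, forecasts, forecast_stds)]

observed  = [10.2, 9.8, 10.1, 17.5]    # last point is a spike
predicted = [10.0, 10.0, 10.0, 10.0]
pred_stds = [0.5, 0.5, 0.5, 0.5]

scores = anomaly_scores(observed, predicted, pred_stds)
flags = [s > 3.0 for s in scores]       # simple 3-sigma threshold
print(flags)  # [False, False, False, True]
```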
Interactive Playground. A browser-based interface where you can paste or upload time series data, select models, adjust parameters, and visualize forecasts in real time. Useful for quick experimentation before writing integration code.
Architecture
Under the hood, TSFM.ai runs a GPU-optimized inference backend inspired by the continuous batching techniques used in LLM serving systems like vLLM. Time series inference has different compute characteristics than text generation, particularly around batch size sensitivity and the absence of autoregressive decoding in many architectures, so we built a scheduler tuned for these workloads.
The serving layer runs on Google Cloud Run with GPU support, giving us horizontal scaling without managing Kubernetes clusters. Cold start times are managed through minimum instance counts for popular models, and we use model sharding across instances to keep the full catalog warm without over-provisioning.
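To give a feel for the batching idea without claiming to reproduce our scheduler, here is a toy version that groups pending requests by target model so each GPU pass serves one model, capped at a maximum batch size. The production scheduler additionally accounts for sequence lengths, arrival times, and instance placement:

```python
from collections import defaultdict

def build_batches(pending_requests, max_batch_size=8):
    """Toy batch scheduler: group pending requests by target model and
    split each group into batches of at most max_batch_size. A
    simplification of continuous batching for illustration only."""
    by_model = defaultdict(list)
    for req in pending_requests:
        by_model[req["model"]].append(req)
    batches = []
    for model, reqs in by_model.items():
        for i in range(0, len(reqs), max_batch_size):
            batches.append((model, reqs[i:i + max_batch_size]))
    return batches

pending = [
    {"id": i, "model": "chronos-small" if i % 3 else "moirai-base"}
    for i in range(10)
]
batches = build_batches(pending, max_batch_size=4)
print([(model, len(reqs)) for model, reqs in batches])
```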
Developer Experience
We are shipping with a REST API and an official Python SDK. The SDK handles authentication, request serialization, retries with exponential backoff, and response parsing. It also includes a tsfm.evaluate module for running standardized accuracy benchmarks against your own holdout data.
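The retry behavior mentioned above follows the standard exponential-backoff-with-jitter pattern. The sketch below shows that pattern generically; it is not the SDK source:

```python
import random
import time

def with_retries(fn, max_attempts=5, base_delay=0.5):
    """Retry fn() with exponential backoff plus full jitter, the generic
    pattern an API client applies around transient failures. Illustrative,
    not the tsfm-ai SDK implementation."""
    for attempt in range(max_attempts):
        try:
            return fn()
        except Exception:
            if attempt == max_attempts - 1:
                raise
            # Backoff doubles each attempt; jitter spreads out retries.
            delay = base_delay * (2 ** attempt)
            time.sleep(random.uniform(0, delay))

# Demo: a flaky call that fails twice before succeeding.
attempts = {"n": 0}
def flaky():
    attempts["n"] += 1
    if attempts["n"] < 3:
        raise ConnectionError("transient failure")
    return "ok"

result = with_retries(flaky, base_delay=0.01)
print(result)  # ok
```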
pip install tsfm-ai
Documentation covers quickstart guides, API reference, model selection guidance, and worked examples for common forecasting scenarios.
Early Access
TSFM.ai is launching in early access today. We are onboarding teams in batches to ensure we can maintain inference quality and response times as we scale. If you work with time series data in production and want access, join the waitlist on our homepage.
We will be publishing benchmarks, integration guides, and deep dives on individual models over the coming weeks. If there is a specific model, use case, or comparison you want us to cover, reach out. We are building this for practitioners, and your input shapes the roadmap.