Anomaly Detection with Time Series Foundation Models
Foundation models aren't just for forecasting — they're surprisingly effective at detecting anomalies in time series data.
Time series foundation models were designed for forecasting. But one of the most immediately useful applications turns out to be something different: anomaly detection. The core idea is deceptively simple. If a model can predict what a time series should look like, then any significant deviation from that prediction is, by definition, anomalous.
This forecast-residual approach to anomaly detection is not new — statistical process control has used forecast errors for decades. What TSFMs add is the ability to do it at scale, across domains, without training a single model.
The Forecast-Residual Framework
The detection pipeline has four steps:
- Generate a rolling forecast. Given a context window of recent historical observations, use a TSFM to produce a forecast for the next N steps. Slide the window forward and repeat.
- Compute residuals. At each time step where both an actual observation and a forecast exist, compute the residual: the difference between the predicted and the observed value.
- Estimate prediction intervals. For probabilistic models like Chronos, prediction intervals come directly from the sampled forecast trajectories. For point-forecast models like TimesFM, intervals can be estimated with conformal prediction or from the empirical distribution of recent residuals.
- Flag anomalies. If an observed value falls outside the prediction interval (e.g., beyond the 95th or 99th percentile), flag it as anomalous. Severity can be quantified by how far outside the interval the observation falls.
The result is a continuous anomaly score for every time step, derived entirely from the model's expectation of normal behavior.
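The four steps above can be sketched in a few lines of Python. This is a minimal illustration, not a TSFM integration: `forecast_fn` stands in for a call to a foundation model (here a naive seasonal forecaster on synthetic data), and the interval width is estimated from the empirical distribution of recent residuals, as step 3 describes for point-forecast models.

```python
import numpy as np

def rolling_anomaly_flags(series, forecast_fn, context_len, q=0.99):
    """Steps 1-4: rolling one-step forecast, residual, empirical
    interval from recent residuals, flag when the residual exceeds it."""
    residuals, flags = [], []
    for t in range(context_len, len(series)):
        pred = forecast_fn(series[t - context_len:t])    # step 1: forecast
        resid = series[t] - pred                         # step 2: residual
        if len(residuals) >= 20:                         # step 3: interval
            width = np.quantile(np.abs(residuals[-200:]), q)
        else:
            width = np.inf                               # warm-up: never flag
        flags.append(abs(resid) > width)                 # step 4: flag
        residuals.append(resid)
    return np.array(flags)

# Toy stand-in for a TSFM: repeat the value from one season (24 steps) ago.
naive_seasonal = lambda ctx: ctx[-24]

rng = np.random.default_rng(0)
t = np.arange(600)
y = 10 * np.sin(2 * np.pi * t / 24) + rng.normal(0, 0.5, size=t.size)
y[400] += 8                                              # injected anomaly
flags = rolling_anomaly_flags(y, naive_seasonal, context_len=48)
print("flagged steps:", np.nonzero(flags)[0] + 48)
```

In production, `forecast_fn` would be a batched call to a hosted model, and with a probabilistic model the interval width would come from the sampled trajectories rather than the residual history.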
Why TSFMs Outperform Traditional Approaches
Classical anomaly detection methods for time series — Isolation Forest, DBSCAN, one-class SVMs, autoencoders — have a fundamental limitation: they treat observations as points in feature space, often without deeply modeling the temporal structure of the data.
Isolation Forest, for example, detects outliers by measuring how easily a data point can be isolated through random partitioning. It is effective for static tabular data, but it does not understand that a value of 1000 might be perfectly normal on a Monday morning and deeply anomalous on a Sunday night. Temporal context matters, and most traditional methods ignore it or handle it crudely through hand-crafted lag features.
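As a toy illustration of this point (synthetic data; a simple value-percentile rule stands in for any detector that ignores temporal position):

```python
import numpy as np

# 14 days of hourly data: 1000 at 9am on weekdays, 50 otherwise.
hours = np.arange(14 * 24)
y = np.where((hours % 24 == 9) & ((hours // 24) % 7 < 5), 1000.0, 50.0)
y[13 * 24 + 2] = 1000.0   # 1000 at 2am on a weekend: contextually anomalous

# Value-only rule: flag observations outside the empirical 1st-99th pct band.
lo, hi = np.quantile(y, [0.01, 0.99])
value_flags = (y < lo) | (y > hi)
print(value_flags[13 * 24 + 2])   # → False: 1000 looks like an ordinary value

# Temporal rule: compare each point to the same hour one week earlier.
resid = y[7 * 24:] - y[:-7 * 24]
resid_flags = np.abs(resid) > 100
print(resid_flags[13 * 24 + 2 - 7 * 24])   # → True: caught via context
```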
TSFMs capture temporal context natively. They understand seasonality, trend, autocorrelation, and complex multi-scale patterns because these are exactly the structures they learned during pretraining. A Chronos model that has seen millions of seasonal time series will produce a forecast that implicitly accounts for day-of-week effects, holidays, and gradual level shifts — meaning its residuals will only spike when something genuinely unexpected occurs.
There are additional practical advantages:
- No training required. You do not need labeled anomaly data (which is notoriously scarce) or even enough normal data to train an autoencoder. Zero-shot inference means the TSFM can start detecting anomalies immediately on a new time series.
- Cross-domain generalization. The same model and pipeline work for server CPU metrics, retail transaction volumes, manufacturing sensor readings, and financial indicators. No per-domain model tuning.
- Adaptive baselines. Because the forecast updates with each new context window, the anomaly threshold automatically adapts to non-stationary behavior. If a metric gradually trends upward, the model's forecast tracks that trend, so the threshold moves with it.
A Practical Detection Pipeline
Here is what a production anomaly detection pipeline looks like using TSFM.ai's API:
Ingest: Stream time series data into a buffer. Maintain a rolling context window of the most recent observations (e.g., the last 512 data points for Chronos, or the last few thousand for TimesFM with patching).
Forecast: On each new observation (or on a schedule), call the TSFM inference endpoint with the current context window. Request a short forecast horizon — typically 1 to 10 steps ahead is sufficient for anomaly detection.
Score: Compare the incoming actual values against the forecasted values. For probabilistic models, check whether the actual falls within the p-th percentile prediction interval. For point-forecast models, compute the z-score of the residual relative to the recent residual distribution.
Alert: Apply a threshold policy. Simple approaches flag any observation outside the 99% interval. More sophisticated policies use consecutive-violation rules (e.g., three consecutive points outside the 95% interval) to reduce false positives from noise.
Feedback loop (optional): If human operators label flagged anomalies as true or false positives, use that feedback to adjust threshold sensitivity over time.
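The score and alert steps can be sketched as a small stateful component. This is illustrative only: `predicted` would come from the TSFM inference endpoint, and the parameter names and defaults are assumptions, not platform API.

```python
from collections import deque
import numpy as np

class ResidualScorer:
    """Score: z-score each residual against a rolling residual window.
    Alert: fire only after k consecutive violations of the z threshold."""

    def __init__(self, window=200, z_thresh=3.0, k=3):
        self.residuals = deque(maxlen=window)
        self.z_thresh, self.k = z_thresh, k
        self.violations = 0

    def update(self, actual, predicted):
        resid = actual - predicted
        if len(self.residuals) >= 30:
            mu = np.mean(self.residuals)
            sigma = np.std(self.residuals) or 1e-9   # avoid divide-by-zero
            z = (resid - mu) / sigma
        else:
            z = 0.0   # warm-up: not enough residual history to score
        self.residuals.append(resid)
        self.violations = self.violations + 1 if abs(z) > self.z_thresh else 0
        return z, self.violations >= self.k           # (score, alert?)

scorer = ResidualScorer(k=3)
rng = np.random.default_rng(1)
alerts = []
for t in range(300):
    actual = rng.normal(0, 1) + (10 if t >= 250 else 0)   # level shift at t=250
    z, alert = scorer.update(actual, predicted=0.0)
    if alert:
        alerts.append(t)
print("first alert at t =", alerts[0] if alerts else None)
```

The consecutive-violation rule trades a few steps of detection delay for far fewer single-point false positives.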
TSFM.ai exposes a dedicated /anomaly endpoint that wraps this pipeline into a single API call: you send a time series, and the response includes anomaly scores and flagged intervals, with configurable sensitivity parameters. For more on operationalizing this, see Building Production Forecast Pipelines.
Use Cases
Infrastructure monitoring. Detect abnormal CPU utilization, memory pressure, request latency, or error rates in real time. TSFMs handle the complex seasonality of production traffic (diurnal patterns, weekly cycles, deploy-related shifts) without manual threshold tuning.
Fraud detection. Transaction volume anomalies, unusual spending patterns, or abnormal account activity. The forecast-residual approach catches deviations from each account's or merchant's individual behavioral baseline.
Manufacturing quality control. Sensor readings from production lines — temperature, pressure, vibration — often exhibit subtle drift before equipment failure. TSFMs detect deviations from expected sensor trajectories, providing early warning for predictive maintenance.
Energy grid monitoring. Sudden deviations in power generation, consumption, or grid frequency can indicate equipment faults or demand-supply imbalances. TSFMs trained on energy data capture the complex seasonal and weather-driven patterns inherent in grid telemetry.
When Specialized Methods Still Win
TSFM-based anomaly detection is not universally superior. There are cases where purpose-built methods remain the better choice:
- Multivariate correlation anomalies. If the anomaly manifests as an unusual relationship between variables (e.g., temperature and pressure diverging) rather than unusual values in any single variable, univariate TSFMs will miss it. Multivariate anomaly detection methods like MSCRED or OmniAnomaly are designed for this.
- Extremely high-frequency data. For tick-level financial data or microsecond-resolution sensor data, the inference latency of a transformer-based TSFM may be too high for real-time detection. Lightweight statistical methods (CUSUM, EWMA) remain practical at these speeds.
- Known anomaly patterns. If you have labeled examples of specific anomaly types (e.g., particular failure signatures in industrial equipment), supervised classification will outperform unsupervised forecast-residual detection.
The practical approach is to layer: use zero-shot TSFM-based detection as a broad, low-effort baseline, and supplement with specialized detectors where domain-specific requirements demand it. To get started, explore the available models in our model catalog, try anomaly detection in the playground, or learn more about the platform on Introducing TSFM.ai.