Climate and Weather Forecasting with Time Series Foundation Models
Time series foundation models are finding a practical niche in climate and weather applications — not replacing physics-based models, but filling gaps they leave behind.
Weather and climate data are, at their core, time series. Temperature readings from a station every hour, daily precipitation totals, sea surface temperature measurements from buoys and satellites, wind speed and direction at turbine sites, humidity profiles, air quality indices. These signals have been collected for decades, and in many cases centuries, creating some of the longest and most densely sampled time series datasets on earth. It is no surprise that time series foundation models are proving useful here.
But the relationship between TSFMs and atmospheric science is more nuanced than "AI replaces weather forecasts." Understanding where foundation models fit — and where they do not — requires appreciating the distinction between weather forecasting and climate projection, and the role that physics-based models have played for the past half century.
Weather Forecasting vs. Climate Projection
Weather forecasting operates on timescales of hours to roughly two weeks. It answers the question: will it rain in Chicago tomorrow afternoon? The gold standard here is numerical weather prediction (NWP), which solves the governing equations of fluid dynamics and thermodynamics on a discretized grid of the atmosphere. Models like ECMWF's IFS and NOAA's GFS ingest millions of observations in each data assimilation cycle and produce global forecasts at grid spacings down to roughly 9 km. These physics-based systems encode conservation of mass, momentum, and energy directly into their equations.
Climate projection operates on timescales of months to decades and beyond. It answers a different question: how will average summer temperatures in the Mediterranean shift over the next thirty years under a given emissions scenario? Climate models (general circulation models, or GCMs) share mathematical foundations with NWP but run at coarser resolutions over much longer periods, and their skill depends more on capturing slow-evolving processes like ocean circulation and ice sheet dynamics.
TSFMs occupy a different space entirely. They are statistical models trained on observed time series data, learning temporal patterns without explicit physical constraints. This makes them complementary to NWP and GCMs rather than replacements.
Where TSFMs Add Value
The practical applications of foundation models in climate and weather fall into several categories where physics-based models are either unavailable, too expensive, or insufficiently granular.
Local station-level forecasting. Global NWP models produce forecasts on grids whose cells span several kilometers or more, so each value is an area average. But a vineyard owner needs temperature forecasts for their specific hillside, and a city planner needs heat projections for a particular neighborhood. TSFMs are well suited to ingesting a local station's historical observations and producing zero-shot forecasts at that exact location, picking up microclimate effects that a grid-cell average smooths out.
Downscaling. NWP output at coarse resolution can be statistically downscaled to finer spatial or temporal granularity using TSFMs. Rather than running expensive high-resolution physics simulations, a foundation model can learn the relationship between coarse grid-cell averages and local observations, producing station-scale forecasts from grid-scale inputs.
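The coarse-to-local mapping described above can be sketched in a few lines. This is a minimal illustration, not a foundation model: an ordinary least-squares line stands in for the learned relationship, and the function names `fit_downscaler` and `downscale` are hypothetical.

```python
import numpy as np

def fit_downscaler(coarse, station):
    """Least-squares fit of station = slope * coarse + intercept.

    A linear map is an assumed stand-in for whatever relationship a
    foundation model would learn between grid-cell and station values.
    """
    coarse = np.asarray(coarse, dtype=float)
    station = np.asarray(station, dtype=float)
    design = np.column_stack([coarse, np.ones_like(coarse)])
    (slope, intercept), *_ = np.linalg.lstsq(design, station, rcond=None)
    return float(slope), float(intercept)

def downscale(coarse_forecast, slope, intercept):
    """Apply the fitted map to new grid-scale forecast values."""
    return slope * np.asarray(coarse_forecast, dtype=float) + intercept

# Synthetic example: the station runs cooler than its grid cell.
rng = np.random.default_rng(0)
grid_temp = rng.normal(15.0, 5.0, size=200)
station_temp = 0.8 * grid_temp - 1.5 + rng.normal(0.0, 0.1, size=200)

slope, intercept = fit_downscaler(grid_temp, station_temp)
local_forecast = downscale([10.0, 20.0], slope, intercept)
```

In practice the historical pairs would come from archived NWP output matched to station observations over the same period, and a richer model could also condition on time of day or season.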
Gap-filling missing observations. Weather station records are riddled with gaps caused by sensor failures, maintenance windows, and communication outages. TSFMs can reconstruct missing segments by treating the gap as a forecast horizon: forecast forward from the observations before the gap, forecast backward from the observations after it, and combine the two. Complete station records matter for climatological analysis and for comparison against reanalysis datasets like ERA5, which underpins much of modern climate research.
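A minimal sketch of gap-filling as a two-sided forecast follows. It assumes a single contiguous gap with at least one full seasonal cycle of data on each side, and it uses a seasonal-naive forecaster as a stand-in for a TSFM. The names `seasonal_naive` and `fill_gap` are hypothetical.

```python
import numpy as np

def seasonal_naive(history, horizon, period):
    """Stand-in forecaster: repeat the last full seasonal cycle."""
    last_cycle = np.asarray(history, dtype=float)[-period:]
    repeats = int(np.ceil(horizon / period))
    return np.tile(last_cycle, repeats)[:horizon]

def fill_gap(series, period, forecaster=seasonal_naive):
    """Fill one contiguous NaN gap by blending two forecasts.

    The forward forecast is conditioned on data before the gap, the
    backward forecast on time-reversed data after the gap. Weights
    shift linearly from the forward to the backward forecast.
    """
    series = np.asarray(series, dtype=float)
    missing = np.flatnonzero(np.isnan(series))
    if missing.size == 0:
        return series.copy()
    start, end = missing[0], missing[-1] + 1
    gap = end - start
    forward = forecaster(series[:start], gap, period)
    # Reverse time, forecast "ahead", then restore chronological order.
    backward = forecaster(series[end:][::-1], gap, period)[::-1]
    w = np.linspace(1.0, 0.0, gap + 2)[1:-1]  # weight on the forward forecast
    filled = series.copy()
    filled[start:end] = w * forward + (1.0 - w) * backward
    return filled

# Synthetic hourly signal with a daily cycle and a 10-hour outage.
t = np.arange(120)
truth = np.sin(2 * np.pi * t / 24)
observed = truth.copy()
observed[50:60] = np.nan
filled = fill_gap(observed, period=24)
```

Any forecaster with the same `(history, horizon, period)` call shape could be passed in place of `seasonal_naive`; the blending step does not depend on how the two forecasts are produced.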
Extreme weather event detection. Identifying anomalous departures from expected patterns, such as an unusual temperature spike or an unprecedented precipitation event, maps directly onto the forecast-residual anomaly detection framework: forecast the expected value, compare it with what was observed, and flag residuals that are unusually large. TSFMs provide adaptive baselines that account for seasonality and trend, which makes them a useful component of flood and drought early warning systems.
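The forecast-residual idea can be sketched as follows. This is an illustration under stated assumptions: per-phase seasonal means stand in for a TSFM forecast, the residual scale is estimated with the median absolute deviation, and the function names and the threshold of 5 are hypothetical choices rather than recommended settings.

```python
import numpy as np

def phase_means(history, period):
    """Mean of the history at each position within the seasonal cycle."""
    history = np.asarray(history, dtype=float)
    phase = np.arange(history.size) % period
    return np.array([history[phase == p].mean() for p in range(period)])

def detect_anomalies(history, recent, period, threshold=5.0):
    """Return indices in `recent` whose residual is unusually large.

    The expected value is a seasonal baseline fitted on `history`
    (a stand-in for a model forecast). Residuals are scaled by a
    robust spread estimate taken from the history itself.
    """
    history = np.asarray(history, dtype=float)
    recent = np.asarray(recent, dtype=float)
    means = phase_means(history, period)
    train_resid = history - means[np.arange(history.size) % period]
    center = np.median(train_resid)
    mad = np.median(np.abs(train_resid - center))
    scale = max(1.4826 * mad, np.finfo(float).eps)  # MAD scaled to match a std
    recent_phase = (history.size + np.arange(recent.size)) % period
    expected = means[recent_phase]
    score = np.abs(recent - expected - center) / scale
    return np.flatnonzero(score > threshold)

# Synthetic hourly temperatures: 28 days of history, 2 days to score.
rng = np.random.default_rng(0)
t = np.arange(24 * 30)
temp = 15 + 8 * np.sin(2 * np.pi * t / 24) + rng.normal(0, 0.5, t.size)
history, recent = temp[: 24 * 28], temp[24 * 28 :].copy()
recent[10] += 12.0  # inject a spike
flagged = detect_anomalies(history, recent, period=24)
```

Using the median and MAD rather than the mean and standard deviation keeps past extremes in the history from inflating the threshold.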
Multivariate Dependencies Matter
Weather variables are inherently multivariate. Temperature, humidity, wind speed, and pressure are governed by the same physical processes and are tightly correlated. A sudden drop in pressure often precedes increased wind speeds and precipitation. Humidity limits how far evaporative cooling can lower the air temperature. Forecasting any single weather variable in isolation discards the information carried by the others.
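A quick way to check whether one channel leads another is a lagged correlation. The sketch below is a diagnostic only, not a forecasting model; the function name `lagged_correlation` and the synthetic pressure and wind series are assumptions made for illustration.

```python
import numpy as np

def lagged_correlation(x, y, max_lag):
    """Pearson correlation between x[t] and y[t + lag], lag = 0..max_lag."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    result = {}
    for lag in range(max_lag + 1):
        a = x[: x.size - lag] if lag else x
        b = y[lag:]
        result[lag] = float(np.corrcoef(a, b)[0, 1])
    return result

# Synthetic example: wind responds to pressure changes 3 steps later,
# with opposite sign (falling pressure, rising wind).
rng = np.random.default_rng(1)
pressure = rng.normal(size=500)
wind = -np.roll(pressure, 3) + 0.1 * rng.normal(size=500)

corrs = lagged_correlation(pressure, wind, max_lag=6)
strongest_lag = min(corrs, key=corrs.get)  # most negative correlation
```

A clear peak at a nonzero lag suggests that the leading channel is worth including as an input when forecasting the lagging one.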
Foundation models with multivariate capabilities, such as Moirai, can ingest multiple correlated weather channels jointly. Moirai is particularly relevant here because its pretraining corpus (the LOTSA dataset) includes substantial climate and environmental data, which should help zero-shot transfer to meteorological time series without domain-specific fine-tuning.
Real-World Applications
The downstream applications of better local weather and climate forecasting are broad. Agricultural planning depends on accurate temperature and precipitation forecasts for planting schedules, irrigation, and frost risk management. Renewable energy output forecasting — solar irradiance and wind speed predictions — is tightly coupled to weather; even small accuracy improvements translate to significant value in grid balancing and energy trading, as explored in our energy demand case study. Urban heat island monitoring uses dense station networks within cities to track localized temperature extremes, where TSFMs can provide neighborhood-level prediction intervals that inform public health responses during heat waves.
Limitations: What TSFMs Do Not Encode
It is important to be direct about what foundation models lack in this domain. Physics-based NWP models encode conservation laws: mass, energy, and momentum are conserved across every time step. TSFMs have no such constraints. A foundation model can, in principle, produce a forecast where temperature and humidity evolve in ways that violate thermodynamic relationships. For short-horizon, station-level forecasting this rarely matters in practice — the statistical patterns are strong enough to keep forecasts physically plausible. But for longer-range climate projections or scenarios outside the training distribution (such as unprecedented warming levels), the absence of physical constraints becomes a real limitation.
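One inexpensive safeguard is to check forecasts against a known physical relationship after the fact. The sketch below assumes air temperature and dew point were forecast as separate channels and flags time steps where the implied relative humidity exceeds 100%. It uses the Magnus approximation for saturation vapor pressure over water; the function names and the tolerance are hypothetical.

```python
import numpy as np

def saturation_vapor_pressure(temp_c):
    """Magnus approximation over water, in hPa, for temperature in deg C."""
    temp_c = np.asarray(temp_c, dtype=float)
    return 6.112 * np.exp(17.62 * temp_c / (243.12 + temp_c))

def implied_relative_humidity(temp_c, dew_point_c):
    """Relative humidity (%) implied by air temperature and dew point."""
    return 100.0 * saturation_vapor_pressure(dew_point_c) / saturation_vapor_pressure(temp_c)

def supersaturation_violations(temp_c, dew_point_c, tolerance=0.5):
    """Indices where the forecast implies humidity above 100% plus tolerance."""
    rh = implied_relative_humidity(temp_c, dew_point_c)
    return np.flatnonzero(rh > 100.0 + tolerance)

# Hypothetical forecast values: the third step has dew point above temperature.
temp_forecast = [20.0, 15.0, 10.0]
dew_point_forecast = [10.0, 15.0, 12.0]

rh = implied_relative_humidity(temp_forecast, dew_point_forecast)
bad_steps = supersaturation_violations(temp_forecast, dew_point_forecast)
```

A check like this does not make the model physically consistent; it only surfaces forecasts that should be clipped, corrected, or reviewed before use.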
Recent machine learning weather models take a different approach. Google DeepMind's GraphCast, for example, is a graph neural network trained on ERA5 reanalysis that forecasts global gridded atmospheric fields, the same kind of output NWP produces, and benchmarks such as WeatherBench 2 evaluate models of this type against operational NWP. These models are largely data-driven rather than physics-constrained, but they are purpose-built for gridded spatial forecasting and are distinct from general-purpose TSFMs. The two approaches serve different needs.
Getting Started
Climate and weather time series are among the most accessible domains for experimenting with foundation models. Long, well-curated public datasets exist, seasonal patterns are strong, and the practical value of improved local forecasts is immediate. Explore the available models in our model catalog or run your own weather time series through the playground to see how zero-shot forecasting performs on your station data.