Researchers from Microsoft's Autonomous Systems and Robotics Research group have open-sourced ClimaX, a deep learning foundation model for weather and climate modeling. ClimaX can be fine-tuned for a variety of prediction tasks and performs as well as or better than state-of-the-art models on several benchmarks.
ClimaX is based on the Vision Transformer (ViT) model, originally developed for processing image data, but with two major modifications. The first is variable tokenization, which allows the model to accept data from datasets with different numbers of input variables. The other is variable aggregation, which combines all input variables for a given spatial location. The model is pre-trained on five datasets from the CMIP6 collection. According to the research team:
We are excited to release ClimaX with the aim of furthering data-driven weather and climate modeling. Our goal is to allow anyone to easily use the latest Machine Learning methods to address multitude of problems, ranging from near-term prediction at a local scale to modeling long-term processes that involve weather and climate variables. ClimaX takes a big step forward towards the idea of a single starting point for a variety of such tasks. We can’t wait to see what the future holds for this emerging field.
Most weather forecasting is done with numerical methods that solve the differential equations modeling atmospheric physics. One drawback of this approach is the amount of computation needed to achieve results at high spatial resolution. A newer trend is to use data-driven deep learning models, which can sometimes produce good results with less computation. In 2021, InfoQ covered DeepMind's Deep Generative Models of Rainfall system for short-term precipitation forecasts. In 2022, Google Research published a paper in Nature Communications about MetNet-2, another short-term precipitation forecasting model.
Inspired by the success of pre-trained foundation language models, which can be fine-tuned for state-of-the-art performance on a variety of downstream NLP tasks, the Microsoft team decided to take the same approach for weather and climate prediction tasks. The basic ClimaX architecture is an image-to-image vision transformer: the input is a two-dimensional grid, but instead of each grid element containing RGB pixel values, it contains several heterogeneous weather variables such as temperature and air pressure. The task of the model is to output an image that represents the weather at a future time.
ClimaX Model Architecture. Image Source: https://arxiv.org/abs/2301.10343
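To make the two modifications more concrete, the following is a minimal pure-Python sketch of the idea, not the ClimaX implementation. It assumes that each variable's grid is split into patches and embedded separately (variable tokenization), and that the per-variable tokens at each patch position are then pooled into one token using a softmax-weighted sum against a single query vector (a stand-in for variable aggregation). The function names, grid size, patch size, embedding width, and random weights are all hypothetical and chosen only for illustration; a real model would learn these weights and would add positional information and transformer layers.

```python
import math
import random


def tokenize_variable(grid, patch, weights):
    """Split one variable's 2D grid into non-overlapping patches and project
    each flattened patch to an embedding using that variable's own weights."""
    rows, cols = len(grid), len(grid[0])
    tokens = []
    for r in range(0, rows, patch):
        for c in range(0, cols, patch):
            flat = [grid[r + i][c + j] for i in range(patch) for j in range(patch)]
            tokens.append([sum(w * x for w, x in zip(w_row, flat)) for w_row in weights])
    return tokens  # one embedding per patch position


def aggregate_variables(tokens_per_var, query):
    """For each patch position, combine the tokens of all variables into one
    token via a softmax-weighted sum (assumed stand-in for the aggregation)."""
    dim = len(query)
    pooled = []
    for p in range(len(tokens_per_var[0])):
        toks = [var_tokens[p] for var_tokens in tokens_per_var]
        scores = [sum(q * t for q, t in zip(query, tok)) / math.sqrt(dim) for tok in toks]
        top = max(scores)
        exps = [math.exp(s - top) for s in scores]  # numerically stable softmax
        total = sum(exps)
        pooled.append([sum((e / total) * tok[d] for e, tok in zip(exps, toks))
                       for d in range(dim)])
    return pooled


# Hypothetical sizes: a 4x8 grid, 2x2 patches, 3-dimensional embeddings.
H, W, PATCH, DIM = 4, 8, 2, 3
rng = random.Random(0)


def random_grid():
    return [[rng.gauss(0, 1) for _ in range(W)] for _ in range(H)]


# One projection per variable name; random here, learned in a real model.
proj = {name: [[rng.gauss(0, 0.1) for _ in range(PATCH * PATCH)] for _ in range(DIM)]
        for name in ("temperature", "pressure", "humidity")}
query = [rng.gauss(0, 1) for _ in range(DIM)]


def encode(variables):
    """Map a dict of {variable name: grid} to one token per patch position."""
    tokens_per_var = [tokenize_variable(grid, PATCH, proj[name])
                      for name, grid in variables.items()]
    return aggregate_variables(tokens_per_var, query)


# Two hypothetical datasets with different numbers of input variables.
seq_two = encode({"temperature": random_grid(), "pressure": random_grid()})
seq_three = encode({"temperature": random_grid(), "pressure": random_grid(),
                    "humidity": random_grid()})
print(len(seq_two), len(seq_three))  # both sequences have 8 tokens
```

In this sketch the length of the token sequence depends only on the grid and patch sizes, not on how many variables a dataset provides, which is what lets a single backbone accept inputs from datasets with different variable sets.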
The team evaluated ClimaX by fine-tuning it for tasks where the input variables were similar to those used in pre-training, as well as for tasks that use variables the model has never seen. The first type of task included global weather forecasting, regional weather forecasting, and sub-seasonal to seasonal prediction. The second type used the ClimateBench benchmark for climate prediction; in this case the input variables are concentrations of gases such as carbon dioxide, which were not used in pre-training. Compared to baselines, ClimaX performed better at predicting temperatures, but underperformed in predicting precipitation.
ClimaX team member Tung Nguyen discussed the work on Twitter. In response, atmospheric scientist David Gold pointed out:
In your paper, you didn’t report benchmarks for the regional task against state-of-science regional NWP, nor the S2S task against SubX, NMME, etc. Would be good to see such comps.
Nguyen replied that the team would try those comparisons soon.
The ClimaX source code is available on GitHub. Pre-trained model checkpoint files can be downloaded from the ClimaX website.