Paper title

WeatherBench: A benchmark dataset for data-driven weather forecasting

Authors

Stephan Rasp, Peter D. Dueben, Sebastian Scher, Jonathan A. Weyn, Soukayna Mouatadid, Nils Thuerey

Abstract

Data-driven approaches, most prominently deep learning, have become powerful tools for prediction in many domains. A natural question to ask is whether data-driven methods could also be used to predict global weather patterns days in advance. First studies show promise, but the lack of a common dataset and evaluation metrics makes inter-comparison between studies difficult. Here we present a benchmark dataset for data-driven medium-range weather forecasting, a topic of high scientific interest for atmospheric and computer scientists alike. We provide data derived from the ERA5 archive that has been processed to facilitate its use in machine learning models. We propose simple and clear evaluation metrics which will enable a direct comparison between different methods. Further, we provide baseline scores from simple linear regression techniques, deep learning models, as well as purely physical forecasting models. The dataset is publicly available at https://github.com/pangeo-data/WeatherBench and the companion code is reproducible with tutorials for getting started. We hope that this dataset will accelerate research in data-driven weather forecasting.
