Paper title
Global-scale massive feature extraction from monthly hydroclimatic time series: Statistical characterizations, spatial patterns and hydrological similarity
Paper authors
Paper abstract
Hydroclimatic time series analysis focuses on a few feature types (e.g., autocorrelations, trends, extremes), which describe a small portion of the entire information content of the observations. Aiming to exploit a larger part of the available information and, thus, to deliver more reliable results (e.g., in hydroclimatic time series clustering contexts), here we approach hydroclimatic time series analysis differently, i.e., by performing massive feature extraction. In this respect, we develop a big data framework for hydroclimatic variable behaviour characterization. This framework relies on approximately 60 diverse features and is completely automatic (in the sense that it does not depend on the hydroclimatic process at hand). We apply the new framework to characterize mean monthly temperature, total monthly precipitation and mean monthly river flow. The applications are conducted at the global scale by exploiting 40-year-long time series originating from over 13 000 stations. We extract interpretable knowledge on seasonality, trends, autocorrelation, long-range dependence and entropy, and on feature types that are met less frequently. We further compare the examined hydroclimatic variable types in terms of this knowledge and identify patterns related to the spatial variability of the features. For this latter purpose, we also propose and exploit a hydroclimatic time series clustering methodology. This new methodology is based on Breiman's random forests. The descriptive and exploratory insights gained by the global-scale applications prove the usefulness of the adopted feature compilation in hydroclimatic contexts. Moreover, the spatially coherent patterns characterizing the clusters delivered by the new methodology build confidence in its future exploitation...
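To make the idea of per-series feature extraction concrete, the following is a minimal sketch in plain Python that computes three of the feature types named in the abstract (autocorrelation, trend and seasonality) for a single monthly series. It is not the paper's framework: the function names, the specific formulas (lag-1 sample autocorrelation, an ordinary least-squares slope, and the share of variance explained by calendar-month means) and the synthetic input series are all assumptions chosen for illustration, and they cover only a small fraction of the roughly 60 features the abstract mentions.

```python
import math
from statistics import fmean


def lag1_autocorrelation(x):
    """Sample autocorrelation of the series at lag 1."""
    m = fmean(x)
    num = sum((a - m) * (b - m) for a, b in zip(x[:-1], x[1:]))
    den = sum((a - m) ** 2 for a in x)
    return num / den


def linear_trend_slope(x):
    """Ordinary least-squares slope of x against its time index (units per month)."""
    n = len(x)
    t_mean = (n - 1) / 2
    x_mean = fmean(x)
    num = sum((t - t_mean) * (v - x_mean) for t, v in enumerate(x))
    den = sum((t - t_mean) ** 2 for t in range(n))
    return num / den


def seasonal_strength(x, period=12):
    """Share of total variance explained by the mean of each calendar month."""
    month_means = [fmean(x[k::period]) for k in range(period)]
    m = fmean(x)
    total = sum((v - m) ** 2 for v in x)
    resid = sum((v - month_means[t % period]) ** 2 for t, v in enumerate(x))
    return 1.0 - resid / total


def extract_features(x):
    """Collect the illustrative features into one dictionary (hypothetical helper)."""
    return {
        "lag1_autocorrelation": lag1_autocorrelation(x),
        "linear_trend_slope": linear_trend_slope(x),
        "seasonal_strength": seasonal_strength(x),
    }


# Synthetic 40-year monthly series (assumed data, not from the paper):
# an annual cycle of amplitude 10 plus a small upward trend of 0.01 per month.
series = [10.0 * math.sin(2 * math.pi * t / 12) + 0.01 * t for t in range(480)]
features = extract_features(series)
print(features)
```

Applying such a function to every station would yield one feature vector per series, which is the kind of table a clustering method could then take as input. The random-forest-based clustering the abstract mentions is not sketched here, since the abstract does not describe its details.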