Paper title
Identification of Dominant Features in Spatial Data
Paper authors
Paper abstract
Dominant features of spatial data are connected structures or patterns that emerge from location-based variation and manifest at specific scales or resolutions. To identify dominant features, we propose a sequential application of multiresolution decomposition and variogram function estimation. Multiresolution decomposition separates data into additive components and thereby makes their dominant features recognizable. A dedicated multiresolution decomposition method is developed for arbitrary gridded spatial data, where the underlying model includes a precision and spatial-weight matrix to capture spatial correlation. The data are separated into their components by smoothing on different scales, such that larger scales have longer spatial correlation ranges. Moreover, our model can handle missing values, which is often useful in applications. Variogram functions describe spatial dependence in data, so we estimate one for each component to determine its effective range, which quantifies the width-extent of the corresponding dominant feature. Finally, Bayesian analysis enables inference on the identified dominant features and a judgment of whether they are credibly different. The efficient implementation of the method relies mainly on sparse-matrix data structures and algorithms. By applying the method to simulated data, we demonstrate its applicability and theoretical soundness. In disciplines that use spatial data, this method can lead to new insights, as we exemplify by identifying the dominant features in a forest dataset. In that application, the width-extents of the dominant features have an ecological interpretation, namely the species interaction range, and their estimates support the derivation of ecosystem properties such as biodiversity indices.
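To make the two-step idea concrete, the following is a minimal sketch and not the method of the paper. It assumes a plain NaN-aware moving average in place of the Bayesian model with a precision and spatial-weight matrix, the classical empirical variogram along grid rows and columns in place of a fitted variogram function, and a crude "first lag reaching 95% of the maximum" rule in place of a model-based effective range. All function names, the window widths, and the 0.95 threshold are hypothetical choices made for illustration.

```python
import numpy as np


def smooth(field, width):
    """NaN-aware moving average over a (2*width+1) x (2*width+1) window."""
    valid = ~np.isnan(field)
    filled = np.where(valid, field, 0.0)
    pad_v = np.pad(filled, width, mode="constant")
    pad_w = np.pad(valid.astype(float), width, mode="constant")
    total = np.zeros_like(filled)
    count = np.zeros_like(filled)
    n_rows, n_cols = field.shape
    for di in range(2 * width + 1):
        for dj in range(2 * width + 1):
            total += pad_v[di:di + n_rows, dj:dj + n_cols]
            count += pad_w[di:di + n_rows, dj:dj + n_cols]
    # Average only over observed cells; windows with no data give 0.
    return total / np.maximum(count, 1.0)


def decompose(data, widths):
    """Split data into additive components by smoothing on increasing scales.

    Returns len(widths) detail components followed by the coarsest smooth.
    The components sum back to the data (NaN where the data are missing).
    """
    components = []
    previous = data
    for width in widths:
        smoothed = smooth(previous, width)
        components.append(previous - smoothed)  # detail lost at this scale
        previous = smoothed
    components.append(previous)  # remaining large-scale component
    return components


def empirical_variogram(field, max_lag):
    """Classical semivariance estimate at integer lags along rows and columns."""
    gammas = []
    for h in range(1, max_lag + 1):
        d_rows = field[h:, :] - field[:-h, :]
        d_cols = field[:, h:] - field[:, :-h]
        squared = np.concatenate([d_rows.ravel() ** 2, d_cols.ravel() ** 2])
        gammas.append(0.5 * np.nanmean(squared))  # pairs with NaN are skipped
    return np.array(gammas)


def rough_effective_range(gammas, fraction=0.95):
    """First lag whose semivariance reaches `fraction` of the largest value."""
    sill = np.nanmax(gammas)
    return int(np.argmax(gammas >= fraction * sill)) + 1


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Simulated grid: a smooth wave plus noise, with a few missing cells.
    x, y = np.meshgrid(np.arange(40.0), np.arange(40.0))
    data = np.sin(x / 6.0) + np.cos(y / 9.0) + 0.3 * rng.standard_normal(x.shape)
    data[rng.integers(0, 40, 20), rng.integers(0, 40, 20)] = np.nan

    for k, component in enumerate(decompose(data, widths=[1, 4])):
        gammas = empirical_variogram(component, max_lag=15)
        print(f"component {k}: rough effective range = "
              f"{rough_effective_range(gammas)} grid cells")
```

The telescoping construction in `decompose` is what makes the components additive: each detail component is the difference between two successive smooths, so their sum recovers the data exactly at observed cells. The per-component range printed at the end plays the role of the width-extent described above, although a fitted variogram model with posterior uncertainty would be needed to judge whether two ranges are credibly different.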