Paper title

A geospatial source selector for federated GeoSPARQL querying

Authors

Antonis Troumpoukis, Stasinos Konstantopoulos, Nefeli Prokopaki-Kostopoulou

Abstract

Background: Geospatial linked data brings into the scope of the Semantic Web and its technologies a wealth of datasets that combine semantically rich descriptions of resources with their geo-location. There are, however, several Semantic Web technologies where technical work is still needed to achieve the full integration of geospatial data, and federated query processing is one of them.

Methods: In this paper, we explore the idea of annotating each data source with a bounding polygon that summarizes the spatial extent of the resources it holds, and of using this summary as an (additional) source selection criterion in order to reduce the set of sources that will be tested as potentially holding relevant data. We present our source selection method, and we discuss its correctness and implementation.

Results: We evaluate the proposed source selection using three different types of summaries with different degrees of accuracy, against a baseline that uses no geospatial summaries. We use datasets and queries from a practical use case that combines crop-type data with water availability data for food security. The experimental results suggest that more complex summaries lead to slower source selection, but also to more precise exclusion of unneeded sources. Moreover, we observe that the source selection runtime is (partially or fully) recovered by shorter planning and execution runtimes. As a result, the federated sources are not burdened by pointless querying from the federation engine.

Conclusions: The evaluation draws on data and queries from the agroenvironmental domain and shows that our source selection method substantially improves the effectiveness of federated GeoSPARQL query processing.
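The idea of pruning sources by a spatial summary can be illustrated with a minimal Python sketch. It is not the authors' implementation: it assumes the simplest possible summary, an axis-aligned bounding box, and the names `BoundingBox`, `select_sources`, and the example sources are hypothetical. It also assumes that each summary fully contains every geometry in its source, so a source whose summary does not intersect the query area cannot contribute results to a query restricted to that area.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BoundingBox:
    """Axis-aligned box in (longitude, latitude); a hypothetical summary type."""
    min_lon: float
    min_lat: float
    max_lon: float
    max_lat: float

    def intersects(self, other: "BoundingBox") -> bool:
        # Two boxes are disjoint if one lies entirely to one side of the other.
        return not (
            self.max_lon < other.min_lon
            or other.max_lon < self.min_lon
            or self.max_lat < other.min_lat
            or other.max_lat < self.min_lat
        )


def select_sources(summaries, query_area):
    """Keep only the sources whose summary may overlap the query area."""
    return [name for name, box in summaries.items() if box.intersects(query_area)]


# Hypothetical sources and extents, for illustration only.
summaries = {
    "crops_north": BoundingBox(10.0, 50.0, 12.0, 52.0),
    "crops_south": BoundingBox(10.0, 46.0, 12.0, 48.0),
    "water": BoundingBox(9.0, 45.0, 13.0, 53.0),
}
query_area = BoundingBox(10.5, 50.5, 11.0, 51.0)

selected = select_sources(summaries, query_area)
print(selected)  # ['crops_north', 'water']
```

A tighter summary, such as a polygon instead of a box, would make the intersection test more expensive but would exclude more sources, which is the trade-off the results describe.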
