论文标题
基于排序的快速和精确的固定 - 拉迪乌斯邻居搜索
Fast and exact fixed-radius neighbor search based on sorting
论文作者
论文摘要
固定 - 拉迪乌斯附近的邻居搜索是一个基本数据操作,可检索用户指定距离内的所有数据点到查询点。有有效的算法可以提供快速的近似查询响应,但是它们通常具有非常密集的索引阶段,并且需要仔细的参数调整。因此,仍然广泛使用了精确的蛮力和基于树的搜索方法。在这里,我们提出了一种新的固定radius附近的邻居搜索方法,称为SNN,该方法在索引和查询时间方面显着改善了蛮力和基于树的方法,可证明可以返回确切的结果,并且不需要参数调整。 SNN通过其第一个主要组件来修饰查询搜索空间的第一个主要组件来利用数据点的排序。使用高级基本线性代数子程序(BLA)从有效的实现中获得进一步的加速。我们对我们的方法提供了理论分析,并在使用独立时以及在DBSCAN聚类算法中应用时证明了其实际性能。
Fixed-radius near neighbor search is a fundamental data operation that retrieves all data points within a user-specified distance to a query point. There are efficient algorithms that can provide fast approximate query responses, but they often have a very compute-intensive indexing phase and require careful parameter tuning. Therefore, exact brute force and tree-based search methods are still widely used. Here we propose a new fixed-radius near neighbor search method, called SNN, that significantly improves over brute force and tree-based methods in terms of index and query time, provably returns exact results, and requires no parameter tuning. SNN exploits a sorting of the data points by their first principal component to prune the query search space. Further speedup is gained from an efficient implementation using high-level Basic Linear Algebra Subprograms (BLAS). We provide theoretical analysis of our method and demonstrate its practical performance when used stand-alone and when applied within the DBSCAN clustering algorithm.