Paper Title
LANNS: A Web-Scale Approximate Nearest Neighbor Lookup System
Paper Authors
Paper Abstract
Nearest neighbor search (NNS) has a wide range of applications in information retrieval, computer vision, machine learning, databases, and other areas. The existing state-of-the-art algorithm for nearest neighbor search, Hierarchical Navigable Small World Networks (HNSW), is unable to scale to large datasets of 100M records in high dimensions. In this paper, we propose LANNS, an end-to-end platform for approximate nearest neighbor search that scales to web-scale datasets. Library for Large Scale Approximate Nearest Neighbor Search (LANNS) is deployed in multiple production systems for identifying the topK ($100 \leq topK \leq 200$) approximate nearest neighbors on large ($\sim$180M data points), high-dimensional (50-2048 dimensions) datasets, with a latency of a few milliseconds per query and a high throughput of 2.5k Queries Per Second (QPS) on a single node.
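To make the topK lookup in the abstract concrete, the sketch below shows the exact brute-force search that an approximate system trades accuracy against, along with a recall measure commonly used to compare approximate results with exact ones. This is not the LANNS or HNSW algorithm. The function names `exact_top_k` and `recall_at_k` are hypothetical, and Euclidean distance is an assumption, since the abstract does not name a distance metric.

```python
import heapq
import math
from typing import List, Sequence, Tuple


def exact_top_k(
    query: Sequence[float],
    points: Sequence[Sequence[float]],
    k: int,
) -> List[Tuple[float, int]]:
    """Return the k closest points as (distance, index) pairs, nearest first.

    Brute force: every point is scored, so the cost grows linearly with the
    dataset size. Euclidean distance is an assumption made for illustration.
    """
    scored = ((math.dist(query, p), i) for i, p in enumerate(points))
    return heapq.nsmallest(k, scored)


def recall_at_k(approx_ids: Sequence[int], exact_ids: Sequence[int]) -> float:
    """Fraction of the exact top-k neighbors that the approximate result found."""
    exact = set(exact_ids)
    if not exact:
        return 1.0
    return len(exact & set(approx_ids)) / len(exact)


# Tiny 2-dimensional example; production datasets in the abstract are far larger.
points = [(0.0, 0.0), (1.0, 0.0), (0.0, 2.0), (5.0, 5.0)]
neighbors = exact_top_k((0.1, 0.0), points, k=2)
neighbor_ids = [index for _, index in neighbors]
```

Because the exact search scans every point for each query, it becomes impractical at the dataset sizes quoted in the abstract, which is the motivation for approximate methods whose quality can then be reported as recall against this baseline.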