基于推杆的添加剂近似，以实现最佳运输

论文标题

基于推杆的添加剂近似，以实现最佳运输

A Push-Relabel Based Additive Approximation for Optimal Transport

论文作者

Lahn, Nathaniel, Raghvendra, Sharath, Zhang, Kaiyi

论文摘要

最佳运输是一种流行的距离度量，用于测量分布之间的相似性。用于计算最佳传输的精确算法可能会很慢，这激发了近似数值求解器的开发（例如，sindhorn方法）。我们引入了一种新的非常简单的组合方法，以找到ot距离的$ \ varepsilon $ - approximation。我们的算法实现了用于计算ot距离的$ o（n^2/\ varepsilon^2）$（n^2/\ varepsilon^2）的近乎最佳执行时间，对于分配问题的特殊情况，执行时间提高到$ O（n^2/\ varepsilon）$。我们的算法基于针对最小成本流问题的推送标签框架。与其他组合方法（Lahn，Mulchandani和Raghvendra，Neurips 2019）不同，我们的算法的并行执行时间为$ O（\ log n/\ varepsilon^2）。有趣的是，与Sinkhorn算法不同，我们的方法还很容易提供紧凑的运输计划，并解决了OT问题的双重配方的近似版本，它们在机器学习中都有许多应用。对于分配问题，我们同时提供了CPU实施以及利用GPU并行性的实现。实验表明，在CPU和GPU实现方面，我们的算法比sindhorn算法快，尤其是在计算高精度的匹配时。

Optimal Transport is a popular distance metric for measuring similarity between distributions. Exact algorithms for computing Optimal Transport can be slow, which has motivated the development of approximate numerical solvers (e.g. Sinkhorn method). We introduce a new and very simple combinatorial approach to find an $\varepsilon$-approximation of the OT distance. Our algorithm achieves a near-optimal execution time of $O(n^2/\varepsilon^2)$ for computing OT distance and, for the special case of the assignment problem, the execution time improves to $O(n^2/\varepsilon)$. Our algorithm is based on the push-relabel framework for min-cost flow problems. Unlike the other combinatorial approach (Lahn, Mulchandani and Raghvendra, NeurIPS 2019) which does not have a fast parallel implementation, our algorithm has a parallel execution time of $O(\log n/\varepsilon^2)$. Interestingly, unlike the Sinkhorn algorithm, our method also readily provides a compact transport plan as well as a solution to an approximate version of the dual formulation of the OT problem, both of which have numerous applications in Machine Learning. For the assignment problem, we provide both a CPU implementation as well as an implementation that exploits GPU parallelism. Experiments suggest that our algorithm is faster than the Sinkhorn algorithm, both in terms of CPU and GPU implementations, especially while computing matchings with a high accuracy.

下载PDF全文

下载文献需遵守相关版权规定

论文标题