论文标题

编辑距离算法的理论分析:应用的视角

Theoretical analysis of edit distance algorithms: an applied perspective

论文作者

Medvedev, Paul

论文摘要

鉴于其作为经典问题的地位及其对理论家和从业者的重要性,编辑距离提供了出色的镜头,可以通过这些视角了解算法的理论分析如何影响实际实现。从应用的角度来看,理论分析的目标是预测算法的经验性能,并用作设计在实践中表现良好的新型算法的标准。在本文中,我们系统地调查已应用于编辑距离并评估每个目标实现这两个目标的程度的理论分析技术的类型。这些技术包括传统的最坏情况分析,通过编辑距离或熵或可压缩性参数的最坏情况分析,平均案例分析,半随机模型和基于建议的模型。我们发现往绩是混合的。一方面,在实践中广泛使用的两种算法是从理论分析中诞生的,并且通过理论预测很好地捕获了它们的经验性能。另一方面,从那时起,所有使用理论分析作为码数开发的算法就没有任何实际相关性。我们通过讨论剩余的开放问题以及如何解决问题来结束。

Given its status as a classic problem and its importance to both theoreticians and practitioners, edit distance provides an excellent lens through which to understand how the theoretical analysis of algorithms impacts practical implementations. From an applied perspective, the goals of theoretical analysis are to predict the empirical performance of an algorithm and to serve as a yardstick to design novel algorithms that perform well in practice. In this paper, we systematically survey the types of theoretical analysis techniques that have been applied to edit distance and evaluate the extent to which each one has achieved these two goals. These techniques include traditional worst-case analysis, worst-case analysis parametrized by edit distance or entropy or compressibility, average-case analysis, semi-random models, and advice-based models. We find that the track record is mixed. On one hand, two algorithms widely used in practice have been born out of theoretical analysis and their empirical performance is captured well by theoretical predictions. On the other hand, all the algorithms developed using theoretical analysis as a yardstick since then have not had any practical relevance. We conclude by discussing the remaining open problems and how they can be tackled.

扫码加入交流群

加入微信交流群

微信交流群二维码

扫码加入学术交流群,获取更多资源