论文标题

社交距离也有好处!

Social Distancing is Good for Points too!

论文作者

Flores-Velazco, Alejandro

论文摘要

最近的邻居规则是一种众所周知的分类技术,鉴于标有标记点的训练集P,将任何未标记的查询点分类为P.最近的最接近点的标签。最近的邻居冷凝问题旨在减少训练集而不损害最近邻居规则的准确性。 FCNN是最受欢迎的凝结算法。它本质上是启发式的,理论上的结果很少。在本文中,我们解决了一个问题,即是否可以证明合理的上限对于FCNN选择的子集的大小。首先,我们表明,当要点太近时,算法的行为可能会差,迫使其选择多于必要的点。然后,我们成功修改算法以避免这种情况,从而强加选择点“保持一定距离”。这种修改足以证明有用的上限,并为算法提供了近似保证。

The nearest-neighbor rule is a well-known classification technique that, given a training set P of labeled points, classifies any unlabeled query point with the label of its closest point in P. The nearest-neighbor condensation problem aims to reduce the training set without harming the accuracy of the nearest-neighbor rule. FCNN is the most popular algorithm for condensation. It is heuristic in nature, and theoretical results for it are scarce. In this paper, we settle the question of whether reasonable upper-bounds can be proven for the size of the subset selected by FCNN. First, we show that the algorithm can behave poorly when points are too close to each other, forcing it to select many more points than necessary. We then successfully modify the algorithm to avoid such cases, thus imposing that selected points should "keep some distance". This modification is sufficient to prove useful upper-bounds, along with approximation guarantees for the algorithm.

扫码加入交流群

加入微信交流群

微信交流群二维码

扫码加入学术交流群,获取更多资源