Paper Title

Predictive Power of Nearest Neighbors Algorithm under Random Perturbation

Paper Authors

Yue Xing, Qifan Song, Guang Cheng

Paper Abstract

We consider a data corruption scenario in the classical $k$ Nearest Neighbors ($k$-NN) algorithm, that is, the testing data are randomly perturbed. Under such a scenario, the impact of corruption level on the asymptotic regret is carefully characterized. In particular, our theoretical analysis reveals a phase transition phenomenon that, when the corruption level $\omega$ is below a critical order (i.e., small-$\omega$ regime), the asymptotic regret remains the same; when it is beyond that order (i.e., large-$\omega$ regime), the asymptotic regret deteriorates polynomially. Surprisingly, we obtain a negative result that the classical noise-injection approach will not help improve the testing performance in the beginning stage of the large-$\omega$ regime, even in the level of the multiplicative constant of asymptotic regret. As a technical by-product, we prove that under different model assumptions, the pre-processed 1-NN proposed in \cite{xue2017achieving} will at most achieve a sub-optimal rate when the data dimension $d>4$ even if $k$ is chosen optimally in the pre-processing step.
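To make the corruption scenario in the abstract concrete, below is a minimal, illustrative sketch (not the paper's exact setting or analysis): it evaluates scikit-learn's $k$-NN classifier on synthetic data whose test inputs are perturbed by Gaussian noise of level $\omega$, and compares it against a simple noise-injection baseline that adds noise of the same level to the training inputs. The data-generating model, the value of $\omega$, and the choice $k \approx n^{4/(d+4)}$ are assumptions made only for this demonstration.

```python
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

rng = np.random.default_rng(0)

# Toy binary-classification data; the data-generating model is purely illustrative.
n, d = 2000, 3
X_train = rng.uniform(size=(n, d))
y_train = (X_train.sum(axis=1) + 0.1 * rng.standard_normal(n) > d / 2).astype(int)

X_test = rng.uniform(size=(500, d))
y_test = (X_test.sum(axis=1) > d / 2).astype(int)

# Corruption level omega: std. dev. of the random perturbation applied to test inputs.
omega = 0.05
X_test_corrupt = X_test + omega * rng.standard_normal(X_test.shape)

# A common regret-optimal order for k in d dimensions; an assumption for this sketch.
k = int(n ** (4 / (d + 4)))

# Plain k-NN trained on clean data, evaluated on clean vs. perturbed test points.
knn = KNeighborsClassifier(n_neighbors=k).fit(X_train, y_train)
acc_clean = knn.score(X_test, y_test)
acc_corrupt = knn.score(X_test_corrupt, y_test)

# Noise-injection baseline: perturb the training inputs at the same level omega,
# refit, and evaluate on the same perturbed test set.
X_train_noisy = X_train + omega * rng.standard_normal(X_train.shape)
knn_noisy = KNeighborsClassifier(n_neighbors=k).fit(X_train_noisy, y_train)
acc_injection = knn_noisy.score(X_test_corrupt, y_test)

print(f"clean test accuracy:      {acc_clean:.3f}")
print(f"corrupted test accuracy:  {acc_corrupt:.3f}")
print(f"noise-injection accuracy: {acc_injection:.3f}")
```

Under the paper's result, one would not expect the noise-injection baseline to improve on plain $k$-NN in the early large-$\omega$ regime; this toy experiment only illustrates the setup, not the asymptotic claim.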
