Paper Title

Stop using the elbow criterion for k-means and how to choose the number of clusters instead

Author

Schubert, Erich

Abstract

A major challenge when using k-means clustering often is how to choose the parameter k, the number of clusters. In this letter, we want to point out that it is very easy to draw poor conclusions from a common heuristic, the "elbow method". Better alternatives have been known in the literature for a long time, and we want to draw attention to some of these easy-to-use options, which often perform better. This letter is a call to stop using the elbow method altogether, because it severely lacks theoretical support, and we want to encourage educators to discuss the problems of the method (if introducing it in class at all) and teach alternatives instead, while researchers and reviewers should reject conclusions drawn from the elbow method.
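The abstract does not name the alternatives it has in mind. As a rough illustration only, the sketch below assumes one commonly used option, the variance ratio criterion (Calinski-Harabasz index), and picks the k that maximizes it instead of looking for an "elbow" in the within-cluster sum of squares. The helper names (`kmeans`, `variance_ratio`, `choose_k`), the farthest-first initialization, and the toy data are all assumptions made for this example, not taken from the letter.

```python
import random
from math import dist


def mean(pts):
    """Coordinate-wise mean of a non-empty list of points."""
    return tuple(sum(c) / len(pts) for c in zip(*pts))


def assign(points, centers):
    """Index of the nearest center for every point."""
    return [min(range(len(centers)), key=lambda j: dist(p, centers[j]))
            for p in points]


def kmeans(points, k, max_iter=100):
    """Lloyd's algorithm with a deterministic farthest-first initialization
    (an assumption of this sketch; random restarts are more common)."""
    centers = [points[0]]
    while len(centers) < k:
        centers.append(max(points, key=lambda p: min(dist(p, c) for c in centers)))
    for _ in range(max_iter):
        labels = assign(points, centers)
        updated = []
        for j in range(k):
            members = [p for p, l in zip(points, labels) if l == j]
            # An empty cluster keeps its previous center.
            updated.append(mean(members) if members else centers[j])
        if updated == centers:
            break
        centers = updated
    return assign(points, centers)


def variance_ratio(points, labels, k):
    """Variance ratio criterion: (BSS / (k - 1)) / (WSS / (n - k)).
    Requires 2 <= k < n; larger values indicate a better partition."""
    n = len(points)
    overall = mean(points)
    wss = bss = 0.0
    for j in range(k):
        members = [p for p, l in zip(points, labels) if l == j]
        if not members:
            continue
        center = mean(members)
        wss += sum(dist(p, center) ** 2 for p in members)
        bss += len(members) * dist(center, overall) ** 2
    return (bss / (k - 1)) / (wss / (n - k))


def choose_k(points, k_values):
    """Return the k with the highest variance ratio, plus all scores."""
    scores = {k: variance_ratio(points, kmeans(points, k), k) for k in k_values}
    return max(scores, key=scores.get), scores


# Toy data: three well-separated Gaussian blobs (hypothetical example data).
rng = random.Random(0)
blob_centers = [(0.0, 0.0), (20.0, 0.0), (10.0, 17.0)]
points = [(rng.gauss(cx, 1.0), rng.gauss(cy, 1.0))
          for cx, cy in blob_centers for _ in range(50)]

best_k, scores = choose_k(points, range(2, 7))
print(best_k, {k: round(s, 1) for k, s in scores.items()})
```

On this deliberately easy data set the criterion peaks at k = 3. Real data is rarely this clean, so the score is best read as one piece of evidence to compare across several values of k, not as a guarantee that a "true" number of clusters exists.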
