Paper Title
Local Connectivity in Centroid Clustering
Paper Authors
Paper Abstract
Clustering is a fundamental task in unsupervised learning that aims to group a dataset into clusters of similar objects. There has been recent interest in embedding normative considerations around fairness within clustering formulations. In this paper, we propose 'local connectivity' as a crucial factor in assessing membership desert in centroid clustering. We use local connectivity to refer to the support that an object's local neighborhood offers for its membership in the cluster in question. We motivate the need to consider the local connectivity of objects in cluster assignment and provide ways to quantify local connectivity in a given clustering. We then draw on concepts from density-based clustering to devise LOFKM, a clustering method that seeks to deepen local connectivity in clustering outputs while staying within the framework of centroid clustering. Through an empirical evaluation over real-world datasets, we show that LOFKM achieves notable improvements in local connectivity at a reasonable cost to clustering quality, illustrating the effectiveness of the method.