Paper Title
A Classification-Based Approach to Semi-Supervised Clustering with Pairwise Constraints
Authors
Abstract
In this paper, we introduce a neural network framework for semi-supervised clustering (SSC) with pairwise (must-link or cannot-link) constraints. In contrast to existing approaches, we decompose SSC into two simpler classification tasks/stages: the first stage uses a pair of Siamese neural networks to label the unlabeled pairs of points as must-link or cannot-link; the second stage feeds the fully pairwise-labeled dataset produced by the first stage into a supervised neural-network-based clustering method. The proposed approach, S3C2 (Semi-Supervised Siamese Classifiers for Clustering), is motivated by the observation that binary classification (such as assigning pairwise relations) is usually easier than multi-class clustering with partial supervision. Moreover, being classification-based, our method solves only well-defined classification problems rather than less well-specified clustering tasks. Extensive experiments on various datasets demonstrate the high performance of the proposed method.
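
The two-stage decomposition described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: a learned distance threshold stands in for the Siamese classifiers of the first stage, and union-find over the predicted must-link pairs stands in for the supervised neural clustering of the second stage. All function names and the toy data are hypothetical and chosen only for illustration.

```python
import itertools
import math


def fit_threshold(points, constraints):
    """Stage 1 stand-in (assumption): learn a distance threshold from the
    labeled pairs. `constraints` maps (i, j) -> 1 (must-link) or 0 (cannot-link).
    The paper uses Siamese neural networks here instead."""
    ml = [math.dist(points[i], points[j]) for (i, j), y in constraints.items() if y == 1]
    cl = [math.dist(points[i], points[j]) for (i, j), y in constraints.items() if y == 0]
    # Midpoint between the largest must-link and the smallest cannot-link distance.
    return (max(ml) + min(cl)) / 2.0


def label_all_pairs(points, threshold):
    """Stage 1 output: every pair labeled must-link (1) or cannot-link (0)."""
    n = len(points)
    return {
        (i, j): int(math.dist(points[i], points[j]) < threshold)
        for i, j in itertools.combinations(range(n), 2)
    }


def clusters_from_pairs(n, pair_labels):
    """Stage 2 stand-in (assumption): merge points joined by must-link pairs
    using union-find. The paper uses a supervised neural network here."""
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for (i, j), y in pair_labels.items():
        if y == 1:
            parent[find(i)] = find(j)

    # Relabel roots as consecutive cluster ids in order of first appearance.
    ids = {}
    return [ids.setdefault(find(i), len(ids)) for i in range(n)]


def two_stage_cluster(points, constraints):
    """Run both stages: complete the pairwise labels, then derive clusters."""
    threshold = fit_threshold(points, constraints)
    pair_labels = label_all_pairs(points, threshold)
    return clusters_from_pairs(len(points), pair_labels)


# Toy data: two well-separated groups, with only four labeled pairs.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
constraints = {(0, 1): 1, (3, 4): 1, (0, 3): 0, (2, 5): 0}
print(two_stage_cluster(points, constraints))  # [0, 0, 0, 1, 1, 1]
```

The sketch only mirrors the structure of the approach: a binary pairwise classifier first completes the constraint set, and the cluster assignment is then derived from fully labeled pairs rather than from partial supervision.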