Paper title
Structural Deep Clustering Network
Paper authors
Abstract
Clustering is a fundamental task in data analysis. Recently, deep clustering, which draws its inspiration primarily from deep learning, has achieved state-of-the-art performance and attracted considerable attention. Current deep clustering methods usually improve clustering results by exploiting the powerful representation ability of deep learning models such as autoencoders, which suggests that learning an effective representation is crucial for clustering. The strength of these methods lies in extracting useful representations from the data itself rather than from the structure of the data, which has received scarce attention in representation learning. Motivated by the great success of the Graph Convolutional Network (GCN) in encoding graph structure, we propose a Structural Deep Clustering Network (SDCN) to integrate structural information into deep clustering. Specifically, we design a delivery operator to transfer the representations learned by the autoencoder to the corresponding GCN layer, and a dual self-supervised mechanism to unify these two different deep neural architectures and guide the update of the whole model. In this way, the multiple structures of the data, from low-order to high-order, are naturally combined with the multiple representations learned by the autoencoder. Furthermore, we theoretically analyze the delivery operator: with it, the GCN improves the autoencoder-specific representation by acting as a high-order graph regularization constraint, and the autoencoder helps alleviate the over-smoothing problem in the GCN. Through comprehensive experiments, we demonstrate that our proposed model consistently outperforms state-of-the-art techniques.
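The delivery operator described in the abstract can be illustrated with a minimal NumPy sketch. The abstract gives no formula, so everything below is an assumption made for illustration: the linear mixing weight `eps`, the symmetric normalization with self-loops, the ReLU activation, and the function names `normalize_adjacency` and `delivery_gcn_layer` are all hypothetical rather than taken from the authors' implementation.

```python
import numpy as np


def normalize_adjacency(adj):
    """Symmetrically normalize an adjacency matrix after adding self-loops.

    Assumption: the usual GCN normalization D^{-1/2} (A + I) D^{-1/2}.
    """
    a_tilde = adj + np.eye(adj.shape[0])
    deg = a_tilde.sum(axis=1)
    d_inv_sqrt = 1.0 / np.sqrt(deg)
    return a_tilde * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]


def delivery_gcn_layer(a_hat, z_prev, h_prev, weight, eps=0.5):
    """One GCN layer whose input mixes GCN and autoencoder representations.

    z_prev: output of the previous GCN layer, shape (n, d)
    h_prev: output of the matching autoencoder layer, shape (n, d)
    eps:    hypothetical balance coefficient between the two
    """
    # Delivery step (assumed form): blend the two representations.
    mixed = (1.0 - eps) * z_prev + eps * h_prev
    # Graph convolution followed by ReLU.
    return np.maximum(a_hat @ mixed @ weight, 0.0)


# Toy example: a 4-node path graph with random 3-dimensional features.
rng = np.random.default_rng(0)
adj = np.array(
    [[0, 1, 0, 0],
     [1, 0, 1, 0],
     [0, 1, 0, 1],
     [0, 0, 1, 0]],
    dtype=float,
)
a_hat = normalize_adjacency(adj)
z_prev = rng.normal(size=(4, 3))
h_prev = rng.normal(size=(4, 3))
weight = rng.normal(size=(3, 2))

z_next = delivery_gcn_layer(a_hat, z_prev, h_prev, weight)
print(z_next.shape)
```

With `eps=0.0` the layer reduces to a plain GCN layer that ignores the autoencoder, and with `eps=1.0` it propagates only the autoencoder's representation over the graph. Intermediate values are where the two architectures would be combined in this sketch.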