Paper title
CLNode: Curriculum Learning for Node Classification
Paper authors
Abstract
Node classification is a fundamental graph-based task that aims to predict the classes of unlabeled nodes, for which Graph Neural Networks (GNNs) are the state-of-the-art methods. Current GNNs assume that nodes in the training set contribute equally during training. However, the quality of training nodes varies greatly, and the performance of GNNs could be harmed by two types of low-quality training nodes: (1) Inter-class nodes, situated near class boundaries, that lack the typical characteristics of their corresponding classes. Because GNNs are data-driven approaches, training on these nodes could degrade accuracy. (2) Mislabeled nodes. In real-world graphs, nodes are often mislabeled, which can significantly degrade the robustness of GNNs. To mitigate the detrimental effect of low-quality training nodes, we present CLNode, which employs a selective training strategy to train GNNs based on the quality of nodes. Specifically, we first design a multi-perspective difficulty measurer to accurately measure the quality of training nodes. Then, based on the measured qualities, we employ a training scheduler that selects appropriate training nodes to train the GNN in each epoch. To evaluate the effectiveness of CLNode, we conduct extensive experiments by incorporating it into six representative backbone GNNs. Experimental results on real-world networks demonstrate that CLNode is a general framework that can be combined with various GNNs to improve their accuracy and robustness.
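The selective training idea in the abstract can be illustrated with a minimal sketch. The abstract does not specify how the training scheduler chooses nodes, so everything here is an assumption for illustration only: the function names `linear_pacing` and `select_training_nodes`, the linear schedule, the `start_fraction` parameter, and the premise that some difficulty measurer has already produced one score per training node (lower meaning easier).

```python
import math


def linear_pacing(epoch, total_epochs, start_fraction=0.25):
    """Fraction of training nodes made available at `epoch`.

    Hypothetical linear schedule: starts at `start_fraction` and grows
    to 1.0 by `total_epochs`. The abstract does not name a schedule.
    """
    if total_epochs <= 0:
        raise ValueError("total_epochs must be positive")
    progress = min(1.0, epoch / total_epochs)
    return min(1.0, start_fraction + (1.0 - start_fraction) * progress)


def select_training_nodes(difficulty, epoch, total_epochs, start_fraction=0.25):
    """Return the ids of the easiest nodes allowed at this epoch.

    `difficulty` maps node id -> difficulty score (assumed to come from
    some difficulty measurer; lower score means easier node).
    """
    ranked = sorted(difficulty, key=difficulty.get)  # easiest first
    fraction = linear_pacing(epoch, total_epochs, start_fraction)
    count = max(1, math.ceil(fraction * len(ranked)))
    return ranked[:count]


# Toy difficulty scores for four training nodes (made-up values).
scores = {"a": 0.1, "b": 0.9, "c": 0.4, "d": 0.6}
early = select_training_nodes(scores, epoch=0, total_epochs=10)
middle = select_training_nodes(scores, epoch=5, total_epochs=10)
late = select_training_nodes(scores, epoch=10, total_epochs=10)
```

In this sketch the backbone GNN would compute its loss only on the returned node ids in each epoch, so harder nodes (under the assumed scores, those likelier to be inter-class or mislabeled) enter training later.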