Paper Title

Distilling Knowledge from Graph Convolutional Networks

Paper Authors

Yiding Yang, Jiayan Qiu, Mingli Song, Dacheng Tao, Xinchao Wang

Abstract

Existing knowledge distillation methods focus on convolutional neural networks (CNNs), where input samples such as images lie in a grid domain, and have largely overlooked graph convolutional networks (GCNs) that handle non-grid data. In this paper, we propose, to the best of our knowledge, the first dedicated approach to distilling knowledge from a pre-trained GCN model. To enable knowledge transfer from the teacher GCN to the student, we propose a local structure preserving module that explicitly accounts for the topological semantics of the teacher. In this module, the local structure information of both the teacher and the student is extracted as distributions, so minimizing the distance between these distributions enables topology-aware knowledge transfer from the teacher, yielding a compact yet high-performance student model. Moreover, the proposed approach is readily extendable to dynamic graph models, where the input graphs for the teacher and the student may differ. We evaluate the proposed method on two different datasets using GCN models of different architectures, and demonstrate that our method achieves state-of-the-art knowledge distillation performance for GCN models. Code is publicly available at https://github.com/ihollywhy/DistillGCN.PyTorch.
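The core idea of the local structure preserving module (turn each node's similarities to its neighbors into a distribution for both teacher and student, then minimize the divergence between the two) can be sketched as follows. This is a minimal illustration in plain Python, not the authors' implementation: the similarity kernel, the function names, and the dictionary-based graph representation are all simplifying assumptions; the released code operates on PyTorch tensors.

```python
import math

def local_structure_distribution(embeddings, neighbors):
    """For each node, softmax-normalize similarities to its neighbors into a
    distribution that encodes the node's local topology.
    embeddings: {node_id: feature vector}, neighbors: {node_id: [neighbor ids]}.
    Similarity here is negative squared Euclidean distance (an assumption;
    the paper considers kernel-based similarities)."""
    dists = {}
    for i, nbrs in neighbors.items():
        sims = [-sum((a - b) ** 2 for a, b in zip(embeddings[i], embeddings[j]))
                for j in nbrs]
        m = max(sims)                      # subtract max for numerical stability
        exps = [math.exp(s - m) for s in sims]
        z = sum(exps)
        dists[i] = [e / z for e in exps]
    return dists

def lsp_loss(teacher_emb, student_emb, neighbors):
    """Distillation objective: KL divergence between the teacher's and the
    student's local-structure distributions, averaged over nodes."""
    t = local_structure_distribution(teacher_emb, neighbors)
    s = local_structure_distribution(student_emb, neighbors)
    total = 0.0
    for i in neighbors:
        total += sum(p * math.log(p / q) for p, q in zip(t[i], s[i]) if p > 0)
    return total / len(neighbors)
```

In training, this loss would be added to the student's usual task loss (e.g., cross-entropy on node labels), so the student matches the teacher's topology-aware behavior while staying compact.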
