Similarity-aware Positive Instance Sampling for Graph Contrastive Pre-training

Paper title

Similarity-aware Positive Instance Sampling for Graph Contrastive Pre-training

Authors

Xueyi Liu, Yu Rong, Tingyang Xu, Fuchun Sun, Wenbing Huang, Junzhou Huang

Abstract

Graph instance contrastive learning has proven to be an effective task for Graph Neural Network (GNN) pre-training. However, one key issue may seriously impede the representational power of existing methods: the positive instances they create often miss crucial information about the graphs, or are even illegal instances (such as chemically invalid graphs in molecular generation). To remedy this issue, we propose to select positive graph instances directly from the existing graphs in the training set, which preserves both their legality and their similarity to the target graphs. Our selection is based on domain-specific pairwise similarity measurements, together with sampling from a hierarchical graph that encodes similarity relations among graphs. In addition, we develop an adaptive node-level pre-training method that dynamically masks nodes so that the masked nodes are evenly distributed in the graph. We conduct extensive experiments on 13 graph classification and node classification benchmark datasets from various domains. The results demonstrate that GNN models pre-trained with our strategies can outperform both models trained from scratch and the variants obtained by existing methods.
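
The central idea in the abstract, choosing positives from real training graphs rather than generating perturbed ones, can be illustrated with a minimal sketch. This is not the authors' implementation: the Jaccard measure stands in for the "domain-specific pairwise similarity", the top-k candidate pool is an assumed simplification that omits the hierarchical similarity graph, and the names `jaccard`, `sample_positives`, and the toy `features` are hypothetical.

```python
import random


def jaccard(a, b):
    """Jaccard similarity of two feature sets (assumed stand-in measure)."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def sample_positives(target_idx, features, k=2, num_samples=1, rng=None):
    """Sample positives for one target graph from its k most similar training graphs."""
    if rng is None:
        rng = random.Random(0)  # fixed seed so the sketch is reproducible
    target = features[target_idx]
    # Score every other training graph against the target.
    scored = [
        (jaccard(target, feats), idx)
        for idx, feats in enumerate(features)
        if idx != target_idx
    ]
    # Most similar first; ties are broken by the lower index.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    pool = [idx for _, idx in scored[:k]]
    # Draw without replacement from the candidate pool.
    return rng.sample(pool, min(num_samples, len(pool)))


# Hypothetical per-graph feature sets, e.g. substructure identifiers.
features = [
    {"C", "O", "ring"},
    {"C", "O", "ring", "N"},
    {"C", "O"},
    {"S", "P"},
]
positives = sample_positives(0, features, k=2, num_samples=1)
```

Because every positive is an existing training graph, it is legal by construction. The dissimilar graph at index 3 never enters the candidate pool for target 0.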
