Paper Title

KPGT: Knowledge-Guided Pre-training of Graph Transformer for Molecular Property Prediction

Authors

Li, Han; Zhao, Dan; Zeng, Jianyang

Abstract

Designing accurate deep learning models for molecular property prediction plays an increasingly essential role in drug and material discovery. Recently, due to the scarcity of labeled molecules, self-supervised learning methods for learning generalizable and transferable representations of molecular graphs have attracted lots of attention. In this paper, we argue that there exist two major issues hindering current self-supervised learning methods from obtaining desired performance on molecular property prediction, that is, the ill-defined pre-training tasks and the limited model capacity. To this end, we introduce Knowledge-guided Pre-training of Graph Transformer (KPGT), a novel self-supervised learning framework for molecular graph representation learning, to alleviate the aforementioned issues and improve the performance on the downstream molecular property prediction tasks. More specifically, we first introduce a high-capacity model, named Line Graph Transformer (LiGhT), which emphasizes the importance of chemical bonds and is mainly designed to model the structural information of molecular graphs. Then, a knowledge-guided pre-training strategy is proposed to exploit the additional knowledge of molecules to guide the model to capture the abundant structural and semantic information from large-scale unlabeled molecular graphs. Extensive computational tests demonstrated that KPGT can offer superior performance over current state-of-the-art methods on several molecular property prediction tasks.
