Paper Title

FunQG: Molecular Representation Learning Via Quotient Graphs

Authors

Hossein Hajiabolhassan, Zahra Taheri, Ali Hojatnia, Yavar Taheri Yeganeh

Abstract

Learning expressive molecular representations is crucial for accurately predicting molecular properties. Despite the significant advances of graph neural networks (GNNs) in molecular representation learning, they generally face limitations such as neighbor explosion, under-reaching, over-smoothing, and over-squashing. GNNs also tend to have high computational costs because of their large number of parameters. Typically, these limitations emerge or worsen on relatively large graphs or with deeper GNN architectures. One way to overcome these problems is to simplify a molecular graph into a small, rich, and informative one on which GNNs are more efficient and less challenging to train. To this end, we propose a novel molecular graph coarsening framework named FunQG, which uses functional groups, as influential building blocks that determine a molecule's properties, together with a graph-theoretic concept called the quotient graph. Through experiments, we show that the resulting informative graphs are much smaller than the original molecular graphs and are thus good candidates for training GNNs. We apply FunQG to popular molecular property prediction benchmarks and then compare the performance of some popular baseline GNNs on the obtained datasets with the performance of several state-of-the-art baselines on the original datasets. In these experiments, the method significantly outperforms previous baselines on various datasets while dramatically reducing the number of parameters and keeping computational costs low. Therefore, FunQG can be used as a simple, cost-effective, and robust method for solving the molecular representation learning problem.
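
To make the quotient-graph idea in the abstract concrete, here is a minimal Python sketch of the general graph-theoretic construction: the nodes are partitioned into blocks, each block becomes one super-node, and two super-nodes are joined when any original edge crosses between their blocks. This is an illustration only, not the authors' implementation. The function name `quotient_graph`, the toy molecule, and the hand-written partition are assumptions; the abstract does not say how FunQG detects functional groups or builds features for the coarsened graph.

```python
from collections import defaultdict


def quotient_graph(edges, block_of):
    """Collapse each block of a node partition into one super-node.

    edges:    iterable of (u, v) pairs of an undirected graph.
    block_of: dict mapping every node to the label of its block
              (hypothetical stand-in for a functional-group assignment).
    Returns (blocks, q_edges): the members of each block, and the set of
    undirected edges between distinct blocks.
    """
    blocks = defaultdict(set)
    for node, label in block_of.items():
        blocks[label].add(node)

    q_edges = set()
    for u, v in edges:
        bu, bv = block_of[u], block_of[v]
        if bu != bv:  # edges inside a block vanish in the quotient
            q_edges.add(frozenset((bu, bv)))
    return dict(blocks), q_edges


# Toy example (assumed for illustration): heavy atoms of acetic acid.
# 0 = methyl C, 1 = carboxyl C, 2 = carbonyl O, 3 = hydroxyl O
edges = [(0, 1), (1, 2), (1, 3)]
block_of = {0: "methyl", 1: "carboxyl", 2: "carboxyl", 3: "carboxyl"}

blocks, q_edges = quotient_graph(edges, block_of)
print(len(blocks), sorted(tuple(sorted(e)) for e in q_edges))
```

In this toy case a four-node graph with three edges collapses to two super-nodes joined by a single edge, which is the kind of size reduction the abstract refers to.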
