Paper title
Semi-Supervised Graph Learning Meets Dimensionality Reduction
Paper authors
Paper abstract
Semi-supervised learning (SSL) has recently received increased attention from machine learning researchers. By enabling effective propagation of known labels in graph-based deep learning (GDL) algorithms, SSL is poised to become an increasingly used technique in GDL in the coming years. However, the graph-based SSL literature has so far rarely explored classical dimensionality reduction techniques as a way to improve label propagation. In this work, we investigate how dimensionality reduction techniques such as PCA, t-SNE, and UMAP affect the performance of graph neural networks (GNNs) designed for semi-supervised propagation of node labels. Our study uses benchmark semi-supervised GDL datasets such as Cora and Citeseer to allow meaningful comparisons of the representations learned by each algorithm when paired with a dimensionality reduction technique. Our comprehensive benchmarks and clustering visualizations demonstrate, quantitatively and qualitatively, that under certain conditions applying a priori and a posteriori dimensionality reduction to GNN inputs and outputs, respectively, can simultaneously improve the effectiveness of semi-supervised node label propagation and node clustering. Our source code is freely available on GitHub.
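The placement of the two reduction steps described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes a small synthetic graph rather than Cora or Citeseer, uses parameter-free neighborhood averaging as a stand-in for a trained GNN, and uses PCA for both steps, where the paper also considers t-SNE and UMAP. All function and variable names are hypothetical.

```python
import numpy as np


def pca(x, n_components):
    """Project the rows of x onto their top principal components (via SVD)."""
    centered = x - x.mean(axis=0, keepdims=True)
    # Rows of vt are principal directions, sorted by decreasing singular value.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:n_components].T


def normalized_adjacency(adj):
    """Symmetrically normalize A + I, as in GCN-style propagation."""
    a_hat = adj + np.eye(adj.shape[0])
    d_inv_sqrt = 1.0 / np.sqrt(a_hat.sum(axis=1))
    return a_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]


rng = np.random.default_rng(0)
num_nodes, num_features = 60, 200  # toy sizes (assumption), not a benchmark dataset

# Synthetic stand-in for a citation graph: random symmetric edges, random features.
upper = np.triu(rng.random((num_nodes, num_nodes)) < 0.05, k=1)
adj = (upper | upper.T).astype(float)
features = rng.random((num_nodes, num_features))

# A priori reduction: compress the input features before they reach the GNN.
reduced_inputs = pca(features, n_components=16)

# Stand-in for a trained GNN: two rounds of neighborhood averaging, no learned weights.
a_norm = normalized_adjacency(adj)
embeddings = a_norm @ (a_norm @ reduced_inputs)

# A posteriori reduction: project the output embeddings to 2-D for cluster plots.
coords_2d = pca(embeddings, n_components=2)

print(reduced_inputs.shape, embeddings.shape, coords_2d.shape)
```

In a real experiment the propagation step would be replaced by a trained GNN, and `coords_2d` would be plotted with nodes colored by class to inspect cluster separation.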