Paper title

Higher order organizational features can distinguish protein interaction networks of disease classes: a case study of neoplasms and neurological diseases

Authors

Singh, Vikram, Singh, Vikram

Abstract

Neoplasms (NPs) and neurological diseases and disorders (NDDs) are among the major classes of diseases underlying the deaths of a disproportionate number of people worldwide. To determine whether there are distinctive features in the local wiring patterns of protein interactions emerging at the onset of a disease belonging to either of these two classes, we examined 112 and 175 protein interaction networks belonging to NPs and NDDs, respectively. Orbit usage profiles (OUPs) for each of these networks were enumerated by investigating the networks' local topology. 56 non-redundant OUPs (nrOUPs) were derived and used as network features for classification between these two disease classes. Four machine learning classifiers, namely k-nearest neighbour (KNN), support vector machine (SVM), deep neural network (DNN), and random forest (RF), were trained on these data. The DNN obtained the greatest average AUPRC (0.988) among these classifiers. DNNs developed on node2vec embeddings and on the proposed nrOUPs embeddings were compared using 5-fold cross-validation on the basis of the average values of six performance measures, viz., AUPRC, Accuracy, Sensitivity, Specificity, Precision and MCC. The nrOUPs-based classifier was found to perform better on all six performance measures.
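
The six performance measures named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' code: the function names, the toy labels, and the use of average precision as the AUPRC estimate (with tied scores ignored) are all assumptions made for illustration, and the abstract does not say how AUPRC was computed.

```python
import math

def threshold_metrics(y_true, y_pred):
    """Accuracy, sensitivity, specificity, precision and MCC for binary labels (1 = positive)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)

    def ratio(num, den):
        # Convention assumed here: an undefined ratio is reported as 0.0.
        return num / den if den else 0.0

    mcc_den = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return {
        "accuracy": ratio(tp + tn, tp + tn + fp + fn),
        "sensitivity": ratio(tp, tp + fn),   # true positive rate (recall)
        "specificity": ratio(tn, tn + fp),   # true negative rate
        "precision": ratio(tp, tp + fp),
        "mcc": ratio(tp * tn - fp * fn, mcc_den),
    }

def average_precision(y_true, scores):
    """Step-wise estimate of the area under the precision-recall curve.

    Assumption: scores are distinct; ties are not handled specially.
    """
    n_pos = sum(y_true)
    if n_pos == 0:
        return 0.0
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    tp = 0
    total = 0.0
    for rank, i in enumerate(order, start=1):
        if y_true[i] == 1:
            tp += 1
            total += tp / rank  # precision at each recalled positive
    return total / n_pos

def mean_over_folds(fold_metrics):
    """Average each metric over a list of per-fold metric dictionaries."""
    keys = fold_metrics[0].keys()
    return {k: sum(m[k] for m in fold_metrics) / len(fold_metrics) for k in keys}

if __name__ == "__main__":
    # Toy labels and predictions for two hypothetical folds (not data from the paper).
    fold_a = threshold_metrics([1, 1, 0, 0, 1, 0], [1, 0, 0, 1, 1, 0])
    fold_b = threshold_metrics([1, 0, 1, 0], [1, 0, 1, 0])
    print(mean_over_folds([fold_a, fold_b]))
    print(average_precision([1, 0, 1, 0], [0.9, 0.8, 0.7, 0.1]))
```

In a 5-fold setting, `threshold_metrics` and `average_precision` would be evaluated on each held-out fold and the five results passed to `mean_over_folds`, which mirrors the "average values" the abstract refers to.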
