Paper Title

Decoy Selection for Protein Structure Prediction via Extreme Gradient Boosting and Ranking

Authors

Akhter, Nasrin; Chennupati, Gopinath; Djidjev, Hristo; Shehu, Amarda

Abstract

Identifying one or more biologically active (native) decoys among millions of non-native decoys is one of the major challenges in computational structural biology. The extreme imbalance between positive and negative samples (native and non-native decoys) in a decoy set complicates the problem further. Consensus methods show varied success in decoy selection, despite issues associated with clustering large decoy sets and decoy sets that show little structural similarity. Recent investigations into energy landscape-based decoy selection approaches show promise. However, a lack of generalization across varied test cases remains a bottleneck for these methods. We propose a novel decoy selection method, ML-Select, a machine learning framework that exploits the energy landscape associated with the structure space probed through template-free decoy generation. The proposed method outperforms both clustering-based and energy ranking-based methods, while consistently offering better performance on varied test cases. Moreover, ML-Select shows promising results even for decoy sets consisting mostly of low-quality decoys. ML-Select is a useful method for decoy selection. This work suggests further research into more effective ways of adopting machine learning frameworks to achieve robust performance for decoy selection in template-free protein structure prediction.
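The abstract highlights two ingredients: a heavily imbalanced set of native and non-native decoys, and a ranking step that selects decoys by score. The sketch below illustrates only those two ideas in plain Python and is not the ML-Select implementation. The function names, the toy labels, and the toy scores are all hypothetical. The negative-to-positive ratio it computes is the kind of value commonly passed as a positive-class weight to gradient boosting libraries (for example, XGBoost's `scale_pos_weight`); whether the authors use it is an assumption.

```python
from typing import List


def imbalance_ratio(labels: List[int]) -> float:
    """Return (# non-native decoys) / (# near-native decoys).

    Labels are 1 for near-native and 0 for non-native (hypothetical encoding).
    """
    positives = sum(1 for y in labels if y == 1)
    negatives = len(labels) - positives
    if positives == 0:
        raise ValueError("decoy set contains no near-native decoys")
    return negatives / positives


def select_top_k(scores: List[float], k: int) -> List[int]:
    """Return indices of the k highest-scoring decoys, best first."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return order[:k]


def purity(selected: List[int], labels: List[int]) -> float:
    """Fraction of selected decoys that are labeled near-native."""
    if not selected:
        return 0.0
    return sum(labels[i] for i in selected) / len(selected)


# Toy decoy set: 2 near-native decoys among 10 (made-up data).
labels = [0, 0, 0, 1, 0, 0, 0, 0, 1, 0]
# Made-up model scores; higher means "more likely near-native".
scores = [0.1, 0.3, 0.2, 0.9, 0.05, 0.4, 0.15, 0.25, 0.7, 0.35]

ratio = imbalance_ratio(labels)
top3 = select_top_k(scores, 3)
top3_purity = purity(top3, labels)
```

In a real pipeline the scores would come from a trained model rather than a hand-written list, and the selected decoys would be evaluated against the known native structure.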
