Paper Title

A Cross-Level Information Transmission Network for Predicting Phenotype from New Genotype: Application to Cancer Precision Medicine

Authors

He, Di; Xie, Lei

Abstract

An unsolved fundamental problem in biology and ecology is to predict observable traits (phenotypes) from a new genetic constitution (genotype) of an organism under environmental perturbations (e.g., drug treatment). The emergence of multiple omics data provides new opportunities but poses great challenges for the predictive modeling of genotype-phenotype associations. First, the high dimensionality of genomics data and the lack of labeled data often make existing supervised learning techniques less successful. Second, integrating heterogeneous omics data from different resources is a challenging task. Finally, information transmission from DNA to phenotype involves multiple intermediate levels: RNA, protein, metabolite, etc. Higher-level features (e.g., gene expression) usually have stronger discriminative power than lower-level features (e.g., somatic mutation). To address these issues, we propose a novel Cross-LEvel Information Transmission network (CLEIT) framework. CLEIT aims to explicitly model the asymmetrical multi-level organization of the biological system. Inspired by domain adaptation, CLEIT first learns the latent representation of the high-level domain and then uses it as a ground-truth embedding to improve the representation learning of the low-level domain in the form of a contrastive loss. In addition, we adopt a pre-training-fine-tuning approach to leverage unlabeled heterogeneous omics data and improve the generalizability of CLEIT. Compared with state-of-the-art methods, CLEIT shows improved performance in predicting anti-cancer drug sensitivity from somatic mutations with the assistance of gene expression.
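
The cross-level step described in the abstract (treating the high-level embedding as a fixed target and pulling the low-level embedding toward it with a contrastive loss) can be illustrated with a small sketch. The abstract does not give the exact loss, so the InfoNCE-style formulation below is an assumption, as are the function names `cosine` and `contrastive_alignment_loss`, the `temperature` parameter, and the toy two-dimensional embeddings. It is a minimal illustration of the idea, not the authors' implementation.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def contrastive_alignment_loss(low_emb, high_emb, temperature=0.1):
    """InfoNCE-style alignment loss (assumed form, not taken from the paper).

    low_emb[i]  : embedding of sample i from the low-level encoder
                  (e.g. somatic mutations).
    high_emb[i] : fixed reference embedding of sample i from the high-level
                  encoder (e.g. gene expression), used as the target.
    Each low-level embedding is encouraged to be closer to its own sample's
    reference than to the references of the other samples in the batch.
    """
    total = 0.0
    for i, z in enumerate(low_emb):
        logits = [cosine(z, h) / temperature for h in high_emb]
        # log-sum-exp with max subtraction for numerical stability
        m = max(logits)
        log_norm = m + math.log(sum(math.exp(l - m) for l in logits))
        # negative log-probability of picking the matching reference
        total += log_norm - logits[i]
    return total / len(low_emb)


# Toy data (hypothetical): two samples with 2-D embeddings.
high = [[1.0, 0.0], [0.0, 1.0]]      # reference (high-level) embeddings
aligned = [[0.9, 0.1], [0.1, 0.9]]   # low-level embeddings near their targets
swapped = [[0.1, 0.9], [0.9, 0.1]]   # low-level embeddings near the wrong targets

loss_aligned = contrastive_alignment_loss(aligned, high)
loss_swapped = contrastive_alignment_loss(swapped, high)
print(loss_aligned, loss_swapped)
```

In this toy setting the loss is small when each low-level embedding points toward its own sample's reference and large when the pairing is swapped, which is the behavior a cross-level alignment objective of this kind is meant to reward. In a real model the high-level embeddings would come from a pre-trained encoder and only the low-level encoder would be updated.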
