Thesis title
Developing a data analysis pipeline for automated protein profiling in immunology
Thesis author
Thesis abstract
Accurate information about the protein content of an organism is instrumental for a better understanding of human biology and disease mechanisms. While the presence of certain types of proteins can be life-threatening, the abundance of others is an essential condition for an individual's overall well-being. The protein microarray is a technology that enables the parallel quantification of thousands of proteins in hundreds of human samples. In a series of studies involving protein microarrays, we have explored and implemented various data science methods for the comprehensive analysis of these data. This analysis has enabled the identification and characterisation of proteins targeted by the autoimmune reaction in patients with the APS1 condition. We have also assessed the utility of applying machine learning methods alongside statistical tests in a study based on protein expression data to evaluate potential biomarkers for endometriosis. The keystone of this work is the web tool PAWER. PAWER implements the relevant computational methods and provides a semi-automatic way to run the analysis of protein microarray data online in a drag-and-drop and click-and-play style. The source code of the tool is publicly available. The work that laid the foundation of this thesis has been instrumental for a number of subsequent studies of human disease and has also inspired a contribution to refining standards for the validation of machine learning methods in biology.