Paper Title

Privacy-preserving Artificial Intelligence Techniques in Biomedicine

Authors

Torkzadehmahani, Reihaneh, Nasirigerdeh, Reza, Blumenthal, David B., Kacprowski, Tim, List, Markus, Matschinske, Julian, Späth, Julian, Wenke, Nina Kerstin, Bihari, Béla, Frisch, Tobias, Hartebrodt, Anne, Hauschild, Anne-Christin, Heider, Dominik, Holzinger, Andreas, Hötzendorfer, Walter, Kastelitz, Markus, Mayer, Rudolf, Nogales, Cristian, Pustozerova, Anastasia, Röttger, Richard, Schmidt, Harald H. H. W., Schwalber, Ameli, Tschohl, Christof, Wohner, Andrea, Baumbach, Jan

Abstract

Artificial intelligence (AI) has been successfully applied in numerous scientific domains. In biomedicine, AI has already shown tremendous potential, e.g., in the interpretation of next-generation sequencing data and in the design of clinical decision support systems. However, training an AI model on sensitive data raises concerns about the privacy of individual participants. For example, summary statistics of a genome-wide association study can be used to determine the presence or absence of an individual in a given dataset. This considerable privacy risk has led to restrictions in accessing genomic and other biomedical data, which is detrimental to collaborative research and impedes scientific progress. Hence, there has been a substantial effort to develop AI methods that can learn from sensitive data while protecting individuals' privacy. This paper provides a structured overview of recent advances in privacy-preserving AI techniques in biomedicine. It places the most important state-of-the-art approaches within a unified taxonomy and discusses their strengths, limitations, and open problems. As the most promising direction, we suggest combining federated machine learning, as a more scalable approach, with additional privacy-preserving techniques. This would make it possible to combine their advantages and provide privacy guarantees in a distributed way for biomedical applications. Nonetheless, more research is necessary, as hybrid approaches pose new challenges such as additional network or computational overhead.
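To make the suggested hybrid direction more concrete, the sketch below illustrates one common way to combine federated averaging with a differential-privacy-style safeguard: each simulated client trains locally, and only a clipped, noise-perturbed parameter update is shared with the aggregator. This is a minimal, self-contained illustration of the general idea; the model, function names, and parameters (e.g., `clip_norm`, `noise_std`) are hypothetical choices for this example and are not taken from the paper.

```python
# Minimal illustrative sketch: federated averaging with clipped, noisy client
# updates (a DP-style safeguard). Hypothetical example, not the paper's method.
import numpy as np

def local_update(weights, X, y, lr=0.1, epochs=5):
    """One client's local training of a simple linear regression model."""
    w = weights.copy()
    for _ in range(epochs):
        grad = 2 * X.T @ (X @ w - y) / len(y)  # mean-squared-error gradient
        w -= lr * grad
    return w

def clip_and_noise(update, clip_norm=1.0, noise_std=0.1, rng=None):
    """Clip the update's norm and add Gaussian noise before it leaves the client."""
    rng = rng or np.random.default_rng()
    norm = np.linalg.norm(update)
    clipped = update * min(1.0, clip_norm / (norm + 1e-12))
    return clipped + rng.normal(0.0, noise_std, size=update.shape)

def federated_round(global_w, client_data, rng):
    """One round of federated averaging over the clients' protected updates."""
    noisy_updates = []
    for X, y in client_data:
        local_w = local_update(global_w, X, y)
        delta = local_w - global_w               # raw data never leaves the client
        noisy_updates.append(clip_and_noise(delta, rng=rng))
    return global_w + np.mean(noisy_updates, axis=0)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    true_w = np.array([2.0, -1.0])
    # Three simulated "hospitals", each holding its own private dataset.
    client_data = []
    for _ in range(3):
        X = rng.normal(size=(100, 2))
        y = X @ true_w + rng.normal(scale=0.1, size=100)
        client_data.append((X, y))
    w = np.zeros(2)
    for _ in range(50):
        w = federated_round(w, client_data, rng)
    print("estimated weights:", w)
```

The sketch also hints at the trade-off noted in the abstract: clipping and noise protect individual contributions but slow convergence, and the per-round exchange of updates adds network and computational overhead compared with centralized training.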
