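A Differentially Private Probabilistic Framework for Modeling the Variability Across Federated Datasets of Heterogeneous Multi-View Observations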

Paper title

A Differentially Private Probabilistic Framework for Modeling the Variability Across Federated Datasets of Heterogeneous Multi-View Observations

Authors

Irene Balelli, Santiago Silva, Marco Lorenzi

Abstract

We propose a novel federated learning paradigm to model data variability among heterogeneous clients in multi-centric studies. Our method is expressed through a hierarchical Bayesian latent variable model, in which client-specific parameters are assumed to be realizations of a global distribution at the master level, which is in turn estimated to account for data bias and variability across clients. We show that our framework can be effectively optimized through expectation maximization (EM) over the latent master distribution and the clients' parameters. We also introduce formal differential privacy (DP) guarantees compatible with our EM optimization scheme. We tested our method on the analysis of multi-modal medical imaging data and clinical scores from distributed clinical datasets of patients affected by Alzheimer's disease. We demonstrate that our method is robust when data is distributed in either an iid or a non-iid manner, even when perturbation of the local parameters is included to provide DP guarantees. Moreover, the variability of data, views and centers can be quantified in an interpretable manner, while guaranteeing high-quality data reconstruction as compared to state-of-the-art autoencoding models and federated learning schemes. The code is available at https://gitlab.inria.fr/epione/federated-multi-views-ppca.
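
The abstract states that local parameters are perturbed to provide DP guarantees, but does not say how. As a rough illustration only, and not the authors' implementation, the sketch below shows one common way to do this: clip the difference between a client's parameters and the current global parameters, then add Gaussian noise calibrated with the classical Gaussian mechanism. All names (`clip_by_norm`, `gaussian_sigma`, `privatize_update`) and the choice of mechanism are assumptions made for this example; the appropriate `sensitivity` value depends on the adjacency notion used and is passed in rather than derived.

```python
import numpy as np


def clip_by_norm(update, clip_norm):
    """Rescale `update` so that its L2 norm is at most `clip_norm`."""
    norm = np.linalg.norm(update)
    if norm == 0.0:
        return update
    return update * min(1.0, clip_norm / norm)


def gaussian_sigma(sensitivity, epsilon, delta):
    """Noise scale of the classical Gaussian mechanism.

    Assumption: this calibration is the textbook one, which is only
    valid for 0 < epsilon < 1.
    """
    if not 0.0 < epsilon < 1.0:
        raise ValueError("this calibration assumes 0 < epsilon < 1")
    return sensitivity * np.sqrt(2.0 * np.log(1.25 / delta)) / epsilon


def privatize_update(local_params, global_params, clip_norm,
                     sensitivity, epsilon, delta, rng):
    """Hypothetical client-side step: clip the local update, add noise."""
    update = clip_by_norm(local_params - global_params, clip_norm)
    sigma = gaussian_sigma(sensitivity, epsilon, delta)
    noise = rng.normal(0.0, sigma, size=update.shape)
    return global_params + update + noise


# Usage with made-up values
rng = np.random.default_rng(0)
local = np.array([3.0, 4.0])
shared = np.zeros(2)
noisy = privatize_update(local, shared, clip_norm=1.0,
                         sensitivity=1.0, epsilon=0.5, delta=1e-5, rng=rng)
```

Clipping bounds how much any single client's contribution can move the shared parameters, which is what makes a finite noise scale sufficient; a smaller `epsilon` or `delta` yields a larger `sigma` and therefore noisier parameters sent to the master.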
