BlindFL: Vertical Federated Machine Learning without Peeking into Your Data

Paper Title

BlindFL: Vertical Federated Machine Learning without Peeking into Your Data

Authors

Fangcheng Fu, Huanran Xue, Yong Cheng, Yangyu Tao, Bin Cui

Abstract

Due to rising concerns about privacy protection, how to build machine learning (ML) models over different data sources with security guarantees is attracting growing interest. Vertical federated learning (VFL) describes the case where ML models are built upon the private data of different participating parties that own disjoint features for the same set of instances, which fits many real-world collaborative tasks. Nevertheless, we find that existing solutions for VFL either support only limited kinds of input features or suffer from potential data leakage during the federated execution. To this end, this paper investigates both the functionality and the security of ML models in the VFL scenario. Specifically, we introduce BlindFL, a novel framework for VFL training and inference. First, to address the functionality of VFL models, we propose federated source layers to unite the data from different parties. The federated source layers efficiently support various kinds of features, including dense, sparse, numerical, and categorical features. Second, we carefully analyze the security of the federated execution and formalize the privacy requirements. Based on this analysis, we devise secure and accurate algorithm protocols, and further prove their security guarantees under the ideal-real simulation paradigm. Extensive experiments show that BlindFL supports diverse datasets and models efficiently while achieving robust privacy guarantees.
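
To make the VFL setting in the abstract concrete, the following is a minimal sketch of what "disjoint features for the same set of instances" means for a linear layer. It is not BlindFL's protocol and provides no privacy: all values are handled in plaintext, and every name in it (`party_a_features`, `partial_outputs`, and so on) is hypothetical. It only shows that a linear output over the joined features equals the sum of per-party partial outputs, which is the kind of intermediate value a secure framework would need to protect.

```python
# Illustrative only: plaintext arithmetic for a vertical feature split.
# This is NOT a secure protocol and is not taken from the paper;
# all names and numbers below are made up for the example.

def dot(xs, ws):
    """Inner product of two equal-length lists."""
    return sum(x * w for x, w in zip(xs, ws))

# Three shared instances (rows). Party A holds two feature columns,
# party B holds one different column for the same instances.
party_a_features = [[1.0, 2.0], [0.0, 1.0], [3.0, 0.5]]
party_b_features = [[4.0], [2.0], [1.0]]

# Each party keeps the weights that belong to its own columns.
party_a_weights = [0.5, -1.0]
party_b_weights = [2.0]

def partial_outputs(features, weights):
    """Per-instance linear output computed from one party's columns only."""
    return [dot(row, weights) for row in features]

def joint_linear_output(parts_a, parts_b):
    """Combine the two parties' partial outputs instance by instance."""
    return [a + b for a, b in zip(parts_a, parts_b)]

z_a = partial_outputs(party_a_features, party_a_weights)
z_b = partial_outputs(party_b_features, party_b_weights)
z = joint_linear_output(z_a, z_b)

# Reference: the same layer computed on the (hypothetically) joined data.
full_features = [a + b for a, b in zip(party_a_features, party_b_features)]
full_weights = party_a_weights + party_b_weights
z_ref = partial_outputs(full_features, full_weights)
```

Here `z` and `z_ref` agree, so splitting the columns across parties does not change the model's output. Exchanging `z_a` and `z_b` in the clear, as this sketch does, is exactly the sort of step where data leakage can occur; how to avoid that is outside the scope of this example.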
