Paper Title
Towards Fast and Accurate Federated Learning with non-IID Data for Cloud-Based IoT Applications
Paper Authors
Paper Abstract
Federated Learning (FL) trains a central model on decentralized device data while preserving user privacy, and it is becoming popular in Internet of Things (IoT) design. However, when the data collected by IoT devices are highly skewed, that is, not independent and identically distributed (non-IID), the accuracy of the vanilla FL method cannot be guaranteed. Although various solutions try to address this bottleneck of FL with non-IID data, most of them suffer from intolerable extra communication overhead and low model accuracy. To enable fast and accurate FL, this paper proposes a novel data-based device grouping approach that effectively reduces the adverse effects of weight divergence when training on non-IID data. However, since our grouping method is based on the similarity of feature maps extracted from IoT devices, it may incur additional risks of privacy exposure. To solve this problem, we propose an improved version that exploits similarity information through the Locality-Sensitive Hashing (LSH) algorithm without exposing the extracted feature maps. Comprehensive experimental results on well-known benchmarks show that our approach not only accelerates convergence but also improves prediction accuracy for FL with non-IID data.
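The idea of comparing devices through LSH signatures rather than raw feature maps can be illustrated with a minimal sketch. The abstract does not specify which LSH family or grouping rule the paper uses, so everything below is an assumption made for illustration: sign-random-projection hashing (SimHash), a Hamming-distance threshold, a greedy grouping pass, and the hypothetical names `make_hyperplanes`, `lsh_signature`, and `group_devices`.

```python
import random


def make_hyperplanes(dim, n_bits, seed=0):
    """Draw n_bits random hyperplanes; all devices are assumed to share the seed."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(n_bits)]


def lsh_signature(feature, hyperplanes):
    """Sign-random-projection hash: one bit per hyperplane (assumed LSH family)."""
    return tuple(
        1 if sum(w * x for w, x in zip(plane, feature)) >= 0.0 else 0
        for plane in hyperplanes
    )


def hamming(a, b):
    """Number of bit positions in which two signatures differ."""
    return sum(x != y for x, y in zip(a, b))


def group_devices(signatures, max_distance):
    """Greedy grouping (illustrative rule, not taken from the paper):
    a device joins the first group whose representative signature is
    within max_distance bits, otherwise it starts a new group."""
    groups = []  # list of (representative signature, [device ids])
    for device_id, sig in signatures.items():
        for rep, members in groups:
            if hamming(rep, sig) <= max_distance:
                members.append(device_id)
                break
        else:
            groups.append((sig, [device_id]))
    return [members for _, members in groups]


# Each device hashes its flattened feature map locally and uploads only the bits.
planes = make_hyperplanes(dim=4, n_bits=16, seed=42)
features = {
    "dev_a": [1.0, 2.0, -0.5, 0.3],
    "dev_b": [2.0, 4.0, -1.0, 0.6],    # same direction as dev_a
    "dev_c": [-1.0, -2.0, 0.5, -0.3],  # opposite direction to dev_a
}
signatures = {d: lsh_signature(f, planes) for d, f in features.items()}
groups = group_devices(signatures, max_distance=4)
```

In this sketch the server sees only the bit signatures, and the Hamming distance between two signatures serves as a proxy for the angle between the underlying feature vectors. The signature length and the distance threshold are arbitrary values chosen for the example.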