Self-supervised On-device Federated Learning from Unlabeled Streams

Paper Title

Self-supervised On-device Federated Learning from Unlabeled Streams

Authors

Jiahe Shi, Yawen Wu, Dewen Zeng, Jun Tao, Jingtong Hu, Yiyu Shi

Abstract

The ubiquity of edge devices has led to a growing amount of unlabeled data produced at the edge. Deep learning models deployed on edge devices must learn from this unlabeled data to continuously improve accuracy. Self-supervised representation learning has achieved promising performance using centralized unlabeled data. However, increasing awareness of privacy protection limits the centralization of the distributed unlabeled image data held on edge devices. Federated learning has been widely adopted to enable distributed machine learning with privacy preservation. Yet without a data selection method to efficiently select streaming data, the traditional federated learning framework fails to handle these huge amounts of decentralized unlabeled data given the limited storage resources at the edge. To address these challenges, we propose a Self-supervised On-device Federated learning framework with coreset selection, which we call SOFed, to automatically select a coreset of the most representative samples for the replay buffer on each device. It preserves data privacy, as each client learns good visual representations without sharing raw data. Experiments demonstrate the effectiveness and significance of the proposed method in visual representation learning.
