Paper Title

Divergence-aware Federated Self-Supervised Learning

Authors

Weiming Zhuang, Yonggang Wen, Shuai Zhang

Abstract

Self-supervised learning (SSL) is capable of learning remarkable representations from centrally available data. Recent works further implement federated learning with SSL to learn from rapidly growing decentralized unlabeled images (e.g., from cameras and phones), a setting often imposed by privacy constraints. Extensive attention has been paid to SSL approaches based on Siamese networks. However, these efforts have not yet revealed deep insights into the various fundamental building blocks of the federated self-supervised learning (FedSSL) architecture. We aim to fill this gap through an in-depth empirical study and propose a new method to tackle the non-independently and identically distributed (non-IID) data problem of decentralized data. First, we introduce a generalized FedSSL framework that embraces existing Siamese-network-based SSL methods and offers the flexibility to accommodate future methods. In this framework, a server coordinates multiple clients to conduct SSL training and periodically updates the clients' local models with the aggregated global model. Using the framework, our study uncovers unique insights into FedSSL: 1) the stop-gradient operation, previously reported to be essential, is not always necessary in FedSSL; 2) retaining the local knowledge of clients in FedSSL is particularly beneficial for non-IID data. Inspired by these insights, we propose a new model-update approach, Federated Divergence-aware Exponential Moving Average update (FedEMA). FedEMA updates the local models of clients adaptively using an EMA of the global model, where the decay rate is dynamically set according to model divergence. Extensive experiments demonstrate that FedEMA outperforms existing methods by 3-4% on linear evaluation. We hope this work will provide useful insights for future research.
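To make the divergence-aware EMA update concrete, below is a minimal sketch of what such a client-side update might look like. The function name `fedema_update`, the scaling hyperparameter `lam`, and the `min(., 1)` clamp on the decay rate are illustrative assumptions based on the abstract's description, not necessarily the paper's exact formulation.

```python
import torch

def fedema_update(local_params, global_params, lam=1.0):
    """Divergence-aware EMA update of a client's local model (illustrative sketch)."""
    # Measure divergence as the L2 distance between the flattened parameters
    # of the aggregated global model and the client's current local model.
    divergence = torch.norm(
        torch.cat([(g - l).flatten() for g, l in zip(global_params, local_params)])
    )
    # Dynamic decay rate: larger divergence -> retain more local knowledge.
    # The min(., 1) clamp and the scaling factor `lam` are assumptions here.
    mu = min(lam * divergence.item(), 1.0)
    # EMA interpolation between local and global parameters.
    return [mu * l + (1.0 - mu) * g for l, g in zip(local_params, global_params)]

# Toy usage: models represented as lists of parameter tensors.
local_model = [torch.randn(3, 3), torch.randn(3)]
global_model = [torch.randn(3, 3), torch.randn(3)]
updated_model = fedema_update(local_model, global_model, lam=0.5)
```

The key design choice this sketch captures is that a client diverging further from the global model keeps a larger share of its own weights, which is how FedEMA preserves local knowledge under non-IID data.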
