Paper Title

FOCUS: Fairness via Agent-Awareness for Federated Learning on Heterogeneous Data

Authors

Wenda Chu, Chulin Xie, Boxin Wang, Linyi Li, Lang Yin, Arash Nourian, Han Zhao, Bo Li

Abstract

Federated learning (FL) allows agents to jointly train a global model without sharing their local data. However, because local data are heterogeneous, it is challenging to optimize, or even define, the fairness of the trained global model for the agents. For instance, existing work usually treats accuracy equity as fairness across agents in FL. This notion is limited, especially in the heterogeneous setting: it is intuitively "unfair" to force agents with high-quality data to achieve accuracy similar to that of agents who contribute low-quality data, and doing so may discourage agents from participating in FL. In this work, we propose a formal FL fairness definition, fairness via agent-awareness (FAA), which takes the different contributions of heterogeneous agents into account. Under FAA, the performance of agents with high-quality data is not sacrificed merely because a large number of agents with low-quality data are present. In addition, we propose a fair FL training algorithm based on agent clustering (FOCUS) to achieve fairness in FL as measured by FAA. Theoretically, we prove the convergence and optimality of FOCUS under mild conditions for linear and general convex loss functions with bounded smoothness. We also prove that FOCUS always achieves higher fairness in terms of FAA than standard FedAvg under both linear and general convex loss functions. Empirically, we show that on four FL datasets, including synthetic data, images, and texts, FOCUS achieves significantly higher fairness in terms of FAA while maintaining competitive prediction accuracy compared with FedAvg and state-of-the-art fair FL algorithms.
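
For readers unfamiliar with the FedAvg baseline named in the abstract, the following is a minimal sketch of its aggregation step: a sample-size-weighted average of the agents' local parameters. It is not the FOCUS algorithm and does not compute FAA. The function name `fedavg` and the toy numbers are assumptions chosen for illustration only; they show how a single averaged model can be pulled toward a majority of agents whose local optimum differs from that of a minority.

```python
from typing import List, Sequence


def fedavg(client_params: Sequence[Sequence[float]],
           num_samples: Sequence[int]) -> List[float]:
    """Sample-size-weighted average of client parameter vectors."""
    if not client_params or len(client_params) != len(num_samples):
        raise ValueError("need one sample count per client")
    total = sum(num_samples)
    dim = len(client_params[0])
    # Each coordinate of the global model is the weighted mean of the
    # corresponding coordinate across all clients.
    return [
        sum(n * p[k] for p, n in zip(client_params, num_samples)) / total
        for k in range(dim)
    ]


# Hypothetical toy setup (not from the paper): one agent whose local
# optimum is 1.0 and four agents whose local optimum is 0.0, all with
# the same number of samples.
params = [[1.0]] + [[0.0]] * 4
sizes = [100] * 5
global_model = fedavg(params, sizes)
print(global_model)  # the single agent's optimum is outweighed 4 to 1
```

In this toy case the averaged parameter lands much closer to the majority's optimum than to the minority agent's, which is the kind of situation the abstract describes as motivating a clustering-based alternative.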
