Exploring Deep Reinforcement Learning-Assisted Federated Learning for Online Resource Allocation in Privacy-Persevering EdgeIoT

Paper Title

Exploring Deep Reinforcement Learning-Assisted Federated Learning for Online Resource Allocation in Privacy-Persevering EdgeIoT

Authors

Jingjing Zheng, Kai Li, Naram Mhaisen, Wei Ni, Eduardo Tovar, Mohsen Guizani

Abstract

Federated learning (FL) has been increasingly considered to preserve the privacy of training data against eavesdropping attacks in the mobile edge computing-based Internet of Things (EdgeIoT). On the one hand, the learning accuracy of FL can be improved by selecting IoT devices with large datasets for training, which gives rise to higher energy consumption. On the other hand, energy consumption can be reduced by selecting IoT devices with small datasets for FL, resulting in lower learning accuracy. In this paper, we formulate a new resource allocation problem for privacy-preserving EdgeIoT to balance the learning accuracy of FL and the energy consumption of the IoT devices. We propose a new federated learning-enabled twin-delayed deep deterministic policy gradient (FL-DLT3) framework to achieve the optimal balance between accuracy and energy in a continuous domain. Furthermore, long short-term memory (LSTM) is leveraged in FL-DLT3 to predict the time-varying network state, while FL-DLT3 is trained to select the IoT devices and allocate the transmit power. Numerical results demonstrate that the proposed FL-DLT3 achieves fast convergence (fewer than 100 iterations), while the FL accuracy-to-energy consumption ratio is improved by 51.8% compared to existing state-of-the-art benchmarks.
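The tension between accuracy and energy described in the abstract can be illustrated with a small toy model. The sketch below is not the authors' formulation: the device list, the saturating `accuracy_proxy` curve, the per-round energy figures, and the `evaluate` helper are all hypothetical, chosen only to show how an accuracy-to-energy ratio might be computed for a given device selection.

```python
import math

# Hypothetical devices: (dataset size in samples, energy per round in joules).
# These numbers are illustrative assumptions, not values from the paper.
DEVICES = [(200, 1.0), (500, 2.5), (1000, 6.0), (2000, 15.0)]


def accuracy_proxy(total_samples, scale=1000.0):
    """Toy saturating curve: more training data helps, with diminishing returns."""
    return 1.0 - math.exp(-total_samples / scale)


def evaluate(selection):
    """Return (accuracy, energy, ratio) for an iterable of selected device indices."""
    selected = [DEVICES[i] for i in selection]
    samples = sum(size for size, _ in selected)
    energy = sum(cost for _, cost in selected)
    accuracy = accuracy_proxy(samples)
    return accuracy, energy, accuracy / energy


if __name__ == "__main__":
    # Compare a small-data selection with selecting every device.
    for selection in [(0,), (0, 1), (0, 1, 2, 3)]:
        acc, energy, ratio = evaluate(selection)
        print(selection, round(acc, 3), energy, round(ratio, 4))
```

Under these assumed numbers, adding devices with larger datasets raises both the accuracy proxy and the energy cost, so the ratio depends on which devices are chosen. A learned policy such as the one the abstract describes would make this choice online, together with the transmit power allocation, rather than by direct evaluation as done here.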
