Title

Entropy Regularization for Mean Field Games with Learning

Authors

Xin Guo, Renyuan Xu, Thaleia Zariphopoulou

Abstract

Entropy regularization has been extensively adopted to improve the efficiency, stability, and convergence of algorithms in reinforcement learning. This paper analyzes, both quantitatively and qualitatively, the impact of entropy regularization on Mean Field Games (MFGs) with learning over a finite time horizon. Our study provides a theoretical justification that entropy regularization yields time-dependent policies and, furthermore, helps stabilize and accelerate convergence to the game equilibrium. In addition, this study leads to a policy-gradient algorithm for exploration in MFGs. Under this algorithm, agents are able to learn the optimal exploration schedule, with stable and fast convergence to the game equilibrium.
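
As background, a standard finite-horizon entropy-regularized objective for a representative agent takes the following form; this is a generic formulation for intuition, not necessarily the exact one used in the paper:

\[
\sup_{\pi}\; \mathbb{E}\!\left[\int_0^T \Big( r(t, X_t, a_t, \mu_t) + \lambda\, \mathcal{H}\big(\pi(\cdot \mid t, X_t)\big) \Big)\, dt + g(X_T, \mu_T) \right],
\qquad
\mathcal{H}(\pi) = -\int_{\mathcal{A}} \pi(a) \ln \pi(a)\, da,
\]

where \(\mu_t\) is the population (mean field) distribution, \(\pi\) is a randomized (relaxed) control, and \(\lambda > 0\) is the exploration temperature weighting the Shannon entropy bonus \(\mathcal{H}\).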

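To illustrate the stabilization effect the abstract describes, below is a minimal, self-contained sketch, not the paper's algorithm, of entropy-regularized fixed-point iteration for a toy one-shot mean field game over a finite action set. The congestion reward, the temperature lam, and the damping parameter are illustrative assumptions; the softmax (Boltzmann) best response is the standard consequence of adding a Shannon entropy bonus.

import numpy as np

# Toy mean field game: each agent picks one of n actions and receives
# r(a, mu) = base[a] - crowd * mu[a], so an action loses value as more
# of the population chooses it. With entropy regularization at
# temperature lam, the best response to mu is the softmax policy
# pi(a) proportional to exp(r(a, mu) / lam), which smooths the iteration.
def entropy_regularized_equilibrium(base, crowd=1.0, lam=0.5,
                                    damping=0.5, tol=1e-10, max_iter=1000):
    n = len(base)
    mu = np.full(n, 1.0 / n)           # initial population distribution
    for it in range(max_iter):
        rewards = base - crowd * mu    # per-action reward given the crowd
        z = rewards / lam
        pi = np.exp(z - z.max())       # softmax best response (stable form)
        pi /= pi.sum()
        mu_next = (1 - damping) * mu + damping * pi   # damped update
        if np.abs(mu_next - mu).max() < tol:
            return mu_next, it
        mu = mu_next
    return mu, max_iter

mu_star, iters = entropy_regularized_equilibrium(np.array([1.0, 0.8, 0.5]))
print(mu_star, iters)

Lowering lam sharpens the best response toward a hard argmax and the iteration tends to oscillate or converge more slowly, which is consistent with the abstract's claim that the entropy term stabilizes and accelerates convergence to equilibrium.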