Paper Title
Learning Individual Policies in Large Multi-agent Systems through Local Variance Minimization
Paper Authors
Paper Abstract
In multi-agent systems with a large number of agents, the contribution of each individual agent to the value of other agents is typically minimal (e.g., aggregation systems such as Uber and Deliveroo). In this paper, we consider such multi-agent systems in which each agent is self-interested and makes a sequence of decisions, and we represent them as a Stochastic Non-atomic Congestion Game (SNCG). We derive key properties of equilibrium solutions in the SNCG model with non-atomic and nearly non-atomic agents. Building on these equilibrium properties, we provide a novel Multi-Agent Reinforcement Learning (MARL) mechanism that minimizes the variance across the values of agents in the same state. To demonstrate the utility of this new mechanism, we provide detailed results on a real-world taxi dataset and on a generic simulator for aggregation systems. We show that our approach reduces the variance in revenues earned by taxi drivers, while still providing higher joint revenues than leading approaches.
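To make the core idea concrete, below is a minimal sketch of what "minimizing variance across values of agents in the same state" could look like as an auxiliary training loss. This is not the paper's actual algorithm; the function name `variance_penalty`, the tensor layout, and the weighting coefficient `lam` are all illustrative assumptions.

```python
import torch

def variance_penalty(values: torch.Tensor, state_ids: torch.Tensor) -> torch.Tensor:
    """Hypothetical auxiliary loss (not the paper's exact formulation).

    values:    per-agent value estimates, shape (num_agents,)
    state_ids: discrete state index of each agent, shape (num_agents,)

    Returns the variance of value estimates among agents sharing the
    same state, averaged over the states present in the batch.
    """
    unique_states = state_ids.unique()
    penalty = values.new_zeros(())
    for s in unique_states:
        group = values[state_ids == s]
        if group.numel() > 1:  # variance is undefined for a single agent
            penalty = penalty + group.var()
    return penalty / unique_states.numel()

# Illustrative usage: add the penalty to a standard policy loss,
# where `lam` is an assumed trade-off hyperparameter.
# total_loss = policy_loss + lam * variance_penalty(agent_values, agent_states)
```

The intuition, under these assumptions, is that pushing same-state value estimates toward each other discourages policies in which some drivers capture much higher revenue than others in identical situations, which is consistent with the abstract's reported reduction in revenue variance.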