Paper Title
Moving Forward in Formation: A Decentralized Hierarchical Learning Approach to Multi-Agent Moving Together
Paper Authors
Paper Abstract
Multi-agent path finding in formation has many potential real-world applications, such as mobile warehouse robots. However, previous multi-agent path finding (MAPF) methods hardly take formation into consideration. Furthermore, they are usually centralized planners that require the whole state of the environment. Other decentralized, partially observable approaches to MAPF are reinforcement learning (RL) methods. However, these RL methods encounter difficulties when learning the path finding and formation problems at the same time. In this paper, we propose a novel decentralized, partially observable RL algorithm that uses a hierarchical structure to decompose the multi-objective task into unrelated subtasks. It also calculates a theoretical weight that gives every task reward equal influence on the final RL value function. Additionally, we introduce a communication method that helps agents cooperate with each other. Experiments in simulation show that our method outperforms other end-to-end RL methods and naturally scales to large world sizes where centralized planners struggle. We also deploy and validate our method in a real-world scenario.
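The abstract's idea of a weight that equalizes each task reward's influence on the value function can be illustrated with a minimal sketch. This is not the paper's actual formula; it assumes each task reward has a known magnitude bound and scales rewards by the inverse of their maximum discounted return, so every task's contribution to the combined value is bounded identically.

```python
# Hedged sketch, NOT the paper's method: equalize the influence of
# per-task rewards on a discounted value function by inverse scaling.

def equalizing_weights(reward_bounds, gamma=0.99):
    """reward_bounds: max |reward| per task (assumed known).
    Task i's discounted return is bounded by reward_bounds[i] / (1 - gamma),
    so weighting by (1 - gamma) / reward_bounds[i] caps every weighted
    return at 1, giving each task equal maximum influence on the value."""
    return [(1.0 - gamma) / r for r in reward_bounds]

def combined_reward(task_rewards, weights):
    """Scalar reward fed to the RL value function: weighted sum of tasks."""
    return sum(w * r for w, r in zip(weights, task_rewards))

# Example: path-finding, formation, and collision rewards with very
# different raw scales all end up with the same bounded contribution.
w = equalizing_weights([1.0, 10.0, 0.5])
r = combined_reward([1.0, 10.0, 0.5], w)
```

Each weighted term in `r` equals `1 - gamma` here, so no single task's raw reward scale dominates the learned value function.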