Paper Title

Entity Divider with Language Grounding in Multi-Agent Reinforcement Learning

Authors

Ziluo Ding, Wanpeng Zhang, Junpeng Yue, Xiangjun Wang, Tiejun Huang, Zongqing Lu

Abstract

We investigate the use of natural language to drive the generalization of policies in multi-agent settings. Unlike single-agent settings, the generalization of policies should also consider the influence of other agents. Besides, with the increasing number of entities in multi-agent settings, more agent-entity interactions are needed for language grounding, and the enormous search space could impede the learning process. Moreover, given a simple general instruction, e.g., beating all enemies, agents are required to decompose it into multiple subgoals and figure out the right one to focus on. Inspired by previous work, we try to address these issues at the entity level and propose a novel framework for language grounding in multi-agent reinforcement learning, entity divider (EnDi). EnDi enables agents to independently learn subgoal division at the entity level and act in the environment based on the associated entities. The subgoal division is regularized by opponent modeling to avoid subgoal conflicts and promote coordinated strategies. Empirically, EnDi demonstrates strong generalization to unseen games with new dynamics and outperforms existing methods.
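The core idea described in the abstract — dividing a general instruction into entity-level subgoals while using a model of other agents to avoid conflicts — can be illustrated with a minimal sketch. This is not the paper's implementation; the scoring function, the `divide_subgoals` helper, and the conflict-penalty mechanism are simplified assumptions made purely for illustration:

```python
# Hypothetical sketch of entity-level subgoal division (NOT EnDi's actual
# algorithm): each agent scores entities by relevance, then discounts any
# entity that a modeled teammate is predicted to pursue, so the team
# spreads over different subgoals instead of colliding on one.

def divide_subgoals(agent_scores, predicted_other_choice, conflict_penalty=1.0):
    """Pick the entity this agent should focus on.

    agent_scores: dict mapping entity id -> relevance score for this agent.
    predicted_other_choice: entity id another (modeled) agent is predicted
        to pursue; choosing the same entity incurs a penalty.
    """
    adjusted = {}
    for entity, score in agent_scores.items():
        penalty = conflict_penalty if entity == predicted_other_choice else 0.0
        adjusted[entity] = score - penalty
    # Act based on the associated entity with the best adjusted score.
    return max(adjusted, key=adjusted.get)

# Example: a teammate is predicted to attack "enemy_a", so this agent
# switches focus to "enemy_b" even though "enemy_a" scores higher alone.
choice = divide_subgoals({"enemy_a": 0.9, "enemy_b": 0.7}, "enemy_a")
print(choice)
```

In the paper this regularization is learned via opponent modeling inside the reinforcement-learning loop rather than applied as a fixed penalty at decision time; the sketch only conveys the intuition of conflict-free subgoal division.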
