Auxiliary Tasks Speed Up Learning PointGoal Navigation

Paper title

Auxiliary Tasks Speed Up Learning PointGoal Navigation

Authors

Joel Ye, Dhruv Batra, Erik Wijmans, Abhishek Das

Abstract

PointGoal Navigation is an embodied task that requires agents to navigate to a specified point in an unseen environment. Wijmans et al. showed that this task is solvable, but their method is computationally prohibitive, requiring 2.5 billion frames and 180 GPU-days. In this work, we develop a method to significantly increase sample and time efficiency in learning PointNav using self-supervised auxiliary tasks (e.g. predicting the action taken between two egocentric observations, predicting the distance between two observations from a trajectory, etc.). We find that naively combining multiple auxiliary tasks improves sample efficiency, but only provides marginal gains beyond a point. To overcome this, we use attention to combine representations learnt from individual auxiliary tasks. Our best agent is 5.5x faster to reach the performance of the previous state-of-the-art, DD-PPO, at 40M frames, and improves on DD-PPO's performance at 40M frames by 0.16 SPL. Our code is publicly available at https://github.com/joel99/habitat-pointnav-aux.
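
The abstract reports its gain in SPL (Success weighted by Path Length) without defining it. As commonly defined for embodied navigation, SPL averages, over episodes, the success indicator multiplied by the ratio of the shortest-path length to the larger of the agent's path length and the shortest-path length. The sketch below is only an illustration of that definition; the function name `spl` and its argument names are assumptions and are not taken from the authors' repository.

```python
from typing import Sequence


def spl(
    successes: Sequence[bool],
    shortest_lengths: Sequence[float],
    path_lengths: Sequence[float],
) -> float:
    """Success weighted by Path Length, averaged over episodes.

    Hypothetical helper for illustration only; not the authors' code.
    successes[i]        -- whether episode i reached the goal
    shortest_lengths[i] -- shortest-path distance from start to goal
    path_lengths[i]     -- distance the agent actually travelled
    """
    n = len(successes)
    if n == 0:
        raise ValueError("at least one episode is required")
    if not (n == len(shortest_lengths) == len(path_lengths)):
        raise ValueError("all inputs must have the same length")

    total = 0.0
    for success, shortest, taken in zip(successes, shortest_lengths, path_lengths):
        if not success:
            continue  # failed episodes contribute 0
        denom = max(taken, shortest)
        # Assumption: an episode that starts on the goal (denom == 0) counts as 1.
        total += shortest / denom if denom > 0 else 1.0
    return total / n


if __name__ == "__main__":
    # One optimal success, one success with a path twice as long, one failure.
    print(spl([True, True, False], [10.0, 10.0, 10.0], [10.0, 20.0, 5.0]))
```

Under this definition, an improvement of 0.16 SPL means the average success-weighted path efficiency rose by 0.16 on a scale from 0 to 1.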
