Paper title
Multi-Object Navigation with dynamically learned neural implicit representations
Paper authors
Paper abstract
Understanding and mapping a new environment are core abilities of any autonomously navigating agent. While classical robotics usually estimates maps in a stand-alone manner with SLAM variants, which maintain a topological or metric representation, end-to-end learning of navigation keeps some form of memory in a neural network. Networks are typically imbued with inductive biases, which can range from vectorial representations to bird's-eye metric tensors or topological structures. In this work, we propose to structure neural networks with two neural implicit representations, which are learned dynamically during each episode and map the content of the scene: (i) the Semantic Finder predicts the position of a previously seen queried object; (ii) the Occupancy and Exploration Implicit Representation encapsulates information about explored area and obstacles, and is queried with a novel global read mechanism which directly maps from function space to a usable embedding space. Both representations are learned online during each episode and are leveraged by an agent trained with Reinforcement Learning (RL). We evaluate the agent on Multi-Object Navigation and show the high impact of using neural implicit representations as a memory source.
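To make the idea of an implicit representation learned online within an episode more concrete, the following is a minimal sketch in plain Python. It is not the architecture from the paper: the class name `TinySemanticFinder`, the one-hot queries, the 2D floor-plane positions, the network size, and the learning rate are all assumptions chosen for illustration. It only shows the general pattern of a small network that maps a query describing an object to a predicted position and is fitted by gradient steps on observations gathered during one episode.

```python
import math
import random


class TinySemanticFinder:
    """Hypothetical stand-in: a one-hidden-layer MLP mapping a query vector
    to a 2D position. Sizes and initialization are illustrative assumptions."""

    def __init__(self, query_dim, hidden_dim=16, out_dim=2, seed=0):
        rng = random.Random(seed)
        self.w1 = [[rng.gauss(0.0, 0.5) for _ in range(hidden_dim)]
                   for _ in range(query_dim)]
        self.b1 = [0.0] * hidden_dim
        self.w2 = [[rng.gauss(0.0, 0.5) for _ in range(out_dim)]
                   for _ in range(hidden_dim)]
        self.b2 = [0.0] * out_dim

    def _hidden(self, q):
        # tanh hidden layer
        return [math.tanh(sum(q[i] * self.w1[i][j] for i in range(len(q)))
                          + self.b1[j])
                for j in range(len(self.b1))]

    def _output(self, h):
        # linear output layer
        return [sum(h[j] * self.w2[j][k] for j in range(len(h))) + self.b2[k]
                for k in range(len(self.b2))]

    def predict(self, q):
        return self._output(self._hidden(q))

    def update(self, q, pos, lr=0.05):
        """One SGD step on half the squared error for a single observation."""
        h = self._hidden(q)
        err = [p - t for p, t in zip(self._output(h), pos)]
        # Backpropagate into the hidden layer before w2 is modified.
        dh = [sum(err[k] * self.w2[j][k] for k in range(len(err)))
              * (1.0 - h[j] ** 2)
              for j in range(len(h))]
        for j in range(len(h)):
            for k in range(len(err)):
                self.w2[j][k] -= lr * err[k] * h[j]
        for k in range(len(err)):
            self.b2[k] -= lr * err[k]
        for i in range(len(q)):
            for j in range(len(h)):
                self.w1[i][j] -= lr * dh[j] * q[i]
        for j in range(len(h)):
            self.b1[j] -= lr * dh[j]
        return 0.5 * sum(e * e for e in err)


# Toy "episode": three object classes (one-hot queries) seen at fixed,
# made-up floor-plane positions. The representation starts untrained and
# is fitted from scratch on these observations.
queries = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
positions = [[1.0, 2.0], [-2.0, 0.5], [0.5, -1.5]]

finder = TinySemanticFinder(query_dim=3)
for _ in range(300):
    for q, p in zip(queries, positions):
        finder.update(q, p)

for q, p in zip(queries, positions):
    print(p, [round(v, 3) for v in finder.predict(q)])
```

In this toy setting the memory of where objects were seen lives entirely in the network weights, which is the sense in which such a representation can act as a memory source. The sketch omits everything else described in the abstract, including the Occupancy and Exploration Implicit Representation, the global read mechanism, and the RL-trained agent.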