Paper Title

Object Goal Navigation using Data Regularized Q-Learning

Authors

Nandiraju Gireesh, D. A. Sasi Kiran, Snehasis Banerjee, Mohan Sridharan, Brojeshwar Bhowmick, Madhava Krishna

Abstract

Object Goal Navigation requires a robot to find and navigate to an instance of a target object class in a previously unseen environment. Our framework incrementally builds a semantic map of the environment over time, and then repeatedly selects a long-term goal ('where to go') based on the semantic map to locate the target object instance. Long-term goal selection is formulated as a vision-based deep reinforcement learning problem. Specifically, an Encoder Network is trained to extract high-level features from a semantic map and select a long-term goal. In addition, we incorporate data augmentation and Q-function regularization to make the long-term goal selection more effective. We report experimental results using the photo-realistic Gibson benchmark dataset in the AI Habitat 3D simulation environment to demonstrate substantial performance improvement on standard measures in comparison with a state-of-the-art data-driven baseline.
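The abstract's combination of data augmentation and Q-function regularization follows the general pattern of data-regularized Q-learning (DrQ): the Q-target is averaged over several randomly augmented copies of the observation, which smooths the value estimate. The sketch below illustrates only that idea on a toy semantic-map observation; the `random_shift` augmentation, the linear `q_values` stand-in, and all parameter names are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

def random_shift(obs, pad=4):
    """Random-shift augmentation: pad the map, then crop at a random offset."""
    h, w = obs.shape
    padded = np.pad(obs, pad, mode="edge")
    y, x = rng.integers(0, 2 * pad + 1, size=2)  # offsets in [0, 2*pad]
    return padded[y:y + h, x:x + w]

def q_values(obs, weights):
    """Stand-in Q-network: a linear map from the flattened observation."""
    return obs.ravel() @ weights  # shape: (num_actions,)

def regularized_q_target(next_obs, reward, weights, gamma=0.99, k=2):
    """DrQ-style target: average the max-Q over k augmented copies."""
    targets = [reward + gamma * q_values(random_shift(next_obs), weights).max()
               for _ in range(k)]
    return float(np.mean(targets))

# Toy 8x8 semantic-map observation and a random linear "Q-network".
obs = rng.random((8, 8))
w = rng.random((64, 4))
print(regularized_q_target(obs, reward=1.0, weights=w))
```

In the full method the linear stand-in would be replaced by the trained Encoder Network, and the same averaging trick can also be applied to the Q-estimate on the current observation.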
