Last-Mile Embodied Visual Navigation

Paper title

Last-Mile Embodied Visual Navigation

Authors

Justin Wasserman, Karmesh Yadav, Girish Chowdhary, Abhinav Gupta, Unnat Jain

Abstract

Realistic long-horizon tasks like image-goal navigation involve exploratory and exploitative phases. Assigned with an image of the goal, an embodied agent must explore to discover the goal, i.e., search efficiently using learned priors. Once the goal is discovered, the agent must accurately calibrate the last-mile of navigation to the goal. As with any robust system, switches between exploratory goal discovery and exploitative last-mile navigation enable better recovery from errors. Following these intuitive guide rails, we propose SLING to improve the performance of existing image-goal navigation systems. Entirely complementing prior methods, we focus on last-mile navigation and leverage the underlying geometric structure of the problem with neural descriptors. With simple but effective switches, we can easily connect SLING with heuristic, reinforcement learning, and neural modular policies. On a standardized image-goal navigation benchmark (Hahn et al. 2021), we improve performance across policies, scenes, and episode complexity, raising the state-of-the-art from 45% to 55% success rate. Beyond photorealistic simulation, we conduct real-robot experiments in three physical scenes and find these improvements to transfer well to real environments.
