Paper Title
Vision-and-Language Navigation: A Survey of Tasks, Methods, and Future Directions
Paper Authors
Abstract
A long-term goal of AI research is to build intelligent agents that can communicate with humans in natural language, perceive the environment, and perform real-world tasks. Vision-and-Language Navigation (VLN) is a fundamental and interdisciplinary research topic toward this goal, and it is receiving increasing attention from the natural language processing, computer vision, robotics, and machine learning communities. In this paper, we review contemporary studies in the emerging field of VLN, covering tasks, evaluation metrics, methods, etc. Through a structured analysis of current progress and challenges, we highlight the limitations of current VLN and opportunities for future work. This paper serves as a thorough reference for the VLN research community.