RoboTHOR: An Open Simulation-to-Real Embodied AI Platform

Paper Title

RoboTHOR: An Open Simulation-to-Real Embodied AI Platform

Authors

Matt Deitke, Winson Han, Alvaro Herrasti, Aniruddha Kembhavi, Eric Kolve, Roozbeh Mottaghi, Jordi Salvador, Dustin Schwenk, Eli VanderBilt, Matthew Wallingford, Luca Weihs, Mark Yatskar, Ali Farhadi

Abstract

Visual recognition ecosystems (e.g. ImageNet, Pascal, COCO) have undeniably played a prevailing role in the evolution of modern computer vision. We argue that interactive and embodied visual AI has reached a stage of development similar to visual recognition prior to the advent of these ecosystems. Recently, various synthetic environments have been introduced to facilitate research in embodied AI. Notwithstanding this progress, the crucial question of how well models trained in simulation generalize to reality has remained largely unanswered. The creation of a comparable ecosystem for simulation-to-real embodied AI presents many challenges: (1) the inherently interactive nature of the problem, (2) the need for tight alignments between real and simulated worlds, (3) the difficulty of replicating physical conditions for repeatable experiments, (4) and the associated cost. In this paper, we introduce RoboTHOR to democratize research in interactive and embodied visual AI. RoboTHOR offers a framework of simulated environments paired with physical counterparts to systematically explore and overcome the challenges of simulation-to-real transfer, and a platform where researchers across the globe can remotely test their embodied models in the physical world. As a first benchmark, our experiments show there exists a significant gap between the performance of models trained in simulation when they are tested in both simulations and their carefully constructed physical analogs. We hope that RoboTHOR will spur the next stage of evolution in embodied computer vision. RoboTHOR can be accessed at the following link: https://ai2thor.allenai.org/robothor
