On Pre-Training for Visuo-Motor Control: Revisiting a Learning-from-Scratch Baseline

Paper Title

On Pre-Training for Visuo-Motor Control: Revisiting a Learning-from-Scratch Baseline

Authors

Nicklas Hansen, Zhecheng Yuan, Yanjie Ze, Tongzhou Mu, Aravind Rajeswaran, Hao Su, Huazhe Xu, Xiaolong Wang

Abstract

In this paper, we examine the effectiveness of pre-training for visuo-motor control tasks. We revisit a simple Learning-from-Scratch (LfS) baseline that incorporates data augmentation and a shallow ConvNet, and find that this baseline is surprisingly competitive with recent approaches (PVR, MVP, R3M) that leverage frozen visual representations trained on large-scale vision datasets -- across a variety of algorithms, task domains, and metrics in simulation and on a real robot. Our results demonstrate that these methods are hindered by a significant domain gap between the pre-training datasets and current benchmarks for visuo-motor control, which is alleviated by finetuning. Based on our findings, we provide recommendations for future research in pre-training for control and hope that our simple yet strong baseline will aid in accurately benchmarking progress in this area.
