Mutual Information Maximization for Robust Plannable Representations

Paper Title

Mutual Information Maximization for Robust Plannable Representations

Authors

Yiming Ding, Ignasi Clavera, Pieter Abbeel

Abstract

Extending the capabilities of robotics to complex, unstructured real-world environments requires developing better perception systems while maintaining low sample complexity. When dealing with high-dimensional state spaces, current methods are either model-free or model-based with reconstruction objectives. The sample inefficiency of the former is a major barrier to applying them in the real world. The latter, while offering low sample complexity, learn latent spaces that must reconstruct every single detail of the scene. In real environments, the task typically involves only a small fraction of the scene. Reconstruction objectives suffer in such scenarios because they capture all the unnecessary components. In this work, we present MIRO, an information-theoretic representation learning algorithm for model-based reinforcement learning. We design a latent space that maximizes the mutual information with future information while still capturing all the information needed for planning. We show that our approach is more robust than reconstruction objectives in the presence of distractors and cluttered scenes.

