Curriculum Learning with Hindsight Experience Replay for Sequential Object Manipulation Tasks

Paper title

Curriculum Learning with Hindsight Experience Replay for Sequential Object Manipulation Tasks

Authors

Binyamin Manela, Armin Biess

Abstract

Learning complex tasks from scratch is challenging and often impossible for humans as well as for artificial agents. A curriculum can be used instead, which decomposes a complex task (target task) into a sequence of source tasks (the curriculum). Each source task is a simplified version of the next source task with increasing complexity. Learning then occurs gradually by training on each source task while using knowledge from the curriculum's prior source tasks. In this study, we present a new algorithm that combines curriculum learning with Hindsight Experience Replay (HER), to learn sequential object manipulation tasks for multiple goals and sparse feedback. The algorithm exploits the recurrent structure inherent in many object manipulation tasks and implements the entire learning process in the original simulation without adjusting it to each source task. We have tested our algorithm on three challenging throwing tasks and show vast improvements compared to vanilla-HER.
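The abstract names Hindsight Experience Replay (HER) and sparse feedback without describing how they work. The sketch below illustrates only the general hindsight relabeling idea, using the common "final" strategy: a failed episode is stored a second time as if the goal it actually reached had been the intended one. It is not the authors' curriculum algorithm. The names `Transition`, `sparse_reward`, `relabel_with_final_goal`, and the tolerance `tol` are hypothetical and chosen for illustration.

```python
from dataclasses import dataclass, replace
from typing import List, Tuple

Goal = Tuple[float, ...]


@dataclass(frozen=True)
class Transition:
    """One step of an episode (hypothetical container, not from the paper)."""
    state: Tuple[float, ...]
    action: int
    achieved_goal: Goal  # goal actually reached after taking the action
    desired_goal: Goal   # goal the agent was asked to reach
    reward: float


def sparse_reward(achieved: Goal, desired: Goal, tol: float = 0.05) -> float:
    """Sparse feedback: 0.0 when within tol of the goal, -1.0 otherwise."""
    dist = sum((a - d) ** 2 for a, d in zip(achieved, desired)) ** 0.5
    return 0.0 if dist <= tol else -1.0


def relabel_with_final_goal(episode: List[Transition]) -> List[Transition]:
    """Relabel every step with the goal achieved at the end of the episode."""
    if not episode:
        return []
    hindsight_goal = episode[-1].achieved_goal
    return [
        replace(
            t,
            desired_goal=hindsight_goal,
            reward=sparse_reward(t.achieved_goal, hindsight_goal),
        )
        for t in episode
    ]


# A failed episode: the agent was asked to reach 1.0 but only got to 0.5.
episode = [
    Transition((0.0,), 1, (0.2,), (1.0,), -1.0),
    Transition((0.2,), 1, (0.5,), (1.0,), -1.0),
]
relabeled = relabel_with_final_goal(episode)
```

After relabeling, the last transition carries a success reward for the substituted goal, so the replay buffer contains a useful learning signal even though the original goal was never reached.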
