Imitation Learning for Autonomous Trajectory Learning of Robot Arms in Space

Paper Title

Imitation Learning for Autonomous Trajectory Learning of Robot Arms in Space

Authors

Shyam, RB Ashith; Hao, Zhou; Montanaro, Umberto; Neumann, Gerhard

Abstract

This work adds to the ongoing efforts to give space robots more autonomy. Programming by demonstration, also known as imitation learning, is used for trajectory planning of manipulators mounted on small spacecraft. To enable greater autonomy in future space missions with minimal human intervention from ground control, a robot arm with 7 degrees of freedom (DoF) is envisaged for tasks such as debris removal, on-orbit servicing, and assembly. Because reproducing a microgravity environment in hardware is extremely expensive, the demonstration data for trajectory learning are generated using a model predictive controller (MPC) in a physics-based simulator. The data are then encoded compactly as Probabilistic Movement Primitives (ProMPs). This offline trajectory learning allows faster reproduction and avoids computationally expensive optimization after deployment in space. It is shown that conditioning the learned probability distribution generates trajectories for previously unseen situations. The motion of the robot arm (manipulator) induces reaction forces on the spacecraft hub and changes its attitude, prompting the Attitude Determination and Control System (ADCS) to take large corrective actions that drain energy from the system. A robot arm with redundant DoF admits several possible trajectories from the same start to the same target. This allows the ProMP trajectory generator to select a trajectory that is obstacle-free and causes minimal attitude disturbance, thereby reducing the load on the ADCS.
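
The ProMP encoding and conditioning steps summarized in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes a single joint rather than the full 7-DoF arm, normalized Gaussian basis functions, and synthetic minimum-jerk curves standing in for the MPC-generated demonstrations. The function names `rbf_features`, `fit_promp`, and `condition`, and all parameter values, are hypothetical.

```python
import numpy as np

def rbf_features(z, n_basis=10, width=0.02):
    """Normalized Gaussian basis functions at phases z in [0, 1]; shape (T, n_basis)."""
    z = np.atleast_1d(z)
    centers = np.linspace(0.0, 1.0, n_basis)
    phi = np.exp(-0.5 * (z[:, None] - centers[None, :]) ** 2 / width)
    return phi / phi.sum(axis=1, keepdims=True)

def fit_promp(demos, n_basis=10, ridge=1e-6):
    """Fit a Gaussian over basis weights from demonstrations of shape (N, T)."""
    n_steps = demos.shape[1]
    Phi = rbf_features(np.linspace(0.0, 1.0, n_steps), n_basis)
    # Ridge regression: one weight vector per demonstration.
    A = Phi.T @ Phi + ridge * np.eye(n_basis)
    W = np.linalg.solve(A, Phi.T @ demos.T).T            # shape (N, n_basis)
    mu = W.mean(axis=0)
    Sigma = np.cov(W, rowvar=False) + 1e-8 * np.eye(n_basis)  # small jitter (assumption)
    return mu, Sigma

def condition(mu, Sigma, z_star, y_star, obs_var=1e-6):
    """Condition the weight distribution on passing through y_star at phase z_star."""
    phi = rbf_features(z_star, len(mu))[0]
    s = obs_var + phi @ Sigma @ phi                      # scalar innovation variance
    K = Sigma @ phi / s                                  # gain vector
    mu_new = mu + K * (y_star - phi @ mu)
    Sigma_new = Sigma - np.outer(K, phi @ Sigma)
    return mu_new, Sigma_new

# Synthetic stand-in for MPC demonstrations: minimum-jerk profiles with varied goals.
rng = np.random.default_rng(0)
t = np.linspace(0.0, 1.0, 100)
goals = rng.uniform(0.8, 1.2, size=20)
demos = np.array([g * (10 * t**3 - 15 * t**4 + 6 * t**5) for g in goals])

mu, Sigma = fit_promp(demos)
# Condition on a goal outside the demonstrated range (a previously unseen situation).
mu_c, Sigma_c = condition(mu, Sigma, z_star=1.0, y_star=1.5)
mean_traj = rbf_features(t) @ mu_c                       # conditioned mean trajectory
```

Conditioning is a closed-form Gaussian update, so producing a trajectory for a new target needs no optimization at run time, which is consistent with the abstract's point about avoiding expensive computation after deployment.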

Paper Title