Learning Stable Manoeuvres in Quadruped Robots from Expert Demonstrations

Paper Title

Learning Stable Manoeuvres in Quadruped Robots from Expert Demonstrations

Authors

Sashank Tirumala, Sagar Gubbi, Kartik Paigwar, Aditya Sagi, Ashish Joglekar, Shalabh Bhatnagar, Ashitava Ghosal, Bharadwaj Amrutur, Shishir Kolathaya

Abstract

With research into quadruped robots picking up pace, learning-based techniques are being explored for developing locomotion controllers for such robots. A key problem is to generate leg trajectories, in a stable manner, for continuously varying target linear and angular velocities. In this paper, we propose a two-pronged approach to address this problem. First, multiple simpler policies are trained to generate trajectories for a discrete set of target velocities and turning radii. These policies are then augmented with a higher-level neural network that handles the transitions between the learned trajectories. Specifically, we develop a neural network-based filter that takes in the target velocity and turning radius and transforms them into new commands that enable a smooth transition to the new trajectory. This transformation is learned from expert demonstrations. One application is transforming a novice user's input into an expert user's input, thereby ensuring stable manoeuvres regardless of the user's experience. Training our proposed architecture requires far fewer expert demonstrations than standard neural network architectures. Finally, we demonstrate these results experimentally on the in-house quadruped Stoch 2.
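
As a rough illustration of the command-filter idea in the abstract, the sketch below fits a filter that maps a raw target (velocity, turning radius) and the previous command to a smoothed new command. It is not the paper's method: a linear model fitted by least squares stands in for the neural network, and the "expert" is a synthetic rule that moves a fixed fraction toward the target each step. The function names, the smoothing fraction `alpha`, and the normalised command ranges are all assumptions made for this example.

```python
import numpy as np


def make_expert_demos(n_steps=200, alpha=0.2, seed=0):
    """Generate synthetic demonstrations (hypothetical stand-in for expert data).

    The simulated expert moves a fraction `alpha` of the way from the
    previous command toward the raw target at every step.
    """
    rng = np.random.default_rng(seed)
    # Raw (velocity, radius) targets in normalised units, jumping every 20 steps.
    targets = np.repeat(rng.uniform(-1.0, 1.0, size=(n_steps // 20, 2)), 20, axis=0)
    commands = np.zeros_like(targets)
    prev = np.zeros(2)
    for t in range(n_steps):
        prev = prev + alpha * (targets[t] - prev)
        commands[t] = prev
    return targets, commands


def fit_filter(targets, commands):
    """Least-squares fit of command[t] from [target[t], command[t-1]]."""
    prev_cmds = np.vstack([np.zeros(2), commands[:-1]])
    features = np.hstack([targets, prev_cmds])  # shape (n_steps, 4)
    weights, *_ = np.linalg.lstsq(features, commands, rcond=None)
    return weights  # shape (4, 2)


def apply_filter(weights, target, prev_cmd):
    """Turn a raw target into a filtered command, given the previous command."""
    return np.concatenate([target, prev_cmd]) @ weights


if __name__ == "__main__":
    targets, commands = make_expert_demos()
    weights = fit_filter(targets, commands)
    cmd = np.zeros(2)
    # A novice-style abrupt step in the target is turned into a gradual ramp.
    for _ in range(5):
        cmd = apply_filter(weights, np.array([1.0, 0.5]), cmd)
        print(cmd)
```

In a two-level setup like the one the abstract describes, the filtered command would then be passed to whichever low-level policy was trained for the nearest discrete velocity and radius; that selection step is left out here.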
