Sim-to-Real Learning of Footstep-Constrained Bipedal Dynamic Walking

Paper Title

Sim-to-Real Learning of Footstep-Constrained Bipedal Dynamic Walking

Authors

Helei Duan, Ashish Malik, Jeremy Dao, Aseem Saxena, Kevin Green, Jonah Siekmann, Alan Fern, Jonathan Hurst

Abstract

Recently, work on reinforcement learning (RL) for bipedal robots has successfully learned controllers for a variety of dynamic gaits with robust sim-to-real demonstrations. In order to maintain balance, the learned controllers have full freedom over where to place the feet, resulting in highly robust gaits. In the real world, however, the environment will often impose constraints on the feasible footstep locations, typically identified by perception systems. Unfortunately, most RL controllers demonstrated on bipedal robots do not allow for specifying and responding to such constraints. This missing control interface greatly limits the real-world application of current RL controllers. In this paper, we aim to maintain the robust and dynamic nature of learned gaits while also respecting footstep constraints imposed externally. We develop an RL formulation for training dynamic gait controllers that can respond to specified touchdown locations. We then successfully demonstrate simulation and sim-to-real performance on the bipedal robot Cassie. In addition, we use supervised learning to induce a transition model for accurately predicting the next touchdown locations that the controller can achieve given the robot's proprioceptive observations. This model paves the way for integrating the learned controller into a full-order robot locomotion planner that robustly satisfies both balance and environmental constraints.

Paper Title