Paper Title

Hybrid LMC: Hybrid Learning and Model-based Control for Wheeled Humanoid Robot via Ensemble Deep Reinforcement Learning

Authors

Donghoon Baek, Amartya Purushottam, Joao Ramos

Abstract

Control of wheeled humanoid locomotion is a challenging problem due to the nonlinear dynamics and under-actuated characteristics of these robots. Traditionally, feedback controllers have been used for stabilization and locomotion. However, these methods are often limited by the fidelity of the underlying model, the choice of controller, and the environmental variables considered (surface type, ground inclination, etc.). Recent advances in reinforcement learning (RL) offer promising methods to tackle some of these conventional feedback controller issues, but they require large amounts of interaction data to learn. Here, we propose a hybrid learning and model-based controller, Hybrid LMC, that combines the strengths of a classical linear quadratic regulator (LQR) and ensemble deep reinforcement learning. The ensemble is composed of multiple Soft Actor-Critic (SAC) agents and is used to reduce the variance of the RL networks. By using a feedback controller in tandem, the network exhibits stable performance in the early stages of training. As a preliminary step, we explore the viability of Hybrid LMC in controlling wheeled locomotion of a humanoid robot over a set of different physical parameters in the MuJoCo simulator. Our results show that Hybrid LMC achieves better performance than other existing techniques and has increased sample efficiency.
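The abstract does not say how the LQR output and the ensemble output are combined, so the following is only a rough sketch under stated assumptions: the ensemble members' actions are averaged and added to the LQR feedback term, then clipped. The plant is a toy linearized inverted pendulum rather than the robot model from the paper, and the names `dlqr_gain`, `hybrid_action`, and `policies` are hypothetical. The "policies" are small random linear maps standing in for trained SAC actors.

```python
import numpy as np

def dlqr_gain(A, B, Q, R, iters=2000):
    """Discrete-time LQR gain via fixed-point iteration of the Riccati equation."""
    P = Q.copy()
    for _ in range(iters):
        K = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)
        P = Q + A.T @ P @ (A - B @ K)
    return K

def hybrid_action(x, K, policies, u_max=10.0):
    """LQR feedback plus the mean ensemble action (assumed combination rule)."""
    u_lqr = -K @ x
    # Averaging over members is one simple way to reduce variance.
    u_rl = np.mean([pi(x) for pi in policies], axis=0)
    return np.clip(u_lqr + u_rl, -u_max, u_max)

# Toy linearized inverted pendulum (illustrative only, not the paper's model).
dt = 0.01
A = np.array([[1.0, dt], [9.81 * dt, 1.0]])
B = np.array([[0.0], [dt]])
Q = np.diag([10.0, 1.0])
R = np.array([[0.1]])
K = dlqr_gain(A, B, Q, R)

# Stand-ins for trained SAC actors: five small random linear maps (hypothetical).
rng = np.random.default_rng(0)
policies = [
    (lambda x, W=rng.normal(scale=0.05, size=(1, 2)): W @ x)
    for _ in range(5)
]

# Roll out the closed loop from a small initial tilt.
x = np.array([0.2, 0.0])
for _ in range(1000):
    x = A @ x + B @ hybrid_action(x, K, policies)
```

In this sketch the LQR term alone already stabilizes the toy plant, which mirrors the abstract's point that the feedback controller keeps behavior stable early in training, before the learned components contribute anything useful.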
