Regularizing Action Policies for Smooth Control with Reinforcement Learning

Paper title

Regularizing Action Policies for Smooth Control with Reinforcement Learning

Authors

Siddharth Mysore, Bassel Mabsout, Renato Mancuso, Kate Saenko

Abstract

A critical problem with the practical utility of controllers trained with deep Reinforcement Learning (RL) is the notable lack of smoothness in the actions learned by the RL policies. This trend often presents itself in the form of control signal oscillation and can result in poor control, high power consumption, and undue system wear. We introduce Conditioning for Action Policy Smoothness (CAPS), an effective yet intuitive regularization on action policies, which offers consistent improvement in the smoothness of the learned state-to-action mappings of neural network controllers, reflected in the elimination of high-frequency components in the control signal. Tested on a real system, improvements in controller smoothness on a quadrotor drone resulted in an almost 80% reduction in power consumption while consistently training flight-worthy controllers. Project website: http://ai.bu.edu/caps
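The abstract does not spell out the regularization terms, so the following is only a minimal sketch of what a smoothness penalty on a state-to-action mapping could look like. It assumes two terms: a temporal one that penalizes the distance between actions at consecutive states, and a spatial one that penalizes the distance between actions at a state and at a slightly perturbed copy of it. The function name `caps_style_penalties`, the Euclidean distance, the noise scale `sigma`, and the toy `linear_policy` are all illustrative assumptions, not the authors' implementation.

```python
import numpy as np


def caps_style_penalties(policy, states, next_states, sigma=0.05, rng=None):
    """Return (temporal, spatial) smoothness penalties for a batch of states.

    policy: callable mapping an (N, state_dim) array to an (N, action_dim) array.
    sigma:  assumed standard deviation of the Gaussian state perturbation.
    """
    rng = np.random.default_rng(0) if rng is None else rng
    actions = policy(states)

    # Temporal term: mean distance between actions at consecutive states.
    temporal = np.mean(np.linalg.norm(actions - policy(next_states), axis=1))

    # Spatial term: mean distance between actions at a state and at a
    # nearby state drawn from a Gaussian centered on it.
    perturbed = states + rng.normal(0.0, sigma, size=states.shape)
    spatial = np.mean(np.linalg.norm(actions - policy(perturbed), axis=1))

    return temporal, spatial


# Hypothetical toy policy: a fixed linear map followed by tanh.
W = np.array([[1.0, -0.5], [0.25, 2.0]])


def linear_policy(s):
    return np.tanh(s @ W)


states = np.zeros((4, 2))
next_states = np.full((4, 2), 0.1)
temporal, spatial = caps_style_penalties(linear_policy, states, next_states)
```

In a training loop, such penalties would typically be added to the policy loss with two weighting coefficients (for example `loss = policy_loss + lam_t * temporal + lam_s * spatial`, where `lam_t` and `lam_s` are hypothetical hyperparameters); a policy whose output does not change with the state incurs zero penalty from both terms.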
