Paper Title
Fast Population-Based Reinforcement Learning on a Single Machine
Paper Authors
Paper Abstract
Training populations of agents has demonstrated great promise in Reinforcement Learning for stabilizing training, improving exploration and asymptotic performance, and generating a diverse set of solutions. However, population-based training is often not considered by practitioners as it is perceived to be either prohibitively slow (when implemented sequentially), or computationally expensive (if agents are trained in parallel on independent accelerators). In this work, we compare implementations and revisit previous studies to show that the judicious use of compilation and vectorization allows population-based training to be performed on a single machine with one accelerator with minimal overhead compared to training a single agent. We also show that, when provided with a few accelerators, our protocols extend to large population sizes for applications such as hyperparameter tuning. We hope that this work and the public release of our code will encourage practitioners to use population-based learning more frequently for their research and applications.
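The central idea of the abstract, combining compilation and vectorization so that a whole population trains on one accelerator, can be illustrated with a minimal JAX sketch. This is not the authors' released code: it assumes a toy linear-regression loss in place of an RL objective, and every name in it (`init_params`, `update_one`, `update_population`, `POPULATION_SIZE`, and so on) is hypothetical. It only shows the general pattern of writing the update for one agent, then using `jax.vmap` to batch it over a population axis and `jax.jit` to compile the batched update into a single program.

```python
import jax
import jax.numpy as jnp

POPULATION_SIZE = 8   # hypothetical number of agents in the population
INPUT_DIM = 4         # hypothetical input dimension of the toy model
LEARNING_RATE = 0.1   # hypothetical step size shared by all agents


def init_params(key):
    """Initialize the parameters of ONE agent (a toy linear model)."""
    return {"w": jax.random.normal(key, (INPUT_DIM,)), "b": jnp.zeros(())}


def loss_fn(params, x, y):
    """Mean squared error of a single agent on a batch (stand-in for an RL loss)."""
    pred = x @ params["w"] + params["b"]
    return jnp.mean((pred - y) ** 2)


def update_one(params, x, y):
    """One gradient-descent step for a single agent."""
    loss, grads = jax.value_and_grad(loss_fn)(params, x, y)
    new_params = jax.tree_util.tree_map(
        lambda p, g: p - LEARNING_RATE * g, params, grads
    )
    return new_params, loss


# Vectorize over the leading (population) axis of the parameters while
# sharing the same data batch, then compile the batched update once.
update_population = jax.jit(jax.vmap(update_one, in_axes=(0, None, None)))

key = jax.random.PRNGKey(0)
param_key, data_key = jax.random.split(key)

# Stack the parameters of all agents along a new leading axis.
population = jax.vmap(init_params)(jax.random.split(param_key, POPULATION_SIZE))

# Toy regression data shared by every agent.
x = jax.random.normal(data_key, (32, INPUT_DIM))
y = x @ jnp.ones(INPUT_DIM)

# The first call triggers compilation; later calls reuse the compiled program.
population, first_losses = update_population(population, x, y)
for _ in range(50):
    population, losses = update_population(population, x, y)
```

In this sketch all agents advance inside one compiled call, so the Python loop runs once per training step rather than once per agent. A real population-based method would also need per-agent hyperparameters or data, which could be passed as additional arguments batched along the same axis.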