Paper Title

Meta-Learning through Hebbian Plasticity in Random Networks

Authors

Elias Najarro, Sebastian Risi

Abstract

Lifelong learning and adaptability are two defining aspects of biological agents. Modern reinforcement learning (RL) approaches have shown significant progress in solving complex tasks; however, once training is concluded, the found solutions are typically static and incapable of adapting to new information or perturbations. While it is still not completely understood how biological brains learn and adapt so efficiently from experience, it is believed that synaptic plasticity plays a prominent role in this process. Inspired by this biological mechanism, we propose a search method that, instead of optimizing the weight parameters of neural networks directly, only searches for synapse-specific Hebbian learning rules that allow the network to continuously self-organize its weights during the lifetime of the agent. We demonstrate our approach on several reinforcement learning tasks with different sensory modalities and more than 450K trainable plasticity parameters. We find that, starting from completely random weights, the discovered Hebbian rules enable an agent to navigate a dynamical 2D-pixel environment; likewise, they allow a simulated 3D quadrupedal robot to learn how to walk while adapting to morphological damage not seen during training, in the absence of any explicit reward or error signal, in less than 100 timesteps. Code is available at https://github.com/enajx/HebbianMetaLearning.
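
The abstract does not state the functional form of the synapse-specific learning rules, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes a generalized Hebbian update with one hypothetical coefficient tuple (eta, a, b, c, d) per synapse, a single fully connected layer, and a tanh activation; all function and variable names are illustrative. Weights start random and change only through the local rule during the agent's lifetime, with no reward or error signal involved.

```python
import math
import random


def init_random_weights(n_in, n_out, rng):
    """Random initial weights; nothing here is optimized directly."""
    return [[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_out)]


def forward(weights, pre):
    """Single layer with tanh activation (an assumption for this sketch)."""
    return [math.tanh(sum(w * x for w, x in zip(row, pre))) for row in weights]


def hebbian_update(weights, pre, post, rules):
    """Apply one local update per synapse, in place.

    rules[j][i] is a hypothetical tuple (eta, a, b, c, d) for the synapse
    from input i to output j. Assumed rule form:
        dw = eta * (a * pre_i * post_j + b * pre_i + c * post_j + d)
    """
    for j, row in enumerate(weights):
        for i in range(len(row)):
            eta, a, b, c, d = rules[j][i]
            row[i] += eta * (a * pre[i] * post[j] + b * pre[i] + c * post[j] + d)
    return weights


def run_lifetime(weights, rules, observations):
    """Alternate forward passes and Hebbian updates over an episode."""
    outputs = []
    for obs in observations:
        post = forward(weights, obs)
        hebbian_update(weights, obs, post, rules)
        outputs.append(post)
    return outputs


if __name__ == "__main__":
    rng = random.Random(0)
    n_in, n_out = 3, 2
    weights = init_random_weights(n_in, n_out, rng)
    # In the setting the abstract describes, these coefficients would be the
    # quantities found by the outer search; here they are random placeholders.
    rules = [[tuple(rng.uniform(-1, 1) for _ in range(5)) for _ in range(n_in)]
             for _ in range(n_out)]
    observations = [[rng.uniform(-1, 1) for _ in range(n_in)] for _ in range(10)]
    run_lifetime(weights, rules, observations)
    print(weights)
```

In this sketch the outer search would evaluate a candidate set of rule coefficients by running `run_lifetime` from fresh random weights and scoring the resulting behavior; the search procedure itself is left out because the abstract does not describe it.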
