Paper Title

The Effectiveness of Memory Replay in Large Scale Continual Learning

Authors

Yogesh Balaji, Mehrdad Farajtabar, Dong Yin, Alex Mott, Ang Li

Abstract

We study continual learning in the large-scale setting, where tasks in the input sequence are not limited to classification and the outputs can be of high dimension. Among multiple state-of-the-art methods, we find vanilla experience replay (ER) to remain very competitive in terms of both performance and scalability, despite its simplicity. However, degraded performance is observed for ER with small memory. A further visualization of the feature space reveals that the intermediate representations undergo a distributional drift. While existing methods usually replay only the input-output pairs, we hypothesize that their regularization effect is inadequate for complex deep models and diverse tasks when the replay buffer is small. Following this observation, we propose to replay the activations of intermediate layers in addition to the input-output pairs. Considering that saving raw activation maps can dramatically increase memory and compute costs, we propose the Compressed Activation Replay technique, in which compressed representations of layer activations are saved to the replay buffer. We show that this approach achieves a superior regularization effect while adding negligible memory overhead to the replay method. Experiments on both the large-scale Taskonomy benchmark, with its diverse set of tasks, and on standard datasets (Split-CIFAR and Split-miniImageNet) demonstrate the effectiveness of the proposed method.
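
The abstract describes Compressed Activation Replay only at a high level, so the snippet below is a minimal PyTorch-style sketch of the general idea, not the paper's implementation: alongside input-output pairs, a compressed summary of an intermediate activation is stored in the replay buffer, and an extra regularization term keeps the current activations of replayed examples close to their stored summaries. The compression operator (global average pooling), the MSE penalty, the reservoir buffer, and the `model.backbone` / `model.head` split are all illustrative assumptions.

```python
# Illustrative sketch of compressed activation replay (not the paper's code).
# Assumptions: a model exposing .backbone(x) -> (B, C, H, W) features and
# .head(features) -> predictions; global average pooling as the compression;
# an MSE penalty between current and stored compressed activations.
import random
import torch
import torch.nn.functional as F


def compress(activation: torch.Tensor) -> torch.Tensor:
    """Reduce a (B, C, H, W) activation map to a (B, C) vector by global
    average pooling; a stand-in for the paper's compression step."""
    return activation.mean(dim=(2, 3))


class ReplayBuffer:
    """Reservoir-sampled buffer of (input, target, compressed activation)."""

    def __init__(self, capacity: int):
        self.capacity = capacity
        self.data = []
        self.seen = 0

    def add(self, x, y, z):
        self.seen += 1
        item = (x.detach().cpu(), y.detach().cpu(), z.detach().cpu())
        if len(self.data) < self.capacity:
            self.data.append(item)
        else:
            idx = random.randrange(self.seen)
            if idx < self.capacity:
                self.data[idx] = item

    def sample(self, batch_size, device):
        batch = random.sample(self.data, min(batch_size, len(self.data)))
        xs, ys, zs = zip(*batch)
        return (torch.stack(xs).to(device),
                torch.stack(ys).to(device),
                torch.stack(zs).to(device))


def training_step(model, buffer, x, y, criterion, optimizer,
                  reg_weight=1.0, replay_batch=32):
    """One step: task loss on the current batch, plus ER loss on replayed
    pairs and a penalty tying replayed activations to their stored,
    compressed versions."""
    optimizer.zero_grad()
    feats = model.backbone(x)                  # intermediate activation map
    loss = criterion(model.head(feats), y)     # current-task loss

    if len(buffer.data) > 0:
        rx, ry, rz = buffer.sample(replay_batch, x.device)
        rfeats = model.backbone(rx)
        loss = loss + criterion(model.head(rfeats), ry)                 # ER term
        loss = loss + reg_weight * F.mse_loss(compress(rfeats), rz)     # activation term

    loss.backward()
    optimizer.step()

    # Store one example per step (illustrative storage policy).
    buffer.add(x[0], y[0], compress(feats[:1]).detach()[0])
    return loss.item()
```

Storing only the pooled (C,)-dimensional summary rather than the full (C, H, W) activation map is what keeps the memory overhead of this extra regularizer small relative to vanilla ER, which matches the abstract's claim of negligible additional cost.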
