Paper Title

Probing Representation Forgetting in Supervised and Unsupervised Continual Learning

Paper Authors

MohammadReza Davari, Nader Asadi, Sudhir Mudur, Rahaf Aljundi, Eugene Belilovsky

Paper Abstract

Continual Learning research typically focuses on tackling the phenomenon of catastrophic forgetting in neural networks. Catastrophic forgetting is associated with an abrupt loss of knowledge previously learned by a model when the task, or more broadly the data distribution, being trained on changes. In supervised learning problems this forgetting, resulting from a change in the model's representation, is typically measured or observed by evaluating the decrease in old task performance. However, a model's representation can change without losing knowledge about prior tasks. In this work we consider the concept of representation forgetting, observed by using the difference in performance of an optimal linear classifier before and after a new task is introduced. Using this tool we revisit a number of standard continual learning benchmarks and observe that, through this lens, model representations trained without any explicit control for forgetting often experience small representation forgetting and can sometimes be comparable to methods which explicitly control for forgetting, especially in longer task sequences. We also show that representation forgetting can lead to new insights on the effect of model capacity and loss function used in continual learning. Based on our results, we show that a simple yet competitive approach is to learn representations continually with standard supervised contrastive learning while constructing prototypes of class samples when queried on old samples.
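The measurement tool the abstract describes is a linear probe: freeze the encoder, fit the best linear classifier on an old task's features, and compare its accuracy before and after the encoder is trained on a new task. Below is a minimal sketch of that protocol, not the paper's released code; `encoder_after_task1`, `encoder_after_task2`, the data loaders, and the SGD schedule used to approximate the "optimal" probe are all illustrative assumptions.

```python
import torch
import torch.nn as nn
from torch.utils.data import DataLoader

@torch.no_grad()
def extract_features(encoder: nn.Module, loader: DataLoader, device="cpu"):
    """Run the frozen encoder over a dataset and collect (features, labels)."""
    encoder.eval()
    feats, labels = [], []
    for x, y in loader:
        feats.append(encoder(x.to(device)).flatten(1).cpu())
        labels.append(y)
    return torch.cat(feats), torch.cat(labels)

def probe_accuracy(encoder, train_loader, test_loader, num_classes,
                   epochs=100, lr=0.1, device="cpu"):
    """Fit a linear classifier on frozen features (an SGD approximation of
    the optimal linear probe) and report its test accuracy."""
    tr_x, tr_y = extract_features(encoder, train_loader, device)
    te_x, te_y = extract_features(encoder, test_loader, device)
    probe = nn.Linear(tr_x.size(1), num_classes)
    opt = torch.optim.SGD(probe.parameters(), lr=lr, momentum=0.9)
    for _ in range(epochs):  # full-batch updates, for brevity
        opt.zero_grad()
        nn.functional.cross_entropy(probe(tr_x), tr_y).backward()
        opt.step()
    with torch.no_grad():
        return (probe(te_x).argmax(1) == te_y).float().mean().item()

# Representation forgetting on task 1 = probe accuracy using the encoder
# right after task 1, minus probe accuracy after it also trained on task 2.
# (encoder_after_task1 / encoder_after_task2 are hypothetical checkpoints.)
# acc_before = probe_accuracy(encoder_after_task1, t1_train, t1_test, C)
# acc_after  = probe_accuracy(encoder_after_task2, t1_train, t1_test, C)
# rep_forgetting = acc_before - acc_after
```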
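The closing sentence also implies a concrete inference mechanism: keep training the encoder with standard supervised contrastive learning, and when old classes are queried, classify by nearest class-mean prototypes in feature space. A hedged sketch of the prototype step under the same assumptions (the feature tensors come from a placeholder encoder; the SupCon training loop itself is omitted):

```python
import torch

@torch.no_grad()
def build_prototypes(features: torch.Tensor, labels: torch.Tensor):
    """Mean (L2-normalized) feature vector per class, keyed by class id."""
    protos = {}
    for c in labels.unique().tolist():
        mean = features[labels == c].mean(0)
        protos[c] = mean / mean.norm()
    return protos

@torch.no_grad()
def prototype_predict(features: torch.Tensor, protos: dict):
    """Assign each sample to the class whose prototype is most similar
    (cosine similarity, since prototypes are unit-normalized)."""
    classes = sorted(protos)
    proto_mat = torch.stack([protos[c] for c in classes])  # (C, D)
    feats = torch.nn.functional.normalize(features, dim=1)  # (N, D)
    sims = feats @ proto_mat.T                               # (N, C)
    return torch.tensor(classes)[sims.argmax(1)]
```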
