Paper Title
DLCFT: Deep Linear Continual Fine-Tuning for General Incremental Learning
Paper Authors
Paper Abstract
A pre-trained representation is one of the key elements in the success of modern deep learning. However, existing work on continual learning has mostly focused on learning models incrementally from scratch. In this paper, we explore an alternative framework for incremental learning in which we continually fine-tune the model from a pre-trained representation. Our method takes advantage of a linearization technique for pre-trained neural networks to achieve simple and effective continual learning. We show that this allows us to design a linear model in which the quadratic parameter regularization method is the optimal continual learning policy, while still enjoying the high performance of neural networks. We also show that the proposed algorithm enables parameter regularization methods to be applied to class-incremental problems. Additionally, we provide a theoretical reason why existing parameter-space regularization algorithms such as EWC underperform on neural networks trained with cross-entropy loss. We show that the proposed method prevents forgetting while achieving high continual fine-tuning performance on image classification tasks. To demonstrate that our method applies to general continual learning settings, we evaluate it on data-incremental, task-incremental, and class-incremental learning problems.
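For concreteness, the following is a minimal sketch of the construction the abstract describes, assuming the standard first-order Taylor linearization around the pre-trained weights and an EWC-style quadratic penalty; the notation ($\theta_0$, $\theta_{t-1}$, $H_{t-1}$, $\lambda$) is illustrative and not taken from the paper.

$$
f_{\mathrm{lin}}(x;\theta) \;=\; f(x;\theta_0) \;+\; \nabla_\theta f(x;\theta_0)^{\top}\,(\theta-\theta_0)
$$

$$
\mathcal{L}_t(\theta) \;=\; \sum_{(x,y)\in\mathcal{D}_t} \ell\big(f_{\mathrm{lin}}(x;\theta),\,y\big)
\;+\; \frac{\lambda}{2}\,(\theta-\theta_{t-1})^{\top} H_{t-1}\,(\theta-\theta_{t-1})
$$

Here $\theta_0$ denotes the pre-trained weights, $\theta_{t-1}$ the parameters after the previous task, $\mathcal{D}_t$ the data of task $t$, and $H_{t-1}$ a curvature estimate (e.g., a Fisher-information matrix as in EWC). Because $f_{\mathrm{lin}}$ is linear in $\theta$, a quadratic penalty of this form can exactly summarize the previous tasks' objectives, which is the sense in which the abstract calls quadratic parameter regularization the optimal continual learning policy for the linearized model.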