Paper Title

Class-Incremental Continual Learning into the eXtended DER-verse

Paper Authors

Matteo Boschini, Lorenzo Bonicelli, Pietro Buzzega, Angelo Porrello, Simone Calderara

Paper Abstract

The staple of human intelligence is the capability of acquiring knowledge in a continuous fashion. In stark contrast, Deep Networks forget catastrophically and, for this reason, the sub-field of Class-Incremental Continual Learning fosters methods that learn a sequence of tasks incrementally, blending sequentially-gained knowledge into a comprehensive prediction. This work aims at assessing and overcoming the pitfalls of our previous proposal, Dark Experience Replay (DER), a simple and effective approach that combines rehearsal and Knowledge Distillation. Inspired by the way our minds constantly rewrite past recollections and set expectations for the future, we endow our model with the abilities to i) revise its replay memory to welcome novel information regarding past data, and ii) pave the way for learning yet unseen classes. We show that the application of these strategies leads to remarkable improvements; indeed, the resulting method, termed eXtended-DER (X-DER), outperforms the state of the art on both standard benchmarks (such as CIFAR-100 and miniImageNet) and a novel one introduced here. To gain a better understanding, we further provide extensive ablation studies that corroborate and extend the findings of our previous research (e.g., the value of Knowledge Distillation and flatter minima in continual learning setups).
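For readers unfamiliar with the base method that X-DER extends, the following is a minimal sketch of the DER++ training objective the abstract refers to, combining rehearsal with Knowledge Distillation on stored logits. The `Buffer` interface (its `sample` method and `len` support) and the hyperparameter values are illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn.functional as F

def der_plus_plus_loss(model, x, y, buffer, alpha=0.3, beta=0.5):
    """Sketch of the DER++ objective that X-DER builds upon.

    The buffer is assumed to store past inputs together with their
    ground-truth labels and the logits the network produced when each
    example was inserted ("dark knowledge"). Names and hyperparameter
    values (alpha, beta) are illustrative, not the official code.
    """
    # Plain cross-entropy on the current task's mini-batch.
    loss = F.cross_entropy(model(x), y)

    if len(buffer) > 0:
        # Knowledge Distillation via replay: match the logits recorded
        # in the past, regularizing the network toward its own previous
        # responses on buffered examples.
        bx, _, bz = buffer.sample(x.size(0))
        loss = loss + alpha * F.mse_loss(model(bx), bz)

        # The DER++ term: standard rehearsal on buffered ground-truth labels.
        bx2, by2, _ = buffer.sample(x.size(0))
        loss = loss + beta * F.cross_entropy(model(bx2), by2)

    return loss
```

Per the abstract, X-DER then extends this scheme in two directions: it edits the stored logits when later tasks reveal new information about buffered examples (memory revision), and it trains the output heads reserved for yet unseen classes ahead of time (future preparation).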
