Paper Title
Diagnosing Batch Normalization in Class Incremental Learning
Paper Authors
Paper Abstract
Extensive research has applied deep neural networks (DNNs) to class incremental learning (Class-IL). As a building block of DNNs, batch normalization (BN) standardizes intermediate feature maps and has been widely validated to improve training stability and convergence. However, we claim that the direct use of standard BN in Class-IL models is harmful to both representation learning and classifier training, thus exacerbating catastrophic forgetting. In this paper, we investigate the influence of BN on Class-IL models by illustrating this BN dilemma. We further propose BN Tricks to address the issue by training a better feature extractor while eliminating classification bias. Without introducing extra hyperparameters, we apply BN Tricks to three rehearsal-based baseline methods, namely ER, DER++, and iCaRL. Through comprehensive experiments on the benchmark datasets Seq-CIFAR-10, Seq-CIFAR-100, and Seq-Tiny-ImageNet, we show that BN Tricks bring significant performance gains to all adopted baselines, revealing their potential generality along this line of research.
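To make the claimed BN dilemma concrete, below is a minimal sketch (assuming PyTorch; the batch sizes, feature statistics, and class split are illustrative assumptions, not the paper's BN Tricks). It shows how, in a rehearsal-based Class-IL setting, a training batch dominated by new-class samples biases the per-batch statistics that standard BN uses for normalization and for its running estimates.

```python
# Illustrative sketch: BN statistics under an imbalanced rehearsal batch.
# Assumptions (not from the paper): 28 new-task samples vs. 4 replayed
# old-task exemplars, drawn from distributions with different means.
import torch
import torch.nn as nn

bn = nn.BatchNorm2d(num_features=8)  # standard BN layer, as used in DNN blocks

new_task = torch.randn(28, 8, 16, 16) + 2.0   # new classes dominate the batch
old_task = torch.randn(4, 8, 16, 16) - 2.0    # few replayed old-class exemplars
batch = torch.cat([new_task, old_task], dim=0)

bn.train()
_ = bn(batch)  # normalizes with batch statistics and updates running estimates

print("batch mean   :", batch.mean().item())     # pulled toward the new-task mean
print("new-task mean:", new_task.mean().item())
print("old-task mean:", old_task.mean().item())
print("BN running mean (first channels):", bn.running_mean[:4])
```

Because the batch mean sits close to the new-task statistics, old-class features are normalized (and, at test time, re-normalized via the running estimates) with mismatched statistics, which is one plausible mechanism by which standard BN can worsen forgetting in Class-IL.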