Paper Title
A Model or 603 Exemplars: Towards Memory-Efficient Class-Incremental Learning
Paper Authors
Paper Abstract
Real-world applications require classification models to adapt to new classes without forgetting old ones. Correspondingly, Class-Incremental Learning (CIL) aims to train a model with a limited memory size to meet this requirement. Typical CIL methods tend to save representative exemplars from former classes to resist forgetting, while recent works find that storing models from history can substantially boost performance. However, the stored models are not counted into the memory budget, which implicitly results in unfair comparisons. We find that when the model size is counted into the total budget and methods are compared with aligned memory sizes, saving models does not consistently help, especially under limited memory budgets. As a result, different CIL methods need to be evaluated holistically at different memory scales, considering both accuracy and memory size for measurement. On the other hand, we dive deeply into the construction of the memory buffer for memory efficiency. By analyzing the effect of different layers in the network, we find that shallow and deep layers have different characteristics in CIL. Motivated by this, we propose a simple yet effective baseline, denoted as MEMO for Memory-efficient Expandable MOdel. MEMO extends only the specialized (deep) layers on top of shared generalized (shallow) representations, efficiently extracting diverse representations at modest cost while maintaining representative exemplars. Extensive experiments on benchmark datasets validate MEMO's competitive performance. Code is available at: https://github.com/wangkiw/ICLR23-MEMO
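The title's "603 exemplars" reflects the memory-aligned comparison the abstract argues for: a stored backbone's parameters occupy memory that could instead hold raw exemplars. A minimal sketch of that accounting, assuming float32 parameters and raw uint8 CIFAR-style 32x32x3 images; the ~0.46M parameter count below is an illustrative figure for a small CIFAR backbone, not a number taken from the paper:

```python
def exemplar_equivalent(num_params, bytes_per_param=4, image_shape=(32, 32, 3)):
    """Number of raw uint8 images that occupy the same memory as a model's parameters."""
    image_bytes = 1
    for dim in image_shape:
        image_bytes *= dim  # one uint8 byte per pixel per channel
    return (num_params * bytes_per_param) // image_bytes

# Illustrative: a small CIFAR backbone with ~0.46M float32 parameters
# costs about as much memory as ~600 raw 32x32x3 exemplars.
print(exemplar_equivalent(463_000))  # → 602
```

Counting model storage in this common currency is what allows methods that save extra backbones to be compared fairly against methods that spend the same budget on additional exemplars.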