Paper Title
Artificial Neural Variability for Deep Learning: On Overfitting, Noise Memorization, and Catastrophic Forgetting
Paper Authors
Paper Abstract
Deep learning is often criticized for two serious issues that rarely exist in natural nervous systems: overfitting and catastrophic forgetting. A deep network can even memorize randomly labelled data, in which the instance-label pairs carry essentially no knowledge. And when a deep network learns continually by accommodating new tasks, it usually quickly overwrites the knowledge learned from previous tasks. It is well known in neuroscience that human brain reactions exhibit substantial variability even in response to the same stimulus, a phenomenon referred to as {\it neural variability}. This mechanism balances accuracy and plasticity/flexibility in the motor learning of natural nervous systems. It thus motivates us to design a similar mechanism, named {\it artificial neural variability} (ANV), which helps artificial neural networks inherit some advantages of ``natural'' neural networks. We rigorously prove that ANV acts as an implicit regularizer of the mutual information between the training data and the learned model. This result theoretically guarantees that ANV strictly improves generalizability, robustness to label noise, and robustness to catastrophic forgetting. We then devise a {\it neural variable risk minimization} (NVRM) framework and {\it neural variable optimizers} to achieve ANV for conventional network architectures in practice. Empirical studies demonstrate that NVRM can effectively relieve overfitting, label noise memorization, and catastrophic forgetting at negligible cost. \footnote{Code: \url{https://github.com/zeke-xie/artificial-neural-variability-for-deep-learning}.}
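To make the NVRM idea concrete, below is a minimal PyTorch sketch of one neural-variable SGD step, assuming ANV is realized by perturbing the weights with zero-mean Gaussian noise before each gradient evaluation and restoring them afterwards. The function name nvrm_sgd_step and its parameters are illustrative, not the authors' API; the official implementation is in the linked repository.

import torch

def nvrm_sgd_step(params, loss_fn, lr=0.1, noise_std=0.01):
    """One hypothetical NVRM-style SGD step (a sketch, not the official
    optimizer): evaluate the gradient at noise-perturbed weights, then
    apply the update to the clean, unperturbed weights. `noise_std`
    sets the scale of the artificial neural variability."""
    params = list(params)
    noises = []
    # Temporarily perturb each weight with zero-mean Gaussian noise.
    with torch.no_grad():
        for p in params:
            eps = noise_std * torch.randn_like(p)
            p.add_(eps)
            noises.append(eps)
    # The gradient is computed at the perturbed point.
    loss = loss_fn()
    loss.backward()
    with torch.no_grad():
        for p, eps in zip(params, noises):
            p.sub_(eps)          # restore the clean weights
            p.sub_(lr * p.grad)  # standard SGD update
            p.grad = None
    return loss.item()

For example, nvrm_sgd_step(model.parameters(), lambda: criterion(model(x), y)) would perform one such step; the noise scale noise_std is the knob that trades precision against plasticity, mirroring the neural variability discussed above.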