Paper Title
An Empirical Study of Memorization in NLP
Paper Authors
Paper Abstract
A recent study by Feldman (2020) proposed a long-tail theory to explain the memorization behavior of deep learning models. However, memorization has not been empirically verified in the context of NLP, a gap addressed by this work. In this paper, we use three different NLP tasks to check if the long-tail theory holds. Our experiments demonstrate that top-ranked memorized training instances are likely atypical, and removing the top-memorized training instances leads to a more serious drop in test accuracy compared with removing training instances randomly. Furthermore, we develop an attribution method to better understand why a training instance is memorized. We empirically show that our memorization attribution method is faithful, and share our interesting finding that the top-memorized parts of a training instance tend to be features negatively correlated with the class label.
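Below is a minimal sketch of how a Feldman-style memorization score can be estimated by retraining on random subsets of the training data, as proposed in Feldman (2020). This is not the authors' released code; `train_model`, `X`, and `y` are hypothetical placeholders for a task-specific training routine and dataset.

```python
# Minimal sketch (assumption, not the paper's implementation) of the
# subsampling estimator for Feldman-style memorization scores:
# mem(i) = P[correct on (x_i, y_i) | i in training subset]
#        - P[correct on (x_i, y_i) | i not in training subset].
import numpy as np

def estimate_memorization(X, y, train_model, n_trials=50, subsample_ratio=0.7, seed=0):
    """Estimate a memorization score for every training example by averaging
    prediction accuracy over models trained on random subsets of the data."""
    rng = np.random.default_rng(seed)
    n = len(y)
    correct_in = np.zeros(n)   # correct predictions when example i was in the subset
    count_in = np.zeros(n)
    correct_out = np.zeros(n)  # correct predictions when example i was held out
    count_out = np.zeros(n)

    for _ in range(n_trials):
        subset = rng.random(n) < subsample_ratio      # random training subset mask
        model = train_model(X[subset], y[subset])     # retrain from scratch (hypothetical API)
        hits = (model.predict(X) == y).astype(float)  # per-example correctness
        correct_in[subset] += hits[subset]
        count_in[subset] += 1
        correct_out[~subset] += hits[~subset]
        count_out[~subset] += 1

    # Guard against examples that were never sampled in (or out of) a subset.
    p_in = np.divide(correct_in, count_in, out=np.zeros(n), where=count_in > 0)
    p_out = np.divide(correct_out, count_out, out=np.zeros(n), where=count_out > 0)
    return p_in - p_out  # higher score = more memorized
```

Ranking training instances by this score would, under the paper's setup, identify the "top-memorized" examples whose removal is compared against removing random examples.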