Paper Title

On Sensitivity of Deep Learning Based Text Classification Algorithms to Practical Input Perturbations

Authors

Miyajiwala, Aamir, Ladkat, Arnav, Jagadale, Samiksha, Joshi, Raviraj

Abstract

Text classification is a fundamental Natural Language Processing task with a wide variety of applications, where deep learning approaches have produced state-of-the-art results. While these models have been heavily criticized for their black-box nature, their robustness to slight perturbations in input text has also been a matter of concern. In this work, we carry out a data-focused study evaluating the impact of systematic practical perturbations on the performance of deep-learning-based text classification models such as CNN, LSTM, and BERT-based algorithms. The perturbations are induced by the addition and removal of unwanted tokens, like punctuation and stop-words, that are minimally associated with the final performance of the model. We show that these deep learning approaches, including BERT, are sensitive to such legitimate input perturbations on four standard benchmark datasets: SST2, TREC-6, BBC News, and tweet_eval. We observe that BERT is more susceptible to the removal of tokens than to the addition of tokens. Moreover, LSTM is slightly more sensitive to input perturbations than CNN-based models. The work also serves as a practical guide to assessing the impact of discrepancies in train-test conditions on the final performance of models.
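The perturbations the abstract describes can be illustrated with a short sketch. The paper does not specify its exact stop-word list or insertion positions, so the list and the prepend-one-token strategy below are assumptions for demonstration only:

```python
import string

# Illustrative stop-word list; the actual list used in the paper is not
# specified here, so this small set is an assumption.
STOP_WORDS = {"a", "an", "the", "is", "are", "to", "of", "and", "in"}

def removal_perturbation(text: str) -> str:
    """Remove 'unwanted' tokens: strip punctuation, then drop stop-words."""
    no_punct = text.translate(str.maketrans("", "", string.punctuation))
    return " ".join(w for w in no_punct.split() if w.lower() not in STOP_WORDS)

def addition_perturbation(text: str, token: str = "the") -> str:
    """Add an unwanted token (here, a stop-word prepended to the text)."""
    return f"{token} {text}"
```

A train-test discrepancy of the kind studied would arise, for example, from training a classifier on raw text but evaluating it on `removal_perturbation(...)` outputs, or vice versa.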
