Paper Title

Attention Meets Perturbations: Robust and Interpretable Attention with Adversarial Training

Paper Authors

Shunsuke Kitada, Hitoshi Iyatomi

Abstract

Although attention mechanisms have been applied to a variety of deep learning models and have been shown to improve prediction performance, they have been reported to be vulnerable to perturbations of the mechanism. To overcome this vulnerability, we draw inspiration from adversarial training (AT), a powerful regularization technique for enhancing model robustness. In this paper, we propose a general training technique for natural language processing tasks, including AT for attention (Attention AT) and more interpretable AT for attention (Attention iAT). The proposed techniques improve prediction performance and model interpretability by exploiting the attention mechanisms with AT. In particular, Attention iAT boosts those advantages by introducing adversarial perturbation, which enhances the difference in the attention of the sentences. Evaluation experiments with ten open datasets revealed that AT for attention mechanisms, especially Attention iAT, demonstrated (1) the best performance in nine out of ten tasks and (2) more interpretable attention (i.e., the resulting attention correlated more strongly with gradient-based word importance) for all tasks. Additionally, the proposed techniques are (3) much less dependent on the perturbation size in AT. Our code is available at https://github.com/shunk031/attention-meets-perturbation
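The core idea described in the abstract, adding an adversarial perturbation to the attention weights and training on the perturbed loss, can be illustrated with a minimal NumPy sketch. This is not the authors' implementation (see the linked repository for that): the toy attention-weighted classifier, the numerical gradients, and all variable names below are illustrative assumptions, standing in for a real model trained with backpropagation.

```python
import numpy as np

def softmax(x):
    # Numerically stable softmax over attention scores.
    e = np.exp(x - x.max())
    return e / e.sum()

def loss(scores, values, w, y):
    # Toy model: attention-weighted average of word vectors -> logistic loss.
    # (Illustrative stand-in for the task model; not the paper's architecture.)
    a = softmax(scores)                  # attention weights over words
    h = a @ values                       # context vector
    p = 1.0 / (1.0 + np.exp(-(h @ w)))  # probability of the positive class
    return float(-(y * np.log(p) + (1 - y) * np.log(1 - p)))

def grad_wrt_scores(scores, values, w, y, h=1e-5):
    # Central-difference gradient of the loss w.r.t. the attention scores
    # (a real implementation would use autodiff instead).
    g = np.zeros_like(scores)
    for i in range(len(scores)):
        d = np.zeros_like(scores)
        d[i] = h
        g[i] = (loss(scores + d, values, w, y)
                - loss(scores - d, values, w, y)) / (2 * h)
    return g

rng = np.random.default_rng(0)
values = rng.normal(size=(4, 3))   # 4 "words", 3-dim embeddings
w = rng.normal(size=3)             # classifier weights
scores = rng.normal(size=4)        # pre-softmax attention scores
y = 1.0

# Attention AT idea: perturb the attention scores in the gradient direction,
# i.e. the direction that locally increases the loss the most, with a fixed norm.
g = grad_wrt_scores(scores, values, w, y)
epsilon = 0.1
r_adv = epsilon * g / (np.linalg.norm(g) + 1e-12)

clean = loss(scores, values, w, y)           # loss on unperturbed attention
adv = loss(scores + r_adv, values, w, y)     # loss on adversarial attention
total = clean + adv                           # combined objective minimized in training
print(np.linalg.norm(r_adv))                  # perturbation norm equals epsilon
```

During training, minimizing `total` regularizes the model so that small worst-case perturbations to the attention distribution do not change its predictions, which is the robustness property the abstract targets.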
