Paper title
AiR: Attention with Reasoning Capability
Paper authors
Paper abstract
While attention has been an increasingly popular component in deep neural networks to both interpret and boost the performance of models, little work has examined how attention progresses to accomplish a task and whether it is reasonable. In this work, we propose an Attention with Reasoning capability (AiR) framework that uses attention to understand and improve the process leading to task outcomes. We first define an evaluation metric based on a sequence of atomic reasoning operations, enabling quantitative measurement of attention that considers the reasoning process. We then collect human eye-tracking and answer correctness data, and analyze various machine and human attentions on their reasoning capability and how they impact task performance. Furthermore, we propose a supervision method to jointly and progressively optimize attention, reasoning, and task performance so that models learn to look at regions of interest by following a reasoning process. We demonstrate the effectiveness of the proposed framework in analyzing and modeling attention with better reasoning capability and task performance. The code and data are available at https://github.com/szzexpoi/AiR.