Paper Title
Conditional Self-Attention for Query-based Summarization
Paper Authors
Paper Abstract
Self-attention mechanisms have achieved great success on a variety of NLP tasks due to their flexibility in capturing dependencies between arbitrary positions in a sequence. For problems such as query-based summarization (Qsumm) and knowledge graph reasoning, where each input sequence is associated with an extra query, explicitly modeling such conditional contextual dependencies can lead to a more accurate solution, which, however, existing self-attention mechanisms cannot capture. In this paper, we propose \textit{conditional self-attention} (CSA), a neural network module designed for conditional dependency modeling. CSA works by adjusting the pairwise attention between input tokens in a self-attention module with the matching scores of the inputs to the given query. The contextual dependencies modeled by CSA are therefore highly relevant to the query. We further study variants of CSA defined by different types of attention. Experiments on the Debatepedia and HotpotQA benchmark datasets show that CSA consistently outperforms the vanilla Transformer and previous models on the Qsumm problem.
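To make the core idea concrete, below is a minimal PyTorch sketch of query-conditioned self-attention as described in the abstract: pairwise attention logits between input tokens are modulated by each token's matching score against an external query. The module name, the dot-product choice of matching score, and the additive conditioning are illustrative assumptions, not the authors' exact formulation (the paper studies several attention variants).

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ConditionalSelfAttention(nn.Module):
    """Sketch of conditional self-attention: standard self-attention whose
    pairwise scores are adjusted by how well each token matches a query."""

    def __init__(self, d_model):
        super().__init__()
        self.q_proj = nn.Linear(d_model, d_model)
        self.k_proj = nn.Linear(d_model, d_model)
        self.v_proj = nn.Linear(d_model, d_model)
        # Projection used to score tokens against the external query
        # (hypothetical choice; the paper defines several matching variants).
        self.match_proj = nn.Linear(d_model, d_model)
        self.scale = d_model ** 0.5

    def forward(self, x, query):
        # x: (batch, seq_len, d_model) token representations of the input sequence
        # query: (batch, d_model) encoded external query (e.g. pooled query tokens)
        q = self.q_proj(x)
        k = self.k_proj(x)
        v = self.v_proj(x)

        # Standard pairwise self-attention logits between all token pairs.
        logits = torch.matmul(q, k.transpose(-1, -2)) / self.scale  # (batch, L, L)

        # Matching score of each token against the external query.
        match = torch.einsum('bld,bd->bl', self.match_proj(x), query) / self.scale  # (batch, L)

        # Condition the pairwise logits on both tokens' query-matching scores,
        # so attention concentrates on token pairs relevant to the query.
        cond_logits = logits + match.unsqueeze(1) + match.unsqueeze(2)

        attn = F.softmax(cond_logits, dim=-1)
        return torch.matmul(attn, v)  # (batch, L, d_model)
```

A quick usage check under the same assumptions:

```python
csa = ConditionalSelfAttention(d_model=64)
x = torch.randn(2, 10, 64)    # batch of two 10-token sequences
query = torch.randn(2, 64)    # one encoded query per sequence
out = csa(x, query)           # (2, 10, 64), query-conditioned token representations
```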