Paper Title

Answer Generation through Unified Memories over Multiple Passages

Paper Authors

Makoto Nakatsuji, Sohei Okui

Paper Abstract

Machine reading comprehension methods that generate answers by referring to multiple passages for a question have gained much attention in AI and NLP communities. The current methods, however, do not investigate the relationships among multiple passages in the answer generation process, even though topics correlated among the passages may be answer candidates. Our method, called neural answer Generation through Unified Memories over Multiple Passages (GUM-MP), solves this problem as follows. First, it determines which tokens in the passages are matched to the question. In particular, it investigates matches between tokens in positive passages, which are assigned to the question, and those in negative passages, which are not related to the question. Next, it determines which tokens in the passage are matched to other passages assigned to the same question and at the same time it investigates the topics in which they are matched. Finally, it encodes the token sequences with the above two matching results into unified memories in the passage encoders and learns the answer sequence by using an encoder-decoder with a multiple-pointer-generator mechanism. As a result, GUM-MP can generate answers by pointing to important tokens present across passages. Evaluations indicate that GUM-MP generates much more accurate results than the current models do.
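The multiple-pointer-generator mechanism described in the abstract mixes a vocabulary (generation) distribution with copy distributions over the tokens of each passage, so the decoder can either generate a word or point to an important token in any passage. Below is a minimal sketch of that mixing step in plain Python; the function name, interface, and the gate/attention values are illustrative assumptions, not the paper's actual implementation.

```python
def multi_pointer_generator(p_vocab, attn_dists, passage_ids, gates):
    """Mix a generation distribution with copy distributions from several passages.

    p_vocab        -- generator's probability distribution over the vocabulary
    attn_dists[k]  -- attention weights over the tokens of passage k
    passage_ids[k] -- vocabulary id of each token of passage k
    gates          -- mixing weights [generate, copy-from-passage-1, ...], summing to 1
    """
    # Start with the generation portion of the final distribution.
    final = [gates[0] * p for p in p_vocab]
    # Add each passage's copy portion: route attention mass to each token's vocab id.
    for gate, attn, ids in zip(gates[1:], attn_dists, passage_ids):
        for a, tok in zip(attn, ids):
            final[tok] += gate * a
    return final

# Toy example: vocabulary of 6 words, two passages.
p_vocab = [1 / 6] * 6                      # uniform generation distribution
attn_dists = [[0.5, 0.5], [1.0]]           # attention over each passage's tokens
passage_ids = [[2, 3], [4]]                # vocab ids of those tokens
gates = [0.5, 0.3, 0.2]                    # generate vs. copy-from-each-passage
dist = multi_pointer_generator(p_vocab, attn_dists, passage_ids, gates)
```

Here tokens 2, 3, and 4 receive extra probability mass because the pointers copy them from the passages, which is how the model favors tokens that appear in (and across) the assigned passages.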
