CoQAR: Question Rewriting on CoQA

Paper title

CoQAR: Question Rewriting on CoQA

Authors

Quentin Brabant, Gwenole Lecorve, Lina M. Rojas-Barahona

Abstract

Questions asked by humans during a conversation often contain contextual dependencies, i.e., explicit or implicit references to previous dialogue turns. These dependencies take the form of coreferences (e.g., via pronoun use) or ellipses, and can make the questions difficult for automated systems to understand. One way to facilitate the understanding and subsequent processing of a question is to rewrite it into an out-of-context form, i.e., a form that can be understood without the conversational context. We propose CoQAR, a corpus containing 4.5K conversations from the Conversational Question-Answering dataset CoQA, for a total of 53K follow-up question-answer pairs. Each original question was manually annotated with at least 2 and at most 3 out-of-context rewritings. CoQAR can be used in the supervised learning of three tasks: question paraphrasing, question rewriting, and conversational question answering. To assess the quality of CoQAR's rewritings, we conduct several experiments that consist of training and evaluating models for these three tasks. Our results support the idea that question rewriting can be used as a preprocessing step for question answering models, thereby increasing their performance.
