CookDial: A dataset for task-oriented dialogs grounded in procedural documents

Paper title

CookDial: A dataset for task-oriented dialogs grounded in procedural documents

Authors

Yiwei Jiang, Klim Zaporojets, Johannes Deleu, Thomas Demeester, Chris Develder

Abstract

This work presents a new dialog dataset, CookDial, that facilitates research on task-oriented dialog systems with procedural knowledge understanding. The corpus contains 260 human-to-human task-oriented dialogs in which an agent, given a recipe document, guides the user to cook a dish. Dialogs in CookDial exhibit two unique features: (i) procedural alignment between the dialog flow and supporting document; (ii) complex agent decision-making that involves segmenting long sentences, paraphrasing hard instructions and resolving coreference in the dialog context. In addition, we identify three challenging (sub)tasks in the assumed task-oriented dialog system: (1) User Question Understanding, (2) Agent Action Frame Prediction, and (3) Agent Response Generation. For each of these tasks, we develop a neural baseline model, which we evaluate on the CookDial dataset. We publicly release the CookDial dataset, comprising rich annotations of both dialogs and recipe documents, to stimulate further research on domain-specific document-grounded dialog systems.
