Paper Title

FlowEval: A Consensus-Based Dialogue Evaluation Framework Using Segment Act Flows

Authors

Jianqiao Zhao, Yanyang Li, Wanyu Du, Yangfeng Ji, Dong Yu, Michael R. Lyu, Liwei Wang

Abstract

Despite recent progress in open-domain dialogue evaluation, how to develop automatic metrics remains an open problem. We explore the potential of dialogue evaluation featuring dialog act information, which was hardly explicitly modeled in previous methods. However, defined at the utterance level in general, dialog act is of coarse granularity, as an utterance can contain multiple segments possessing different functions. Hence, we propose segment act, an extension of dialog act from utterance level to segment level, and crowdsource a large-scale dataset for it. To utilize segment act flows, sequences of segment acts, for evaluation, we develop the first consensus-based dialogue evaluation framework, FlowEval. This framework provides a reference-free approach for dialog evaluation by finding pseudo-references. Extensive experiments against strong baselines on three benchmark datasets demonstrate the effectiveness and other desirable characteristics of our FlowEval, pointing out a potential path for better dialogue evaluation.
