Paper Title
FlowEval: A Consensus-Based Dialogue Evaluation Framework Using Segment Act Flows
Authors
Abstract
Despite recent progress in open-domain dialogue evaluation, developing automatic metrics remains an open problem. We explore the potential of dialogue evaluation that incorporates dialog act information, which previous methods have rarely modeled explicitly. However, dialog acts are generally defined at the utterance level and are therefore coarse-grained, since a single utterance can contain multiple segments with different functions. Hence, we propose the segment act, an extension of the dialog act from the utterance level to the segment level, and crowdsource a large-scale dataset for it. To use segment act flows (sequences of segment acts) for evaluation, we develop FlowEval, the first consensus-based dialogue evaluation framework. The framework provides a reference-free approach to dialogue evaluation by finding pseudo-references. Extensive experiments against strong baselines on three benchmark datasets demonstrate the effectiveness and other desirable characteristics of FlowEval, pointing to a potential path toward better dialogue evaluation.
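To make the idea of consensus-based scoring with pseudo-references more concrete, the following is a minimal sketch and not the authors' implementation. It assumes that a segment act flow is a list of act labels, that flow similarity is a normalized edit-distance score, and that a dialogue is scored by a similarity-weighted average over its k most similar annotated dialogues. The function names, the act labels, the similarity measure, and the aggregation rule are all hypothetical choices made for illustration; the abstract does not specify them.

```python
from typing import List, Sequence, Tuple

# Assumption: a segment act flow is a sequence of act labels (strings).
Flow = Sequence[str]


def edit_distance(a: Flow, b: Flow) -> int:
    """Levenshtein distance between two label sequences."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        curr = [i]
        for j, y in enumerate(b, start=1):
            cost = 0 if x == y else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def flow_similarity(a: Flow, b: Flow) -> float:
    """Hypothetical similarity in [0, 1]: 1 minus normalized edit distance."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0
    return 1.0 - edit_distance(a, b) / longest


def consensus_score(query: Flow,
                    pool: List[Tuple[Flow, float]],
                    k: int = 2) -> float:
    """Score `query` from its k most similar annotated flows (pseudo-references).

    `pool` holds (flow, human_score) pairs; this pairing is an assumption.
    """
    if not pool or k < 1:
        raise ValueError("need a non-empty pool and k >= 1")
    ranked = sorted(pool,
                    key=lambda item: flow_similarity(query, item[0]),
                    reverse=True)
    top = ranked[:k]
    weights = [flow_similarity(query, flow) for flow, _ in top]
    total = sum(weights)
    if total == 0:
        # No overlap at all: fall back to an unweighted mean.
        return sum(score for _, score in top) / len(top)
    return sum(w * s for w, (_, s) in zip(weights, top)) / total


if __name__ == "__main__":
    # Illustrative act labels and scores, invented for this example.
    annotated = [
        (["question", "answer", "thanks"], 4.0),
        (["question", "question", "question"], 1.0),
    ]
    print(consensus_score(["question", "answer", "thanks"], annotated, k=1))
```

The design choice here is that no gold reference response is needed at evaluation time: the annotated pool supplies the pseudo-references, and the retrieved neighbors determine the score.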