Follow the Wisdom of the Crowd: Effective Text Generation via Minimum Bayes Risk Decoding

Paper title

Follow the Wisdom of the Crowd: Effective Text Generation via Minimum Bayes Risk Decoding

Authors

Mirac Suzgun, Luke Melas-Kyriazi, Dan Jurafsky

Abstract

In open-ended natural-language generation, existing text decoding methods typically struggle to produce text which is both diverse and high-quality. Greedy and beam search are known to suffer from text degeneration and linguistic diversity issues, while temperature, top-k, and nucleus sampling often yield diverse but low-quality outputs. In this work, we present crowd sampling, a family of decoding methods based on Bayesian risk minimization, to address this diversity-quality trade-off. Inspired by the principle of "the wisdom of the crowd," crowd sampling seeks to select a candidate from a pool of candidates that has the least expected risk (i.e., highest expected reward) under a generative model according to a given utility function. Crowd sampling can be seen as a generalization of numerous existing methods, including majority voting, and in practice, it can be used as a drop-in replacement for existing sampling methods. Extensive experiments show that crowd sampling delivers improvements of 3-7 ROUGE and BLEU points across a wide range of tasks, including summarization, data-to-text, translation, and textual style transfer, while achieving new state-of-the-art results on WebNLG and WMT'16.
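The selection step described in the abstract can be illustrated with a minimal sketch. It assumes the expected utility of each candidate is estimated by averaging a pairwise utility against the other sampled candidates, with every sample weighted equally. The function names (`mbr_select`, `unigram_f1`, `exact_match`) and the toy unigram-overlap utility are illustrative assumptions, not the authors' implementation or their choice of utility function.

```python
from collections import Counter
from typing import Callable, List


def unigram_f1(hyp: str, ref: str) -> float:
    """Toy utility (an assumption): unigram-overlap F1 on whitespace tokens."""
    hyp_tokens, ref_tokens = hyp.split(), ref.split()
    if not hyp_tokens or not ref_tokens:
        return 0.0
    # Multiset intersection counts each shared token at most min(count) times.
    overlap = sum((Counter(hyp_tokens) & Counter(ref_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(hyp_tokens)
    recall = overlap / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)


def exact_match(hyp: str, ref: str) -> float:
    """Utility under which the selection reduces to majority voting."""
    return 1.0 if hyp == ref else 0.0


def mbr_select(candidates: List[str],
               utility: Callable[[str, str], float]) -> str:
    """Return the candidate with the highest average utility against the
    other candidates (a uniform-weight estimate of expected utility)."""
    if not candidates:
        raise ValueError("candidates must be non-empty")

    def expected_utility(i: int) -> float:
        others = [c for j, c in enumerate(candidates) if j != i]
        if not others:
            return 0.0
        return sum(utility(candidates[i], o) for o in others) / len(others)

    # Ties are broken in favor of the earliest candidate.
    best_index = max(range(len(candidates)), key=expected_utility)
    return candidates[best_index]


if __name__ == "__main__":
    samples = [
        "the cat sat on the mat",
        "the cat sat on the mat today",
        "the cat is on the mat",
        "dogs bark loudly",
    ]
    print(mbr_select(samples, unigram_f1))
    print(mbr_select(["42", "41", "42", "7"], exact_match))
```

In this sketch the candidate closest to the "consensus" of the pool wins, and swapping in `exact_match` shows how majority voting falls out as a special case. In practice the candidates would be samples drawn from a generative model and the utility would be a task-appropriate text-similarity metric.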
