Context-Situated Pun Generation

Paper Title

Context-Situated Pun Generation

Authors

Jiao Sun, Anjali Narayan-Chen, Shereen Oraby, Shuyang Gao, Tagyoung Chung, Jing Huang, Yang Liu, Nanyun Peng

Abstract

Previous work on pun generation commonly begins with a given pun word (a pair of homophones for heterographic pun generation and a polyseme for homographic pun generation) and seeks to generate an appropriate pun. While this may enable efficient pun generation, we believe that a pun is most entertaining if it fits appropriately within a given context, e.g., a given situation or dialogue. In this work, we propose a new task, context-situated pun generation, where a specific context represented by a set of keywords is provided, and the task is to first identify suitable pun words that are appropriate for the context, then generate puns based on the context keywords and the identified pun words. We collect CUP (Context-sitUated Pun), containing 4.5k tuples of context words and pun pairs. Based on the new data and setup, we propose a pipeline system for context-situated pun generation, including a pun word retrieval module that identifies suitable pun words for a given context, and a generation module that generates puns from context keywords and pun words. Human evaluation shows that 69% of our top retrieved pun words can be used to generate context-situated puns, and our generation module yields successful puns 31% of the time given a plausible tuple of context words and pun pair, almost tripling the yield of a state-of-the-art pun generation model. With an end-to-end evaluation, our pipeline system with the top-1 retrieved pun pair for a given context can generate successful puns 40% of the time, better than all other modeling variations but 32% lower than the human success rate. This highlights the difficulty of the task, and encourages more research in this direction.
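
The two-stage pipeline described in the abstract (retrieve a pun pair for the context, then generate a pun from the context keywords and that pair) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the `PunPair` type, the `related` word sets, the keyword-overlap scoring, and the placeholder `generate_pun` function are all hypothetical stand-ins chosen only to show how data flows between the two modules. In the paper, both stages are learned models.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PunPair:
    """Hypothetical container for a pun pair (not from the paper's code)."""
    pun_word: str        # the word that appears in the pun
    alt_word: str        # the homophone or second sense it evokes
    related: frozenset   # assumed: words associated with either member


def retrieve_pun_pairs(context, candidates, top_k=1):
    """Rank candidates by keyword overlap with the context.

    Assumption: simple set overlap stands in for a trained retrieval module.
    """
    context_set = {word.lower() for word in context}
    ranked = sorted(
        candidates,
        key=lambda pair: len(context_set & pair.related),
        reverse=True,
    )
    return ranked[:top_k]


def generate_pun(context, pair):
    """Placeholder for a trained generation module.

    It only builds a description of the generation request.
    """
    keywords = ", ".join(context)
    return f"[pun about {keywords} using '{pair.pun_word}' to evoke '{pair.alt_word}']"


def pipeline(context, candidates):
    """Top-1 retrieval followed by generation, mirroring the end-to-end setup."""
    best = retrieve_pun_pairs(context, candidates, top_k=1)[0]
    return best, generate_pun(context, best)


# Toy data, invented for illustration only.
CANDIDATES = [
    PunPair("sole", "soul", frozenset({"shoe", "fish", "spirit"})),
    PunPair("flour", "flower", frozenset({"bakery", "bread", "garden"})),
]

best_pair, request = pipeline(["bakery", "bread"], CANDIDATES)
print(best_pair.pun_word, "->", request)
```

Keeping retrieval and generation as separate functions matches the way the abstract evaluates them: each module on its own first, then the full pipeline end to end.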
