Paper Title

Screenplay Summarization Using Latent Narrative Structure

Authors

Pinelopi Papalampidi, Frank Keller, Lea Frermann, Mirella Lapata

Abstract

Most general-purpose extractive summarization models are trained on news articles, which are short and present all important information upfront. As a result, such models are biased on position and often perform a smart selection of sentences from the beginning of the document. When summarizing long narratives, which have complex structure and present information piecemeal, simple position heuristics are not sufficient. In this paper, we propose to explicitly incorporate the underlying structure of narratives into general unsupervised and supervised extractive summarization models. We formalize narrative structure in terms of key narrative events (turning points) and treat it as latent in order to summarize screenplays (i.e., extract an optimal sequence of scenes). Experimental results on the CSI corpus of TV screenplays, which we augment with scene-level summarization labels, show that latent turning points correlate with important aspects of a CSI episode and improve summarization performance over general extractive algorithms leading to more complete and diverse summaries.
