Paper Title
How Domain Terminology Affects Meeting Summarization Performance
Paper Authors
Paper Abstract
Meetings are essential to modern organizations. Numerous meetings are held and recorded daily, more than can ever be comprehended. A meeting summarization system that identifies salient utterances from the transcripts to automatically generate meeting minutes can help. It empowers users to rapidly search and sift through large meeting collections. To date, the impact of domain terminology on the performance of meeting summarization remains understudied, even though meetings are rich in domain knowledge. In this paper, we create gold-standard annotations for domain terminology on a sizable meeting corpus; these annotated terms are known as jargon terms. We then analyze the performance of a meeting summarization system with and without jargon terms. Our findings reveal that domain terminology can have a substantial impact on summarization performance. We publicly release all domain terminology to advance research in meeting summarization.