Paper Title
Summarizing Utterances from Japanese Assembly Minutes using Political Sentence-BERT-based Method for QA Lab-PoliInfo-2 Task of NTCIR-15
Paper Authors
Paper Abstract
There are many discussions held during political meetings, and their transcripts contain a large number of utterances on various topics. If we want to follow speakers' intentions or opinions about a given topic, we need to read all of them. To avoid such a costly and time-consuming process for grasping often lengthy discussions, NLP researchers work on generating concise summaries of utterances. The summarization subtask of the QA Lab-PoliInfo-2 task at NTCIR-15 addresses this problem for Japanese utterances in assembly minutes, and our team (SKRA) participated in this subtask. As a first step towards summarizing utterances, we created a new pre-trained sentence embedding model, the Japanese Political Sentence-BERT. With this model, we summarize utterances without labelled data. This paper describes our approach to solving the task and discusses its results.
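To illustrate the general idea of unsupervised, embedding-based extractive summarization described in the abstract, here is a minimal sketch. The paper's actual system uses the Japanese Political Sentence-BERT encoder; in this sketch a toy bag-of-words embedder stands in for it, and the example utterances, function names, and the centrality-based selection heuristic are illustrative assumptions, not the authors' method.

```python
import math

def embed(sentence):
    # Toy stand-in for a Sentence-BERT encoder: a bag-of-words count vector
    # represented as a dict. A real system would return a dense vector.
    counts = {}
    for word in sentence.lower().split():
        counts[word] = counts.get(word, 0) + 1
    return counts

def cosine(a, b):
    # Cosine similarity between two sparse count vectors.
    dot = sum(v * b.get(w, 0) for w, v in a.items())
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def summarize(sentences, k=1):
    # Unsupervised extractive summary: score each sentence by its average
    # similarity to all sentences, then keep the k most central ones,
    # preserving their original order.
    vecs = [embed(s) for s in sentences]
    scores = [sum(cosine(v, u) for u in vecs) for v in vecs]
    ranked = sorted(range(len(sentences)), key=lambda i: -scores[i])
    return [sentences[i] for i in sorted(ranked[:k])]

utterances = [
    "The budget for education will increase next year.",
    "We propose an increase in the education budget.",
    "The weather was pleasant during the recess.",
]
print(summarize(utterances, k=1))
```

Swapping `embed` for a real sentence encoder turns this into the kind of label-free pipeline the abstract describes: no training data is needed beyond the pre-trained embedding model itself.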