Multilingual Neural RST Discourse Parsing

Paper Title

Multilingual Neural RST Discourse Parsing

Authors

Zhengyuan Liu, Ke Shi, Nancy F. Chen

Abstract

Text discourse parsing plays an important role in understanding information flow and argumentative structure in natural language. Previous research under the Rhetorical Structure Theory (RST) has mostly focused on inducing and evaluating models from the English treebank. However, the parsing tasks for other languages such as German, Dutch, and Portuguese are still challenging due to the shortage of annotated data. In this work, we investigate two approaches to establish a neural, cross-lingual discourse parser via: (1) utilizing multilingual vector representations; and (2) adopting segment-level translation of the source content. Experiment results show that both methods are effective even with limited training data, and achieve state-of-the-art performance on cross-lingual, document-level discourse parsing on all sub-tasks.
