Paper title
Cross-Lingual Adaptation Using Universal Dependencies
Paper authors
Abstract
We describe a cross-lingual adaptation method for developing classifiers in low-resource languages. The method is based on syntactic parse trees obtained from Universal Dependencies (UD), whose annotations are consistent across languages. The idea of UD parsing is to capture similarities as well as idiosyncrasies among typologically different languages. In this paper, we show that models trained on UD parse trees for complex NLP tasks can characterize very different languages. We study two tasks, paraphrase identification and semantic relation extraction, as case studies. Based on UD parse trees, we develop several models using tree kernels and show that these models, trained on an English dataset, can correctly classify data in other languages, e.g., French, Farsi, and Arabic. The proposed approach opens up avenues for exploiting UD parsing in similar cross-lingual tasks, which is very useful for languages for which no labeled data is available.
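To illustrate why a kernel over UD trees can transfer across languages, here is a minimal sketch of a simple tree kernel that compares only dependency relation labels and never looks at the words. It is an assumption made for illustration, not the kernels used in the paper: the `Node` class, the `tree_kernel` and `common_paths` functions, and the `decay` parameter are hypothetical names, and the two trees are hand-written approximations of UD analyses rather than parser output.

```python
from dataclasses import dataclass, field
from typing import Iterator, List


@dataclass
class Node:
    """One node of a dependency tree, identified only by its UD relation label."""
    label: str
    children: List["Node"] = field(default_factory=list)


def nodes(tree: Node) -> Iterator[Node]:
    """Yield every node of the tree in pre-order."""
    yield tree
    for child in tree.children:
        yield from nodes(child)


def common_paths(a: Node, b: Node, decay: float) -> float:
    """Weighted count of downward label paths shared by a and b, starting at both."""
    if a.label != b.label:
        return 0.0
    shared_below = sum(
        common_paths(x, y, decay) for x in a.children for y in b.children
    )
    return decay * (1.0 + shared_below)


def tree_kernel(t1: Node, t2: Node, decay: float = 1.0) -> float:
    """Sum the shared-path counts over all pairs of nodes from the two trees."""
    return sum(common_paths(a, b, decay) for a in nodes(t1) for b in nodes(t2))


# Hand-written illustrative trees (labels only, no words):
# English "She reads books" and French "Elle lit des livres".
en = Node("root", [Node("nsubj"), Node("obj")])
fr = Node("root", [Node("nsubj"), Node("obj", [Node("det")])])

similarity = tree_kernel(en, fr)
print(similarity)
```

Because the kernel sees only labels that UD defines identically for every language, a classifier built on such similarities (for example, a kernel SVM trained on English pairs) can in principle be applied to trees from another language without retraining. With `decay` below 1, longer shared paths contribute less.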