Paper Title

Inherent Dependency Displacement Bias of Transition-Based Algorithms

Authors

Mark Anderson, Carlos Gómez-Rodríguez

Abstract

A wide variety of transition-based algorithms are currently used for dependency parsers. Empirical studies have shown that performance varies across treebanks in such a way that one algorithm outperforms another on one treebank while the reverse is true for a different treebank. There is often no discernible reason why one algorithm is more suitable for a certain treebank and less so for another. In this paper we shed some light on this by introducing the concept of an algorithm's inherent dependency displacement distribution. This characterises the bias of the algorithm in terms of dependency displacement, which quantifies both the distance and the direction of syntactic relations. We show that the similarity of an algorithm's inherent distribution to a treebank's displacement distribution is clearly correlated with the algorithm's parsing performance on that treebank, specifically with highly significant and substantial correlations for the predominant sentence lengths in Universal Dependency treebanks. We also obtain results showing that a more discrete analysis of dependency displacement does not result in any meaningful correlations.
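To make the notion of a displacement distribution concrete, the following is a minimal Python sketch, not the authors' implementation. It assumes that displacement is the head position minus the dependent position, so positive values mean the head lies to the right; the abstract does not state the sign convention. It also uses total variation distance as a stand-in similarity measure, since the abstract does not name the one used in the paper. The function names and the toy treebank are hypothetical.

```python
from collections import Counter


def displacement_distribution(sentences):
    """Relative frequency of each signed head-dependent displacement.

    `sentences` is a list of head lists: heads[i] is the 1-based index of
    the head of token i + 1, with 0 marking the root (as in CoNLL-U).
    Assumed convention: displacement = head position - dependent position.
    """
    counts = Counter()
    for heads in sentences:
        for dep, head in enumerate(heads, start=1):
            if head == 0:
                continue  # the root has no head token, so no displacement
            counts[head - dep] += 1
    total = sum(counts.values())
    if total == 0:
        return {}
    return {d: c / total for d, c in counts.items()}


def total_variation(p, q):
    """Illustrative distance between two distributions (0 = identical)."""
    keys = set(p) | set(q)
    return 0.5 * sum(abs(p.get(k, 0.0) - q.get(k, 0.0)) for k in keys)


# Hypothetical toy treebank with two sentences, given as head indices.
toy_treebank = [
    [2, 0, 2],     # tokens 1 and 3 attach to the root token 2
    [2, 0, 4, 2],  # token 3 attaches rightwards to 4, token 4 leftwards to 2
]
dist = displacement_distribution(toy_treebank)
```

Under these assumptions, the same function could be applied both to gold treebank trees and to trees that reflect an algorithm's inherent preferences, and the two resulting distributions compared with `total_variation` or any other distributional distance.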
