Paper Title

Revisiting the Effects of Leakage on Dependency Parsing

Paper Authors

Nathaniel Krasner, Miriam Wanner, Antonios Anastasopoulos

Paper Abstract

Recent work by Søgaard (2020) showed that, treebank size aside, overlap between training and test graphs (termed leakage) explains more of the observed variation in dependency parsing performance than other explanations. In this work we revisit this claim, testing it on more models and languages. We find that it only holds for zero-shot cross-lingual settings. We then propose a more fine-grained measure of such leakage which, unlike the original measure, not only explains but also correlates with observed performance variation. Code and data are available here: https://github.com/miriamwanner/reu-nlp-project
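
The notion of leakage in the abstract, overlap between training and test graphs, can be made concrete with a small sketch. The snippet below is an illustrative assumption rather than the paper's exact measure: it treats leakage as the fraction of test sentences whose delexicalized (word-free) dependency tree shape also appears in the training treebank, and the `delexicalize` and `leakage` helpers are hypothetical names introduced here.

```python
# Hypothetical sketch: estimating treebank "leakage" as the fraction of
# test-set dependency trees whose delexicalized structure also occurs in
# the training set. The paper's exact definition may differ; this only
# illustrates the idea of train/test graph overlap.

from typing import List, Tuple

# A sentence's unlabeled dependency tree, represented as the tuple of
# head indices (0 = root) for tokens 1..n. Words are discarded, so two
# sentences with the same tree shape compare equal.
Tree = Tuple[int, ...]

def delexicalize(heads: List[int]) -> Tree:
    """Reduce a sentence to its bare tree shape (one head index per token)."""
    return tuple(heads)

def leakage(train: List[List[int]], test: List[List[int]]) -> float:
    """Fraction of test trees whose shape also appears in training."""
    seen = {delexicalize(h) for h in train}
    overlap = sum(1 for h in test if delexicalize(h) in seen)
    return overlap / len(test) if test else 0.0

if __name__ == "__main__":
    # Toy head lists: the head index of each token, 0 marking the root.
    train_trees = [[2, 0, 2], [0, 1], [2, 0, 2, 3]]
    test_trees = [[2, 0, 2], [0, 1, 2]]
    print(f"leakage = {leakage(train_trees, test_trees):.2f}")  # 0.50
```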
