Paper Title

Local Structure Matters Most in Most Languages

Paper Authors

Louis Clouâtre, Prasanna Parthasarathi, Amal Zouaq, Sarath Chandar

Abstract

Many recent perturbation studies have found unintuitive results on what does and does not matter when performing Natural Language Understanding (NLU) tasks in English. Coding properties, such as the order of words, can often be removed through shuffling without impacting downstream performances. Such insight may be used to direct future research into English NLP models. As many improvements in multilingual settings consist of wholesale adaptation of English approaches, it is important to verify whether those studies replicate or not in multilingual settings. In this work, we replicate a study on the importance of local structure, and the relative unimportance of global structure, in a multilingual setting. We find that the phenomenon observed on the English language broadly translates to over 120 languages, with a few caveats.
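To make the distinction between local and global structure concrete, here is a minimal sketch of two word-order perturbations. It is an illustration only: the abstract does not specify the study's actual perturbations or metrics, so the function names (`shuffle_words`, `shuffle_chunks`, `adjacent_pairs_kept`), the whitespace tokenization, and the chunk size are all assumptions made for this example.

```python
import random


def shuffle_words(text, rng):
    """Global perturbation: permute every whitespace-separated token."""
    words = text.split()
    rng.shuffle(words)
    return " ".join(words)


def shuffle_chunks(text, chunk_size, rng):
    """Keep word order inside each chunk (local structure) but
    permute the chunks themselves (global structure)."""
    words = text.split()
    chunks = [words[i:i + chunk_size] for i in range(0, len(words), chunk_size)]
    rng.shuffle(chunks)
    return " ".join(word for chunk in chunks for word in chunk)


def adjacent_pairs_kept(original, perturbed):
    """Hypothetical proxy for preserved local structure: the fraction of
    adjacent word pairs in the original that are still adjacent afterwards."""
    orig_words = original.split()
    pert_words = perturbed.split()
    orig_pairs = set(zip(orig_words, orig_words[1:]))
    if not orig_pairs:
        return 1.0
    pert_pairs = set(zip(pert_words, pert_words[1:]))
    return len(orig_pairs & pert_pairs) / len(orig_pairs)


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so runs are repeatable
    sentence = "the quick brown fox jumps over the lazy dog near the river"
    for name, perturbed in [
        ("word shuffle", shuffle_words(sentence, rng)),
        ("chunk shuffle", shuffle_chunks(sentence, 3, rng)),
    ]:
        print(name, "|", perturbed, "|", adjacent_pairs_kept(sentence, perturbed))
```

Both functions destroy the sentence-level ordering, but the chunk shuffle leaves every within-chunk neighbor pair intact. A perturbation study of the kind the abstract describes would then compare downstream task performance on inputs perturbed in these different ways.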
