Paper title
Word class flexibility: A deep contextualized approach
Paper authors
Paper abstract
Word class flexibility refers to the phenomenon whereby a single word form is used across different grammatical categories. Extensive work in linguistic typology has sought to characterize word class flexibility across languages, but quantifying this phenomenon accurately and at scale has been fraught with difficulties. We propose a principled methodology to explore regularity in word class flexibility. Our method builds on recent work in contextualized word embeddings to quantify semantic shift between word classes (e.g., noun-to-verb, verb-to-noun), and we apply this method to 37 languages. We find that contextualized embeddings not only capture human judgment of class variation within words in English, but also uncover shared tendencies in class flexibility across languages. Specifically, we find greater semantic variation when flexible lemmas are used in their dominant word class, supporting the view that word class flexibility is a directional process. Our work highlights the utility of deep contextualized models in linguistic typology.
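To make the idea of quantifying semantic shift between word classes concrete, here is a minimal sketch. The abstract does not specify the exact metrics, so everything below is an assumption for illustration: shift is taken as the cosine distance between the mean embedding of a lemma's noun uses and the mean embedding of its verb uses, and within-class variation as the mean cosine distance of each use to its class centroid. The function names and the toy two-dimensional vectors are hypothetical; in practice the vectors would come from a contextualized embedding model.

```python
import math


def centroid(vectors):
    """Component-wise mean of a non-empty list of equal-length vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def cosine_distance(u, v):
    """1 - cosine similarity; assumes neither vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)


def semantic_shift(noun_vecs, verb_vecs):
    """Assumed shift measure: distance between the two class centroids."""
    return cosine_distance(centroid(noun_vecs), centroid(verb_vecs))


def within_class_variation(vectors):
    """Assumed variation measure: mean distance of each use to its centroid."""
    c = centroid(vectors)
    return sum(cosine_distance(v, c) for v in vectors) / len(vectors)


# Toy stand-ins for contextualized embeddings of one flexible lemma
# (hypothetical values, not data from the paper).
noun_uses = [[1.0, 0.0], [1.0, 0.2]]
verb_uses = [[0.0, 1.0], [0.5, 1.0], [1.0, 1.0]]

shift = semantic_shift(noun_uses, verb_uses)
noun_var = within_class_variation(noun_uses)
verb_var = within_class_variation(verb_uses)
print(f"shift={shift:.3f} noun_var={noun_var:.3f} verb_var={verb_var:.3f}")
```

Comparing the within-class variation of a lemma's dominant and non-dominant classes, aggregated over many lemmas and languages, is one way the directional claim in the abstract could be examined under these assumptions.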