Paper title
Cross-Lingual Transfer Learning for Complex Word Identification
Paper authors
Paper abstract
Complex Word Identification (CWI) is the task of detecting hard-to-understand words, or groups of words, in texts from different areas of expertise. The purpose of CWI is to highlight problematic structures that non-native speakers would usually find difficult to understand. Our approach uses zero-shot, one-shot, and few-shot learning techniques, alongside state-of-the-art solutions for Natural Language Processing (NLP) tasks (i.e., Transformers). Our aim is to provide evidence that the proposed models can learn the characteristics of complex words in a multilingual environment by relying on the CWI Shared Task 2018 dataset, which is available for four languages (i.e., English, German, Spanish, and French). In the zero-shot learning scenario, our approach surpasses state-of-the-art cross-lingual results in terms of macro F1-score on English (0.774), German (0.782), and Spanish (0.734). At the same time, our model also outperforms the state-of-the-art monolingual result for German (0.795 macro F1-score).
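The results above are reported as macro F1-scores. As a reminder of what that metric measures, the following minimal sketch computes macro F1 for a binary complex/non-complex labeling, where the per-class F1 values are averaged with equal weight. The function names and the toy labels are illustrative assumptions, not the authors' evaluation code or data.

```python
from typing import Hashable, Sequence


def f1_for_class(y_true: Sequence[Hashable], y_pred: Sequence[Hashable], cls: Hashable) -> float:
    """F1-score of a single class, treating `cls` as the positive label."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == cls and p == cls)
    fp = sum(1 for t, p in pairs if t != cls and p == cls)
    fn = sum(1 for t, p in pairs if t == cls and p != cls)
    if tp == 0:
        # No true positives: precision or recall is zero (or undefined), so F1 is 0.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def macro_f1(y_true: Sequence[Hashable], y_pred: Sequence[Hashable]) -> float:
    """Unweighted mean of the per-class F1-scores over all observed labels."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    classes = sorted(set(y_true) | set(y_pred))
    return sum(f1_for_class(y_true, y_pred, c) for c in classes) / len(classes)


# Toy example (hypothetical labels): 1 = complex word, 0 = non-complex word.
gold = [1, 0, 0, 1, 0, 1]
predicted = [1, 0, 1, 1, 0, 0]
score = macro_f1(gold, predicted)
print(f"macro F1 = {score:.3f}")
```

Because each class contributes equally to the average, macro F1 does not let a majority class dominate the score, which is why it is a common choice when complex words are less frequent than non-complex ones.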