Paper Title
OCHADAI at SemEval-2022 Task 2: Adversarial Training for Multilingual Idiomaticity Detection
Paper Authors
Paper Abstract
We propose a multilingual adversarial training model for determining whether a sentence contains an idiomatic expression. Given that a key challenge with this task is the limited size of annotated data, our model relies on pre-trained contextual representations from different state-of-the-art multilingual transformer-based language models (i.e., multilingual BERT and XLM-RoBERTa) and on adversarial training, a training method for further enhancing model generalization and robustness. Without relying on any human-crafted features, knowledge bases, or datasets beyond the target datasets, our model achieved competitive results, ranking 6th in the SubTask A (zero-shot) setting and 15th in the SubTask A (one-shot) setting.
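The abstract does not spell out which adversarial training scheme is used, so the following is only a minimal sketch, assuming an FGM-style perturbation of the word embeddings, a common choice for regularizing fine-tuned transformers. The model name, the epsilon value, and the toy labeled example are illustrative assumptions, not details taken from the paper.

```python
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

# Assumed setup: a binary classifier (idiomatic vs. literal) on top of
# XLM-RoBERTa, one of the two multilingual encoders named in the abstract.
MODEL_NAME = "xlm-roberta-base"
tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForSequenceClassification.from_pretrained(MODEL_NAME, num_labels=2)
optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5)


def adversarial_step(batch, epsilon=1.0):
    """One FGM-style training step: a clean loss plus a loss computed on
    gradient-perturbed word embeddings (epsilon sets the perturbation size)."""
    model.train()

    # 1. Clean forward/backward pass to populate gradients.
    model(**batch).loss.backward()

    # 2. Perturb the embedding matrix along its gradient direction.
    emb = model.get_input_embeddings().weight
    grad_norm = emb.grad.norm()
    if grad_norm > 0:
        delta = epsilon * emb.grad / grad_norm
        emb.data.add_(delta)

        # 3. Adversarial pass; its gradients accumulate on top of the clean ones.
        model(**batch).loss.backward()

        # 4. Restore the embeddings before updating the parameters.
        emb.data.sub_(delta)

    optimizer.step()
    optimizer.zero_grad()


# Toy usage with a hypothetical label (1 = contains an idiom).
batch = tokenizer(["He kicked the bucket last week."], return_tensors="pt")
batch["labels"] = torch.tensor([1])
adversarial_step(batch)
```

Perturbing the shared embedding matrix rather than the raw inputs keeps the scheme tokenizer-agnostic, which is convenient in a multilingual setting; swapping MODEL_NAME for bert-base-multilingual-cased would cover the other encoder mentioned in the abstract.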