Paper Title

Meta-X$_{NLG}$: A Meta-Learning Approach Based on Language Clustering for Zero-Shot Cross-Lingual Transfer and Generation

Authors

Kaushal Kumar Maurya, Maunendra Sankar Desarkar

Abstract

Recently, the NLP community has witnessed a rapid advancement in multilingual and cross-lingual transfer research where the supervision is transferred from high-resource languages (HRLs) to low-resource languages (LRLs). However, the cross-lingual transfer is not uniform across languages, particularly in the zero-shot setting. Towards this goal, one promising research direction is to learn shareable structures across multiple tasks with limited annotated data. The downstream multilingual applications may benefit from such a learning setup as most of the languages across the globe are low-resource and share some structures with other languages. In this paper, we propose a novel meta-learning framework (called Meta-X$_{NLG}$) to learn shareable structures from typologically diverse languages based on meta-learning and language clustering. This is a step towards uniform cross-lingual transfer for unseen languages. We first cluster the languages based on language representations and identify the centroid language of each cluster. Then, a meta-learning algorithm is trained with all centroid languages and evaluated on the other languages in the zero-shot setting. We demonstrate the effectiveness of this modeling on two NLG tasks (Abstractive Text Summarization and Question Generation), 5 popular datasets and 30 typologically diverse languages. Consistent improvements over strong baselines demonstrate the efficacy of the proposed framework. The careful design of the model makes this end-to-end NLG setup less vulnerable to the accidental translation problem, which is a prominent concern in zero-shot cross-lingual NLG tasks.
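The abstract outlines a two-stage pipeline: cluster languages by their representations, pick each cluster's centroid language, meta-train only on the centroids, and evaluate zero-shot on the rest. The sketch below illustrates just the clustering and centroid-selection step with a plain k-means over toy language vectors; the language codes and embeddings are synthetic placeholders, not the paper's actual language representations or cluster count.

```python
import numpy as np

def kmeans(X, k, n_iter=50, seed=0):
    """Plain k-means on the rows of X; returns (labels, cluster means)."""
    rng = np.random.default_rng(seed)
    means = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iter):
        # Assign every point to its nearest current mean.
        dists = np.linalg.norm(X[:, None, :] - means[None, :, :], axis=-1)
        labels = dists.argmin(axis=1)
        # Recompute each mean from its members (skip emptied clusters).
        for j in range(k):
            if (labels == j).any():
                means[j] = X[labels == j].mean(axis=0)
    return labels, means

def centroid_languages(langs, X, labels, means):
    """For each cluster, the real language whose vector is nearest the mean."""
    out = {}
    for j, m in enumerate(means):
        idx = np.where(labels == j)[0]
        best = idx[np.linalg.norm(X[idx] - m, axis=1).argmin()]
        out[j] = langs[best]
    return out

# Toy example: six hypothetical "languages" forming two obvious groups.
langs = ["hi", "bn", "mr", "es", "pt", "it"]
X = np.array([[0.0, 0.0], [0.1, 0.0], [0.0, 0.2],
              [5.0, 5.0], [5.1, 5.0], [5.0, 5.2]])
labels, means = kmeans(X, k=2)
centroids = centroid_languages(langs, X, labels, means)
# Meta-training would then use only the centroid languages; the remaining
# languages in each cluster are held out for zero-shot evaluation.
```

In this toy setup the centroid language of each cluster is simply the member closest to the cluster mean, which mirrors the "identify the centroid language of each cluster" step; the actual framework trains the meta-learner on those centroids rather than on all languages.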
