Paper Title

Neural Networks for Projecting Named Entities from English to Ewondo

Authors

Mbouopda, Michael Franklin; Yonta, Paulin Melatagia; Lombo, Guy Stephane B. Fedim

Abstract

Named entity recognition is an important task in natural language processing. It is well studied for resource-rich languages but remains underexplored for low-resource languages, mainly because existing techniques require large amounts of annotated data to reach good performance. Recently, a new distributional representation of words was proposed for projecting named entities from a resource-rich language to a low-resource one. This representation was coupled with a neural network to project named entities from English to Ewondo, a Bantu language spoken in Cameroon. Although that method achieved appreciable results, the neural network used was too large relative to the size of the dataset. Furthermore, the impact of the model parameters was not studied. In this paper, we show experimentally that the same results can be obtained with a smaller neural network. We also highlight the parameters that are highly correlated with network performance. This work is a step toward building a reliable and robust network architecture for named entity projection in low-resource languages.
