Paper Title

Corrected CBOW Performs as well as Skip-gram

Paper Authors

Ozan İrsoy, Adrian Benton, Karl Stratos

Paper Abstract

Mikolov et al. (2013a) observed that continuous bag-of-words (CBOW) word embeddings tend to underperform Skip-gram (SG) embeddings, and this finding has been reported in subsequent works. We find that these observations are driven not by fundamental differences in their training objectives, but more likely by faulty negative-sampling CBOW implementations in popular libraries, including the official implementation, word2vec.c, and Gensim. We show that after correcting a bug in the CBOW gradient update, one can learn CBOW word embeddings that are fully competitive with SG on various intrinsic and extrinsic tasks, while being many times faster to train.
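To make the corrected update concrete, the sketch below shows one CBOW negative-sampling step in NumPy. This is an illustrative reconstruction, not the paper's code: the function name `cbow_ns_step` and all variable names are our own. The key point matching the abstract is that because the hidden vector is the *average* of the context embeddings, the gradient flowing back to each context word must be scaled by 1/|context|, a scaling that the faulty implementations omit.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def cbow_ns_step(W_in, W_out, context_ids, target_id, negative_ids, lr=0.025):
    """One CBOW negative-sampling update with the corrected gradient scaling.

    W_in:  (V, d) input/context embedding matrix
    W_out: (V, d) output embedding matrix
    """
    # Hidden vector is the AVERAGE of the context embeddings.
    h = W_in[context_ids].mean(axis=0)

    grad_h = np.zeros_like(h)
    # One positive (label 1) and k negative (label 0) examples.
    for wid, label in [(target_id, 1.0)] + [(n, 0.0) for n in negative_ids]:
        score = sigmoid(W_out[wid] @ h)
        g = score - label                 # gradient of logistic loss w.r.t. score
        grad_h += g * W_out[wid]
        W_out[wid] -= lr * g * h          # output-side update

    # Corrected input-side update: since h averages |context| vectors,
    # each context embedding receives grad_h scaled by 1/|context|.
    W_in[context_ids] -= lr * grad_h / len(context_ids)
```

With this scaling, the per-context-word step size no longer grows with the window size, which is the behavior the corrected CBOW relies on.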
