Paper title
SemGloVe: Semantic Co-occurrences for GloVe from BERT
Authors
Abstract
GloVe learns word embeddings by leveraging statistical information from word co-occurrence matrices. However, the word pairs in these matrices are extracted from a predefined local context window, which can limit the set of word pairs and admit semantically irrelevant ones. In this paper, we propose SemGloVe, which distills semantic co-occurrences from BERT into static GloVe word embeddings. Specifically, we propose two models that extract co-occurrence statistics from either the masked language model or the multi-head attention weights of BERT. Our methods can extract word pairs without being limited by the local window assumption and can define co-occurrence weights by directly considering the semantic distance between word pairs. Experiments on several word similarity datasets and four external tasks show that SemGloVe can outperform GloVe.
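The contrast between window-based and attention-based co-occurrence can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the `top_k` selection rule, and the toy attention matrix are assumptions made purely for illustration, and no BERT model is loaded. The first function follows the standard GloVe convention of weighting a pair by the inverse of its distance inside a fixed window; the second replaces the window with whichever positions receive the largest (here hand-written) attention weights.

```python
from collections import defaultdict


def window_cooccurrence(tokens, window=2):
    """GloVe-style counts: each pair within `window` positions adds 1/distance."""
    counts = defaultdict(float)
    for i, word in enumerate(tokens):
        lo = max(0, i - window)
        hi = min(len(tokens), i + window + 1)
        for j in range(lo, hi):
            if j != i:
                counts[(word, tokens[j])] += 1.0 / abs(i - j)
    return dict(counts)


def attention_cooccurrence(tokens, attention, top_k=1):
    """Hypothetical attention-based variant (an assumption, not the paper's code).

    For each position i, keep the top_k other positions with the largest
    attention weight and add that weight as the co-occurrence count.
    `attention[i][j]` is the weight position i assigns to position j.
    """
    counts = defaultdict(float)
    for i, word in enumerate(tokens):
        others = [(attention[i][j], j) for j in range(len(tokens)) if j != i]
        others.sort(reverse=True)  # largest weight first
        for weight, j in others[:top_k]:
            counts[(word, tokens[j])] += weight
    return dict(counts)


# Toy input: the attention weights below are made up for illustration.
tokens = ["the", "cat", "sat"]
toy_attention = [
    [0.1, 0.2, 0.7],
    [0.3, 0.1, 0.6],
    [0.5, 0.4, 0.1],
]

window_counts = window_cooccurrence(tokens, window=1)
attention_counts = attention_cooccurrence(tokens, toy_attention, top_k=1)
```

With a window of 1, the pair ("the", "sat") never appears in `window_counts`, whereas the attention-based variant includes it because the toy weights rank "sat" highest for "the". In a real setting the weights would come from a pretrained model rather than a hand-written matrix.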