Paper Title

Squashed Shifted PMI Matrix: Bridging Word Embeddings and Hyperbolic Spaces

Authors

Zhenisbek Assylbekov, Alibi Jangeldin

Abstract

We show that removing the sigmoid transformation from the skip-gram with negative sampling (SGNS) objective does not significantly harm the quality of word vectors and, at the same time, is related to factorizing a squashed shifted PMI matrix, which in turn can be treated as the connection probability matrix of a random graph. Empirically, such a graph is a complex network, i.e., it has strong clustering and a scale-free degree distribution, and it is tightly connected with hyperbolic spaces. In short, we show the connection between static word embeddings and hyperbolic spaces through the squashed shifted PMI matrix, using analytical and empirical methods.
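
The idea of reading a squashed shifted PMI matrix as edge probabilities can be illustrated with a minimal sketch. The abstract does not define the matrix, so the code assumes, from the name alone, that each entry is sigmoid(PMI(w, c) - log k), with k the number of negative samples. The function names, the toy counts, and the sampling routine are hypothetical and are not taken from the paper.

```python
import math
import random


def squashed_shifted_pmi(counts, k=5.0):
    """Return sigmoid(PMI - log k) for a co-occurrence count matrix.

    Assumption: this definition is inferred from the name "squashed
    shifted PMI"; the abstract itself does not spell it out.
    """
    total = sum(sum(row) for row in counts)
    row_sums = [sum(row) for row in counts]
    col_sums = [sum(col) for col in zip(*counts)]
    probs = [[0.0] * len(counts[0]) for _ in counts]
    for i, row in enumerate(counts):
        for j, c in enumerate(row):
            if c == 0:
                continue  # PMI is -inf here, and sigmoid(-inf) = 0
            # PMI = log( p(w, c) / (p(w) * p(c)) ), written with raw counts
            pmi = math.log(c * total / (row_sums[i] * col_sums[j]))
            shifted = pmi - math.log(k)
            probs[i][j] = 1.0 / (1.0 + math.exp(-shifted))
    return probs


def sample_graph(probs, seed=0):
    """Draw an undirected graph: edge (i, j), i < j, with probability probs[i][j]."""
    rng = random.Random(seed)
    n = len(probs)
    return {
        (i, j)
        for i in range(n)
        for j in range(i + 1, n)
        if rng.random() < probs[i][j]
    }


# Toy symmetric word-word co-occurrence counts (made up for illustration).
toy_counts = [
    [0, 4, 1],
    [4, 0, 2],
    [1, 2, 0],
]
toy_probs = squashed_shifted_pmi(toy_counts, k=1.0)
toy_edges = sample_graph(toy_probs, seed=0)
```

Because the sigmoid maps every entry into the interval from 0 to 1, each entry can serve as an independent edge probability. Clustering and degree statistics of the sampled graph could then be inspected, although a three-word toy example is far too small to show the complex-network properties the abstract reports.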
