Paper Title

Learning Dynamic Contextualised Word Embeddings via Template-based Temporal Adaptation

Authors

Xiaohang Tang, Yi Zhou, Danushka Bollegala

Abstract

Dynamic contextualised word embeddings (DCWEs) represent the temporal semantic variations of words. We propose a method for learning DCWEs by time-adapting a pretrained Masked Language Model (MLM) using time-sensitive templates. Given two snapshots $C_1$ and $C_2$ of a corpus taken respectively at two distinct timestamps $T_1$ and $T_2$, we first propose an unsupervised method to select (a) \emph{pivot} terms related to both $C_1$ and $C_2$, and (b) \emph{anchor} terms that are associated with a specific pivot term in each individual snapshot. We then generate prompts by filling manually compiled templates using the extracted pivot and anchor terms. Moreover, we propose an automatic method to learn time-sensitive templates from $C_1$ and $C_2$, without requiring any human supervision. Next, we use the generated prompts to adapt a pretrained MLM to $T_2$ by fine-tuning using those prompts. Multiple experiments show that our proposed method reduces the perplexity of test sentences in $C_2$, outperforming the current state-of-the-art.
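
The prompt-generation step described in the abstract (filling a template with a pivot term and its snapshot-specific anchor terms) can be illustrated with a minimal sketch. The template wording, the function name `generate_prompts`, and the example terms below are assumptions made for illustration only; the abstract does not specify them, and this is not the authors' implementation.

```python
from itertools import product

# Hypothetical time-sensitive template (an assumption, not taken from the abstract).
# It contrasts the anchor of a pivot at timestamp T1 with its anchor at T2.
TEMPLATE = (
    "{pivot} is associated with {anchor_1} in {t1}, "
    "whereas it is associated with {anchor_2} in {t2}."
)


def generate_prompts(pivot, anchors_t1, anchors_t2, t1, t2, template=TEMPLATE):
    """Fill the template once for every (T1 anchor, T2 anchor) pair of a pivot."""
    return [
        template.format(pivot=pivot, anchor_1=a1, t1=t1, anchor_2=a2, t2=t2)
        for a1, a2 in product(anchors_t1, anchors_t2)
    ]


# Illustrative terms only: one pivot with one anchor in C_1 and two anchors in C_2.
prompts = generate_prompts("mask", ["hockey"], ["vaccine", "covid"], "2010", "2020")
for prompt in prompts:
    print(prompt)
```

In the setting the abstract describes, prompts of this kind would then serve as fine-tuning text for adapting the pretrained MLM to $T_2$; how the pivots, anchors, and automatically learned templates are actually selected is not covered by this sketch.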
