Paper Title
Knowledge Graph Fusion for Language Model Fine-tuning
Paper Authors
Paper Abstract
Language Models such as BERT have grown in popularity due to their ability to be pre-trained and perform robustly on a wide range of Natural Language Processing tasks. Often seen as an evolution over traditional word embedding techniques, they can produce semantic representations of text, useful for tasks such as semantic similarity. However, state-of-the-art models often have high computational requirements and lack global context or domain knowledge which is required for complete language understanding. To address these limitations, we investigate the benefits of knowledge incorporation into the fine-tuning stages of BERT. An existing K-BERT model, which enriches sentences with triplets from a Knowledge Graph, is adapted for the English language and extended to inject contextually relevant information into sentences. As a side-effect, changes made to K-BERT for accommodating the English language also extend to other word-based languages. Experiments conducted indicate that injected knowledge introduces noise. We see statistically significant improvements for knowledge-driven tasks when this noise is minimised. We show evidence that, given the appropriate task, modest injection with relevant, high-quality knowledge is most performant.
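To make the enrichment step concrete, below is a minimal conceptual sketch of injecting knowledge-graph triples into a sentence before fine-tuning. The toy knowledge graph, the `enrich_sentence` helper, and the inline "[relation object]" format are illustrative assumptions, not the paper's implementation; K-BERT itself attaches triples via a sentence tree with soft positions and a visible matrix precisely to limit the noise that naive injection introduces.

```python
# Conceptual sketch (not the authors' implementation): enriching a sentence
# with knowledge-graph triples before feeding it to BERT for fine-tuning.
# The toy KG and the inline "[triple]" injection format are assumptions made
# for illustration only.

from typing import Dict, List, Tuple

# Toy knowledge graph: entity -> list of (relation, object) pairs (assumed data).
TOY_KG: Dict[str, List[Tuple[str, str]]] = {
    "Cape Town": [("is_a", "city"), ("located_in", "South Africa")],
    "BERT": [("is_a", "language model")],
}


def enrich_sentence(sentence: str,
                    kg: Dict[str, List[Tuple[str, str]]],
                    max_triples: int = 2) -> str:
    """Append up to `max_triples` matching triples after each recognised entity.

    Capping the number of injected triples mirrors the paper's finding that
    modest injection of relevant, high-quality knowledge performs best, while
    over-injection adds noise.
    """
    enriched = sentence
    for entity, triples in kg.items():
        if entity in enriched:
            facts = ", ".join(f"{rel} {obj}" for rel, obj in triples[:max_triples])
            enriched = enriched.replace(entity, f"{entity} [{facts}]", 1)
    return enriched


if __name__ == "__main__":
    print(enrich_sentence("Cape Town hosted a workshop on BERT.", TOY_KG))
    # -> Cape Town [is_a city, located_in South Africa] hosted a workshop on
    #    BERT [is_a language model].
```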