Paper Title
Denoising Pre-Training and Data Augmentation Strategies for Enhanced RDF Verbalization with Transformers
Paper Authors
Paper Abstract
The task of verbalizing RDF triples has grown in popularity due to the rising ubiquity of Knowledge Bases (KBs). The formalism of RDF triples is a simple and efficient way to store facts at large scale. However, its abstract representation makes it difficult for humans to interpret. To address this, the WebNLG challenge aims to promote automated RDF-to-text generation. We propose to pre-train the Transformer model on augmented data produced by a data augmentation strategy. Our experimental results show minimum relative increases in BLEU score of 3.73%, 126.05%, and 88.16% over standard training for seen categories, unseen entities, and unseen categories, respectively.
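RDF-to-text systems of the kind described above typically linearize each (subject, predicate, object) triple into a flat token sequence before feeding it to a sequence-to-sequence Transformer. The sketch below illustrates one common linearization scheme; the `<S>`/`<P>`/`<O>` tag names and the example triples are illustrative assumptions, not the paper's exact input format.

```python
# Minimal sketch of RDF triple linearization for a seq2seq Transformer
# encoder, as is common in WebNLG-style RDF-to-text pipelines.
# Tag names (<S>, <P>, <O>) are assumed, not taken from the paper.

def linearize_triples(triples):
    """Turn a list of (subject, predicate, object) triples into a
    single tagged string usable as encoder input."""
    parts = []
    for subj, pred, obj in triples:
        parts.append(f"<S> {subj} <P> {pred} <O> {obj}")
    return " ".join(parts)

# Example: two triples about the same entity, WebNLG-style.
triples = [
    ("Alan_Bean", "occupation", "Test_pilot"),
    ("Alan_Bean", "birthPlace", "Wheeler,_Texas"),
]
print(linearize_triples(triples))
```

A verbalization model would then be trained (or pre-trained on augmented triple sets) to map such sequences to fluent sentences, e.g. "Alan Bean, born in Wheeler, Texas, was a test pilot."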