Paper Title

INFINITY: A Simple Yet Effective Unsupervised Framework for Graph-Text Mutual Conversion

Authors

Yi Xu, Luoyi Fu, Zhouhan Lin, Jiexing Qi, Xinbing Wang

Abstract

Graph-to-text (G2T) generation and text-to-graph (T2G) triple extraction are two essential tasks for constructing and applying knowledge graphs. Existing unsupervised approaches are suitable candidates for jointly learning the two tasks because they avoid using graph-text parallel data. However, they are composed of multiple modules and still require both entity information and relation types during training. To this end, we propose INFINITY, a simple yet effective unsupervised approach that requires neither external annotation tools nor additional parallel information. It achieves fully unsupervised graph-text mutual conversion for the first time. Specifically, INFINITY treats both G2T and T2G as a bidirectional sequence generation task by fine-tuning only one pretrained seq2seq model. A novel back-translation-based framework is then designed to automatically generate continuous synthetic parallel data. To obtain reasonable graph sequences with structural information from source texts, INFINITY employs a reward-based training loss by leveraging the advantage of reward augmented maximum likelihood. As a fully unsupervised framework, INFINITY is empirically verified to outperform state-of-the-art baselines on G2T and T2G tasks.
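The back-translation idea in the abstract can be sketched as follows. This is a minimal illustrative sketch, not the authors' implementation: a single seq2seq model serves both directions via a task prefix, and each unpaired text (or graph) is converted to a synthetic graph (or text) to form training pairs for the opposite direction. All names here (`translate`, `back_translation_pairs`, the prefix strings, and the toy stand-in model) are hypothetical.

```python
# Hedged sketch of bidirectional back-translation with one shared seq2seq model.
# The real system fine-tunes a pretrained seq2seq model; here `model` is a
# stand-in callable so the control flow is runnable on its own.

def translate(model, prefix, seq):
    """One seq2seq pass; the prefix selects the direction (G2T or T2G)."""
    return model(prefix, seq)

def back_translation_pairs(model, texts, graphs):
    """Build synthetic parallel pairs from unpaired texts and graphs."""
    pairs = []
    for t in texts:
        # text -> synthetic graph; train the G2T direction on (graph, text)
        g_syn = translate(model, "T2G:", t)
        pairs.append(("G2T:", g_syn, t))
    for g in graphs:
        # graph -> synthetic text; train the T2G direction on (text, graph)
        t_syn = translate(model, "G2T:", g)
        pairs.append(("T2G:", t_syn, g))
    return pairs

# Toy stand-in "model": tags the input with the requested direction.
toy_model = lambda prefix, seq: f"{prefix[:-1].lower()}({seq})"

pairs = back_translation_pairs(
    toy_model,
    texts=["a source sentence"],
    graphs=["<S> e1 <P> rel <O> e2"],  # linearized triple, one assumed format
)
```

In the full method, each synthetic pair would feed a standard seq2seq loss (augmented with the reward-based term for the T2G direction), and the loop repeats so the synthetic data improves as the model does.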
