Paper Title

Neural Deepfake Detection with Factual Structure of Text

Authors

Wanjun Zhong, Duyu Tang, Zenan Xu, Ruize Wang, Nan Duan, Ming Zhou, Jiahai Wang, Jian Yin

Abstract

Deepfake detection, the task of automatically discriminating machine-generated text, is increasingly critical with recent advances in natural language generative models. Existing approaches to deepfake detection typically represent documents with coarse-grained representations. However, they struggle to capture the factual structure of documents, which is a discriminative factor between machine-generated and human-written text according to our statistical analysis. To address this, we propose a graph-based model that utilizes the factual structure of a document for deepfake detection of text. Our approach represents the factual structure of a given document as an entity graph, which is further utilized to learn sentence representations with a graph neural network. Sentence representations are then composed into a document representation for making predictions, where consistent relations between neighboring sentences are sequentially modeled. Results of experiments on two public deepfake datasets show that our approach significantly improves strong base models built with RoBERTa. Model analysis further indicates that our model can distinguish the difference in the factual structure between machine-generated text and human-written text.
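
To make the entity-graph idea in the abstract more concrete, the following is a minimal sketch of how such a graph might be assembled from a document. It is not the authors' implementation. The function names (`split_sentences`, `extract_entities`, `build_entity_graph`) are hypothetical, the capitalized-word heuristic is only a stand-in for a real named entity recognizer, and the two edge types used here (entities co-occurring in one sentence, and the same entity recurring across sentences) are an assumption, since the abstract does not specify how edges are built.

```python
import re
from itertools import combinations


def split_sentences(text):
    # Naive splitter: break on whitespace that follows '.', '!' or '?'.
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def extract_entities(sentence):
    # Stand-in for a real NER tagger (assumption): treat capitalized words
    # that are not the first word of the sentence as entity mentions.
    tokens = re.findall(r"[A-Za-z]+", sentence)
    return [t for t in tokens[1:] if t[0].isupper()]


def build_entity_graph(text):
    # Nodes are (sentence_index, entity) pairs.
    # Edges are (node_a, node_b, edge_type) triples.
    sentences = split_sentences(text)
    nodes = []
    edges = set()
    for i, sentence in enumerate(sentences):
        sentence_nodes = [(i, e) for e in sorted(set(extract_entities(sentence)))]
        nodes.extend(sentence_nodes)
        # Assumed "intra" edges: entities mentioned in the same sentence.
        for a, b in combinations(sentence_nodes, 2):
            edges.add((a, b, "intra"))
    # Assumed "inter" edges: the same entity mentioned in different sentences.
    for a, b in combinations(nodes, 2):
        if a[0] != b[0] and a[1] == b[1]:
            edges.add((a, b, "inter"))
    return nodes, edges


if __name__ == "__main__":
    doc = "Yesterday Alice met Bob in Paris. Later Bob flew to London."
    graph_nodes, graph_edges = build_entity_graph(doc)
    for edge in sorted(graph_edges):
        print(edge)
```

In a full model along the lines the abstract describes, a graph like this would be passed to a graph neural network to produce sentence representations. That step is omitted here because the abstract gives no architectural details.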
