Paper Title
A Convolutional LSTM based Residual Network for Deepfake Video Detection
Paper Authors
Abstract
In recent years, deep learning-based video manipulation methods have become widely accessible to the masses. With little to no effort, people can easily learn how to generate deepfake videos from only a few images of a victim or target. This creates a significant social problem for everyone whose photos are publicly available on the Internet, especially on social media websites. Several deep learning-based detection methods have been developed to identify these deepfakes. However, these methods lack generalizability, because they perform well only for a specific type of deepfake method; therefore, they do not transfer to detecting other deepfake methods. They also fail to take advantage of the temporal information in the video. In this paper, we address these limitations. We develop a Convolutional LSTM based Residual Network (CLRNet), which takes a sequence of consecutive frames from a video as input to learn the temporal information that helps detect the unnatural-looking artifacts present between frames of deepfake videos. We also propose a transfer learning-based approach to generalize across different deepfake methods. Through rigorous experiments on the FaceForensics++ dataset, we show that our method outperforms five previously proposed state-of-the-art deepfake detection methods by better generalizing to different deepfake methods using the same model.
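The abstract does not give the CLRNet architecture itself, but its core building block is a convolutional LSTM that processes a sequence of consecutive frames. As a hypothetical illustration (not the authors' implementation), the following is a minimal ConvLSTM cell in NumPy, following Shi et al.'s standard formulation with peephole terms omitted; all shapes, names, and initializations are illustrative:

```python
import numpy as np

def conv2d_same(x, w):
    """Naive 'same'-padded 2-D cross-correlation (conv without kernel flip).
    x: (C_in, H, W), w: (C_out, C_in, k, k) with odd kernel size k."""
    c_out, c_in, k, _ = w.shape
    pad = k // 2
    xp = np.pad(x, ((0, 0), (pad, pad), (pad, pad)))
    H, W = x.shape[1:]
    out = np.zeros((c_out, H, W))
    for o in range(c_out):
        for i in range(c_in):
            for di in range(k):
                for dj in range(k):
                    out[o] += w[o, i, di, dj] * xp[i, di:di + H, dj:dj + W]
    return out

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

class ConvLSTMCell:
    """Minimal ConvLSTM cell: gates are computed by convolutions over the
    input frame and the previous hidden state, so the recurrence preserves
    spatial structure while accumulating temporal information."""
    def __init__(self, c_in, c_hidden, k=3, seed=0):
        rng = np.random.default_rng(seed)
        self.c_hidden = c_hidden
        # one stacked kernel each for input->gates and hidden->gates (i, f, g, o)
        self.w_x = rng.normal(0, 0.1, (4 * c_hidden, c_in, k, k))
        self.w_h = rng.normal(0, 0.1, (4 * c_hidden, c_hidden, k, k))
        self.b = np.zeros((4 * c_hidden, 1, 1))

    def step(self, x, h, c):
        z = conv2d_same(x, self.w_x) + conv2d_same(h, self.w_h) + self.b
        n = self.c_hidden
        i = sigmoid(z[0 * n:1 * n])   # input gate
        f = sigmoid(z[1 * n:2 * n])   # forget gate
        g = np.tanh(z[2 * n:3 * n])   # candidate cell update
        o = sigmoid(z[3 * n:4 * n])   # output gate
        c_new = f * c + i * g
        h_new = o * np.tanh(c_new)
        return h_new, c_new

# Run a short sequence of (channels, height, width) frames through the cell;
# the hidden state h carries inter-frame (temporal) information forward.
T, c_in, c_hid, H, W = 5, 3, 4, 8, 8
cell = ConvLSTMCell(c_in, c_hid)
h = np.zeros((c_hid, H, W))
c = np.zeros((c_hid, H, W))
frames = np.random.default_rng(1).normal(size=(T, c_in, H, W))
for t in range(T):
    h, c = cell.step(frames[t], h, c)
print(h.shape)  # -> (4, 8, 8)
```

In a detector along the lines described, such cells would sit inside residual blocks and the final hidden state would feed a real/fake classification head; the sketch above shows only the recurrent convolution that lets frame-to-frame artifacts influence the representation.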