The Vicomtech Audio Deepfake Detection System based on Wav2Vec2 for the 2022 ADD Challenge

Paper title

The Vicomtech Audio Deepfake Detection System based on Wav2Vec2 for the 2022 ADD Challenge

Authors

Martín-Doñas, Juan M., Álvarez, Aitor

Abstract

This paper describes the systems we submitted to tracks 1 and 2 of the 2022 ADD challenge. Our approach combines a pre-trained wav2vec2 feature extractor with a downstream classifier to detect spoofed audio. The method exploits the contextualized speech representations at the different transformer layers to fully capture discriminative information. Furthermore, the classification model is adapted to the application scenario using different data augmentation techniques. We evaluate our audio synthesis detection system on both the ASVspoof 2021 and the 2022 ADD challenges, showing its robustness and good performance in realistic, challenging conditions such as telephony and audio codec systems, noisy audio, and partial deepfakes.
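As a rough illustration of what "exploiting the representations at the different transformer layers" could look like, the sketch below combines per-layer features with softmax-normalized weights, mean-pools over time, and applies a logistic scorer. This is an assumption-laden toy, not the authors' implementation: the abstract does not specify how layers are combined or what the classifier is, and every name here (`combine_layers`, `mean_pool`, `spoof_probability`, the toy dimensions and coefficients) is hypothetical. Random numbers stand in for real wav2vec2 hidden states.

```python
import math
import random


def softmax(scores):
    """Normalize raw layer scores into positive weights that sum to 1."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def combine_layers(layer_feats, layer_scores):
    """Weighted sum over layers.

    layer_feats has shape [num_layers][num_frames][dim];
    the result has shape [num_frames][dim].
    """
    weights = softmax(layer_scores)
    num_frames = len(layer_feats[0])
    dim = len(layer_feats[0][0])
    return [
        [
            sum(w * layer[t][d] for w, layer in zip(weights, layer_feats))
            for d in range(dim)
        ]
        for t in range(num_frames)
    ]


def mean_pool(frames):
    """Average frame vectors into a single utterance-level vector."""
    n = len(frames)
    return [sum(f[d] for f in frames) / n for d in range(len(frames[0]))]


def spoof_probability(utt_vec, coef, bias):
    """Toy logistic scorer standing in for the downstream classifier."""
    z = sum(c * x for c, x in zip(coef, utt_vec)) + bias
    return 1.0 / (1.0 + math.exp(-z))


# Toy stand-in for transformer hidden states (assumption: 4 layers, 5 frames, 3 dims).
rng = random.Random(0)
NUM_LAYERS, NUM_FRAMES, DIM = 4, 5, 3
layer_feats = [
    [[rng.uniform(-1.0, 1.0) for _ in range(DIM)] for _ in range(NUM_FRAMES)]
    for _ in range(NUM_LAYERS)
]

layer_scores = [0.0] * NUM_LAYERS   # equal scores give uniform layer weights
coef, bias = [0.5, -0.25, 0.1], 0.0  # hypothetical, untrained parameters

combined = combine_layers(layer_feats, layer_scores)
utt_vec = mean_pool(combined)
prob = spoof_probability(utt_vec, coef, bias)
print(f"spoof probability (toy, untrained): {prob:.3f}")
```

In a trained system the layer scores and classifier parameters would be learned jointly on labeled bona fide and spoofed audio; here they are fixed only to keep the example self-contained.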
