Anti-Spoofing Using Transfer Learning with Variational Information Bottleneck

Paper Title

Anti-Spoofing Using Transfer Learning with Variational Information Bottleneck

Authors

Youngsik Eom, Yeonghyeon Lee, Ji Sub Um, Hoirin Kim

Abstract

Recent advances in sophisticated synthetic speech generated by text-to-speech (TTS) and voice conversion (VC) systems threaten existing automatic speaker verification (ASV) systems. Because such synthetic speech is produced by diverse algorithms, a robust anti-spoofing system must generalize well from limited training data. In this work, we propose a transfer learning scheme based on the wav2vec 2.0 pretrained model with a variational information bottleneck (VIB) for the speech anti-spoofing task. Evaluation on the ASVspoof 2019 logical access (LA) database shows that our method improves performance in distinguishing unseen spoofed speech from genuine speech, outperforming current state-of-the-art anti-spoofing systems. Furthermore, we show that the proposed system significantly improves performance in low-resource and cross-dataset settings of the anti-spoofing task, demonstrating that it is also robust with respect to data size and data distribution.
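
The abstract does not spell out how the VIB is implemented. As a rough illustration only, the sketch below shows the commonly used VIB formulation: a Gaussian bottleneck sampled with the reparameterization trick, regularized by a KL term against a standard normal prior. The function names, the choice of prior, and the `beta` value are assumptions made for this example and are not taken from the paper; in a real system `mu` and `log_var` would presumably be predicted by a small layer on top of wav2vec 2.0 features.

```python
import math
import random


def vib_sample(mu, log_var, rng):
    """Draw z = mu + sigma * eps with eps ~ N(0, 1) (reparameterization trick)."""
    return [m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0)
            for m, lv in zip(mu, log_var)]


def vib_kl(mu, log_var):
    """KL(N(mu, diag(exp(log_var))) || N(0, I)), summed over dimensions."""
    return 0.5 * sum(math.exp(lv) + m * m - 1.0 - lv
                     for m, lv in zip(mu, log_var))


def vib_loss(task_loss, mu, log_var, beta=0.1):
    """Task loss plus the beta-weighted KL penalty (beta is illustrative)."""
    return task_loss + beta * vib_kl(mu, log_var)


if __name__ == "__main__":
    rng = random.Random(0)
    mu, log_var = [0.5, -0.2], [-1.0, 0.3]  # hypothetical encoder outputs
    z = vib_sample(mu, log_var, rng)        # stochastic bottleneck code
    print(len(z), vib_loss(0.7, mu, log_var))
```

A larger `beta` compresses the bottleneck more aggressively, trading task accuracy for a representation that carries less input-specific information, which is the usual motivation for using a VIB when training data is limited.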
