Self-Supervised Speaker Verification Using Dynamic Loss-Gate and Label Correction

Paper Title

Self-Supervised Speaker Verification Using Dynamic Loss-Gate and Label Correction

Authors

Bing Han, Zhengyang Chen, Yanmin Qian

Abstract

For self-supervised speaker verification, the quality of the pseudo labels sets the upper bound on system performance, because of the large number of unreliable labels. In this work, we propose dynamic loss-gate and label correction (DLG-LC) to alleviate the performance degradation caused by unreliable estimated labels. In DLG, we adopt a Gaussian mixture model (GMM) to dynamically model the loss distribution and use the estimated GMM to distinguish reliable from unreliable labels automatically. In addition, to make better use of the unreliable data instead of dropping it directly, we correct the unreliable labels with model predictions. Moreover, we apply the negative-pairs-free DINO framework in our experiments for further improvement. Compared to the best-known self-supervised speaker verification system, our proposed DLG-LC converges faster and achieves relative improvements of 11.45%, 18.35% and 15.16% on the Vox-O, Vox-E and Vox-H trials of the VoxCeleb1 evaluation dataset.
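The loss-gate idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes a two-component, one-dimensional GMM fitted to per-sample losses with plain EM, treats the low-mean component as the "reliable" one, and replaces each unreliable pseudo label with the model's predicted class. The function names (`fit_two_component_gmm`, `is_reliable`, `gate_and_correct`) and the toy values are hypothetical.

```python
import math


def _density(x, weight, mean, var):
    """Weighted Gaussian density of x under one mixture component."""
    return weight * math.exp(-((x - mean) ** 2) / (2 * var)) / math.sqrt(2 * math.pi * var)


def fit_two_component_gmm(losses, n_iter=50):
    """Fit a 1-D, two-component GMM to the losses with plain EM (illustrative)."""
    lo, hi = min(losses), max(losses)
    means = [lo, hi]  # initialise the two means at the extremes
    var0 = max((hi - lo) ** 2 / 4.0, 1e-6)
    variances = [var0, var0]
    weights = [0.5, 0.5]
    for _ in range(n_iter):
        # E-step: responsibility of each component for each loss value
        resp = []
        for x in losses:
            p = [_density(x, weights[k], means[k], variances[k]) for k in range(2)]
            total = sum(p) or 1e-300  # guard against underflow to zero
            resp.append([pk / total for pk in p])
        # M-step: re-estimate weight, mean and variance of each component
        for k in range(2):
            nk = sum(r[k] for r in resp) or 1e-12
            means[k] = sum(r[k] * x for r, x in zip(resp, losses)) / nk
            spread = sum(r[k] * (x - means[k]) ** 2 for r, x in zip(resp, losses)) / nk
            variances[k] = max(spread, 1e-6)  # variance floor for stability
            weights[k] = nk / len(losses)
    return weights, means, variances


def is_reliable(x, weights, means, variances):
    """Assumed gate: a loss is reliable if the low-mean component explains it better."""
    low = 0 if means[0] <= means[1] else 1
    high = 1 - low
    if x <= means[low]:
        return True
    if x >= means[high]:
        return False
    return (_density(x, weights[low], means[low], variances[low])
            >= _density(x, weights[high], means[high], variances[high]))


def gate_and_correct(losses, pseudo_labels, predictions):
    """Keep reliable pseudo labels; replace unreliable ones with model predictions."""
    params = fit_two_component_gmm(losses)
    return [
        label if is_reliable(x, *params) else pred
        for x, label, pred in zip(losses, pseudo_labels, predictions)
    ]


# Toy data: five low-loss samples and three high-loss samples.
losses = [0.10, 0.12, 0.08, 0.11, 0.09, 2.0, 2.1, 1.9]
pseudo_labels = [0, 0, 1, 1, 2, 2, 0, 1]
predictions = [5, 5, 5, 5, 5, 7, 7, 7]
corrected = gate_and_correct(losses, pseudo_labels, predictions)
```

Because the mixture is refitted from the current losses each time `gate_and_correct` is called, the boundary between reliable and unreliable samples moves as training progresses, which is the sense in which such a gate is dynamic. The exact thresholding rule and correction target used in the paper may differ from this sketch.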
