Open-set Short Utterance Forensic Speaker Verification using Teacher-Student Network with Explicit Inductive Bias

Paper title

Open-set Short Utterance Forensic Speaker Verification using Teacher-Student Network with Explicit Inductive Bias

Authors

Mufan Sang, Wei Xia, John H. L. Hansen

Abstract

In forensic applications, it is very common that only small naturalistic datasets consisting of short utterances in complex or unknown acoustic environments are available. In this study, we propose a pipeline solution to improve speaker verification on a small actual forensic field dataset. By leveraging large-scale out-of-domain datasets, a knowledge-distillation-based objective function is proposed for teacher-student learning and applied to short utterance forensic speaker verification. The objective function jointly considers speaker classification loss, Kullback-Leibler divergence, and similarity of embeddings. To make the trained deep speaker embedding network robust on a small target dataset, we introduce a novel strategy to fine-tune the pre-trained student model towards a forensic target domain, using that model both as the fine-tuning starting point and as the reference in regularization. The proposed approaches are evaluated on the 1st48-UTD forensic corpus, a newly established naturalistic dataset of actual homicide investigations consisting of short utterances recorded in uncontrolled conditions. We show that the proposed objective function can efficiently improve the performance of teacher-student learning on short utterances and that our fine-tuning strategy outperforms the commonly used weight decay method by providing an explicit inductive bias towards the pre-trained model.
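The abstract names the ingredients of the objective and of the fine-tuning regularizer, but not their exact form. The sketch below is only an illustration under stated assumptions: the three terms are combined as a weighted sum, embedding similarity is measured by cosine distance, and the regularizer pulls the weights towards the pre-trained values with a squared L2 penalty (a form often called L2-SP). The function names, the weights `alpha` and `beta`, the `temperature`, and the coefficient `lam` are all hypothetical and are not taken from the paper.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax over a list of logits."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(student_logits, label):
    """Speaker classification loss for one utterance."""
    return -math.log(softmax(student_logits)[label])


def kl_divergence(teacher_logits, student_logits, temperature=1.0):
    """KL(teacher || student) between the two speaker posteriors."""
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def cosine_distance(a, b):
    """1 - cosine similarity between two embeddings (assumed similarity measure)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def distillation_loss(student_logits, teacher_logits, student_emb, teacher_emb,
                      label, alpha=1.0, beta=1.0, temperature=1.0):
    """Assumed weighted sum of the three terms listed in the abstract."""
    return (cross_entropy(student_logits, label)
            + alpha * kl_divergence(teacher_logits, student_logits, temperature)
            + beta * cosine_distance(student_emb, teacher_emb))


def weight_decay_penalty(weights, lam):
    """Standard weight decay: pulls the weights towards zero."""
    return 0.5 * lam * sum(w * w for w in weights)


def pretrained_anchor_penalty(weights, pretrained_weights, lam):
    """Assumed L2-SP-style penalty: pulls the weights towards the pre-trained model."""
    return 0.5 * lam * sum((w - w0) ** 2
                           for w, w0 in zip(weights, pretrained_weights))
```

In this sketch the only difference between the two penalties is the reference point: weight decay shrinks the weights towards zero, while the anchored penalty is zero exactly when the fine-tuned weights equal the pre-trained ones. That is one way to read the "explicit inductive bias towards the pre-trained model" mentioned in the abstract.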
