Unsupervised Domain Adaptation for Speech Recognition via Uncertainty Driven Self-Training

Paper Title

Unsupervised Domain Adaptation for Speech Recognition via Uncertainty Driven Self-Training

Authors

Sameer Khurana, Niko Moritz, Takaaki Hori, Jonathan Le Roux

Abstract

The performance of automatic speech recognition (ASR) systems typically degrades significantly when the training and test data domains are mismatched. In this paper, we show that self-training (ST) combined with an uncertainty-based pseudo-label filtering approach can be effectively used for domain adaptation. We propose DUST, a dropout-based uncertainty-driven self-training technique which uses agreement between multiple predictions of an ASR system obtained for different dropout settings to measure the model's uncertainty about its prediction. DUST excludes pseudo-labeled data with high uncertainties from the training, which leads to substantially improved ASR results compared to ST without filtering, and accelerates the training time due to a reduced training data set. Domain adaptation experiments using WSJ as a source domain and TED-LIUM 3 as well as SWITCHBOARD as the target domains show that up to 80% of the performance of a system trained on ground-truth data can be recovered.
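
The filtering step described in the abstract can be illustrated with a minimal sketch. It rests on several assumptions that the abstract does not state: agreement is measured as word-level edit distance between a reference hypothesis (decoded without dropout) and each hypothesis sampled with dropout, normalized by the reference length, and an utterance is kept only if every sampled hypothesis stays within a threshold. The function names, the tuple layout, and the threshold value 0.3 are all hypothetical, and the hypotheses are passed in as plain strings rather than produced by an actual ASR model.

```python
from typing import List, Sequence, Tuple


def edit_distance(ref: Sequence[str], hyp: Sequence[str]) -> int:
    """Word-level Levenshtein distance between two token sequences."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def is_confident(reference: str, sampled: List[str], threshold: float = 0.3) -> bool:
    """Return True if every dropout-sampled hypothesis agrees with the reference.

    Assumption: disagreement is the edit distance normalized by the reference
    length, and the threshold value is an illustrative placeholder.
    """
    ref_tokens = reference.split()
    if not ref_tokens:
        return False  # an empty pseudo-label carries no training signal
    for hyp in sampled:
        if edit_distance(ref_tokens, hyp.split()) / len(ref_tokens) > threshold:
            return False
    return True


def filter_pseudo_labels(
    candidates: List[Tuple[str, str, List[str]]], threshold: float = 0.3
) -> List[Tuple[str, str]]:
    """Keep (utterance_id, pseudo_label) pairs whose predictions are stable."""
    return [
        (utt_id, reference)
        for utt_id, reference, sampled in candidates
        if is_confident(reference, sampled, threshold)
    ]


# Hypothetical example data: (utterance id, reference hypothesis, sampled hypotheses)
candidates = [
    ("utt1", "the cat sat on the mat",
     ["the cat sat on the mat", "the cat sat on a mat"]),
    ("utt2", "hello world", ["yellow word", "hello world"]),
]
kept = filter_pseudo_labels(candidates)
```

Here `utt1` is retained because its sampled hypotheses differ from the reference by at most one word in six, while `utt2` is discarded because one sampled hypothesis disagrees on both words. The retained pairs would then serve as the pseudo-labeled training set for the next self-training round.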
