Paper Title


Optimize what matters: Training DNN-HMM Keyword Spotting Model Using End Metric

Paper Authors

Ashish Shrivastava, Arnav Kundu, Chandra Dhir, Devang Naik, Oncel Tuzel

Paper Abstract


Deep Neural Network--Hidden Markov Model (DNN-HMM) based methods have been successfully used for many always-on keyword spotting algorithms that detect a wake word to trigger a device. The DNN predicts the state probabilities of a given speech frame, while the HMM decoder combines the DNN predictions over multiple speech frames to compute the keyword detection score. In prior methods, the DNN is trained independently of the HMM parameters to minimize the cross-entropy loss between the predicted and the ground-truth state probabilities. The mismatch between the DNN training loss (cross-entropy) and the end metric (detection score) is the main source of suboptimal performance on the keyword spotting task. We address this loss-metric mismatch with a novel end-to-end training strategy that learns the DNN parameters by optimizing for the detection score. To this end, we make the HMM decoder (dynamic programming) differentiable and back-propagate through it to maximize the score for the keyword and minimize the scores for non-keyword speech segments. Our method does not require any change in the model architecture or the inference framework; therefore, there is no overhead in run-time memory or compute requirements. Moreover, we show a significant reduction in the false rejection rate (FRR) at the same false-trigger experience (> 70% over independent DNN training).
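The abstract's core mechanism, back-propagating through the HMM decoder's dynamic program, can be illustrated with a small sketch. The snippet below is not the authors' implementation: it assumes a simple left-to-right keyword HMM with a single self-loop probability and replaces the hard max of Viterbi decoding with a log-sum-exp ("soft max"), which is one common way to make such a decoder differentiable. All names (`soft_keyword_score`, `self_loop`, etc.) are illustrative.

```python
# Minimal sketch (not the paper's released code): a differentiable left-to-right
# HMM keyword decoder. Replacing Viterbi's hard max with log-sum-exp keeps the
# dynamic program smooth, so the keyword detection score can be back-propagated
# into the DNN that emits per-frame state log-probabilities.
import torch

NEG = -1e9  # numerically safe stand-in for log(0)

def soft_keyword_score(log_probs: torch.Tensor, self_loop: float = 0.5) -> torch.Tensor:
    """log_probs: (T, S) DNN log-probabilities for T frames and S keyword
    states ordered left to right. Returns a differentiable scalar score."""
    T, S = log_probs.shape
    log_stay = torch.log(torch.tensor(self_loop))        # self-loop transition
    log_move = torch.log(torch.tensor(1.0 - self_loop))  # advance to next state

    # alpha[s]: soft score of a path ending in state s at the current frame;
    # a keyword path must start in the first state.
    alpha = torch.cat([log_probs[0, :1], torch.full((S - 1,), NEG)])

    for t in range(1, T):
        stay = alpha + log_stay
        move = torch.cat([torch.full((1,), NEG), alpha[:-1] + log_move])
        # log-sum-exp over the incoming transitions keeps the recursion
        # differentiable, unlike the hard max of standard Viterbi decoding.
        alpha = torch.logsumexp(torch.stack([stay, move]), dim=0) + log_probs[t]

    return alpha[-1]  # score for ending in the final keyword state

# Toy usage: gradients flow from the detection score back to the DNN outputs,
# so one can maximize this score on keyword segments and minimize it on
# non-keyword segments (e.g., with a hinge-style loss).
dnn_out = torch.randn(20, 5, requires_grad=True)
score = soft_keyword_score(dnn_out.log_softmax(dim=-1))
score.backward()
print(float(score), dnn_out.grad.shape)
```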
