Enriching Under-Represented Named-Entities To Improve Speech Recognition Performance

Paper Title

Enriching Under-Represented Named-Entities To Improve Speech Recognition Performance

Authors

Mao, Tingzhi; Khassanov, Yerbolat; Pham, Van Tung; Xu, Haihua; Huang, Hao; Wumaier, Aishan; Chng, Eng Siong

Abstract

Automatic speech recognition (ASR) for under-represented named entities (UR-NEs) is challenging because such named entities (NEs) have too few instances and too little contextual coverage in the training data to learn reliable estimates and representations. In this paper, we propose approaches to enriching UR-NEs to improve speech recognition performance. Specifically, our first priority is to ensure that those UR-NEs appear in the word lattice, if there are any. To this end, we make exemplar utterances for those UR-NEs according to their categories (e.g., location, person, organization), yielding an improved language model (LM) that boosts UR-NE occurrence in the word lattice. With more UR-NEs appearing in the lattice, we then boost recognition performance through lattice rescoring methods. We first enrich the representations of UR-NEs in a pre-trained recurrent neural network LM (RNNLM) by borrowing the embedding representations of rich-represented NEs (RR-NEs), yielding lattices that statistically favor the UR-NEs. Finally, we directly boost the likelihood scores of utterances containing UR-NEs and gain a further performance improvement.
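
The three steps named in the abstract (exemplar utterances per category, embedding borrowing, and score boosting) can be illustrated with a small Python sketch. The abstract does not say how the embeddings are borrowed or how the scores are boosted, so everything concrete here is an assumption made for illustration: the `<NE>` template slot, averaging the donor RR-NE vectors, a fixed additive bonus, and rescoring an n-best list instead of a full lattice. All function names are hypothetical and do not come from the paper.

```python
from statistics import fmean


def make_exemplars(templates, ur_nes_by_category):
    """Fill per-category sentence templates with UR-NEs (assumed '<NE>' slot)."""
    utterances = []
    for category, names in ur_nes_by_category.items():
        for template in templates.get(category, []):
            for name in names:
                utterances.append(template.replace("<NE>", name))
    return utterances


def borrow_embedding(embeddings, ur_ne, rr_nes):
    """Return a copy of `embeddings` in which the UR-NE vector is replaced by
    the element-wise mean of the donor RR-NE vectors (an assumed rule)."""
    donors = [embeddings[w] for w in rr_nes if w in embeddings]
    enriched = dict(embeddings)
    if donors:
        enriched[ur_ne] = [fmean(dim) for dim in zip(*donors)]
    return enriched


def boost_and_pick(nbest, ur_nes, bonus=2.0):
    """Pick the best hypothesis from (text, log_score) pairs after adding a
    fixed bonus (an assumed value) to hypotheses that contain a UR-NE."""
    def boosted(hypothesis):
        text, score = hypothesis
        padded = f" {text} "  # pad so that only whole-word matches count
        hit = any(f" {ne} " in padded for ne in ur_nes)
        return score + (bonus if hit else 0.0)
    return max(nbest, key=boosted)[0]


if __name__ == "__main__":
    templates = {"location": ["i live in <NE>"]}
    print(make_exemplars(templates, {"location": ["tengah"]}))

    vectors = {"singapore": [1.0, 0.0], "london": [0.0, 1.0], "tengah": [0.0, 0.0]}
    print(borrow_embedding(vectors, "tengah", ["singapore", "london"])["tengah"])

    nbest = [("i live in ten gah", -10.0), ("i live in tengah", -11.0)]
    print(boost_and_pick(nbest, {"tengah"}))
```

In this toy setting the bonus only changes the outcome when the UR-NE hypothesis is already present in the candidate list, which mirrors the abstract's ordering: first get the UR-NEs into the lattice, then rescore.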
