Paper Title


HuBERT-EE: Early Exiting HuBERT for Efficient Speech Recognition

Authors

Ji Won Yoon, Beom Jun Woo, Nam Soo Kim

Abstract


Pre-training with self-supervised models, such as Hidden-unit BERT (HuBERT) and wav2vec 2.0, has brought significant improvements in automatic speech recognition (ASR). However, these models usually incur a high computational cost to achieve outstanding performance, slowing down inference. To improve model efficiency, we introduce an early exit scheme for ASR, namely HuBERT-EE, that allows the model to stop inference dynamically. In HuBERT-EE, multiple early exit branches are added at the intermediate layers. When the intermediate prediction of an early exit branch is confident, the model stops inference and the corresponding result is returned early. We investigate the proper early exiting criterion and fine-tuning strategy to perform early exiting effectively. Experimental results on LibriSpeech show that HuBERT-EE can accelerate HuBERT's inference while balancing the trade-off between performance and latency.
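To make the mechanism concrete, below is a minimal PyTorch sketch of confidence-based early exiting over a stack of encoder layers. It is an illustration under assumptions, not the paper's implementation: the CTC-style linear head, the per-frame max-posterior confidence measure, the 0.95 threshold, and the names `EarlyExitBranch` and `early_exit_inference` are all hypothetical, and the sketch attaches a branch to every layer purely for simplicity.

```python
import torch
import torch.nn as nn


class EarlyExitBranch(nn.Module):
    """A lightweight CTC-style head attached to an intermediate encoder layer."""

    def __init__(self, hidden_dim: int, vocab_size: int):
        super().__init__()
        self.head = nn.Linear(hidden_dim, vocab_size)

    def forward(self, hidden_states: torch.Tensor) -> torch.Tensor:
        # Per-frame log-posteriors over the vocabulary, shape (batch, time, vocab).
        return self.head(hidden_states).log_softmax(dim=-1)


def confidence(log_probs: torch.Tensor) -> float:
    # One possible exiting criterion: the average per-frame maximum posterior.
    # Higher values mean the branch is more certain about its prediction.
    return log_probs.exp().max(dim=-1).values.mean().item()


@torch.no_grad()
def early_exit_inference(encoder_layers, branches, features, threshold=0.95):
    """Run encoder layers one at a time; return the first confident prediction."""
    hidden = features
    log_probs = None
    for layer, branch in zip(encoder_layers, branches):
        hidden = layer(hidden)
        log_probs = branch(hidden)
        if confidence(log_probs) >= threshold:
            return log_probs  # confident enough: skip the remaining layers
    return log_probs  # fell through: the deepest branch's output is used


# Toy usage with stand-in Transformer layers (dimensions are illustrative).
dim, vocab = 768, 32
layers = nn.ModuleList(
    nn.TransformerEncoderLayer(dim, nhead=8, batch_first=True) for _ in range(12)
)
branches = nn.ModuleList(EarlyExitBranch(dim, vocab) for _ in range(12))
features = torch.randn(1, 100, dim)  # (batch, frames, feature_dim)
prediction = early_exit_inference(layers, branches, features)
```

In a real system the branches would presumably be placed only at selected intermediate layers, and the confidence threshold tuned on held-out data, since it directly controls the trade-off between word error rate and latency that the abstract describes.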
