Accelerating RNN Transducer Inference via One-Step Constrained Beam Search

Paper title

Accelerating RNN Transducer Inference via One-Step Constrained Beam Search

Authors

Juntae Kim, Yoonhan Lee

Abstract

We propose a one-step constrained (OSC) beam search to accelerate recurrent neural network (RNN) transducer (RNN-T) inference. The original RNN-T beam search contains a while-loop that slows down the decoding process. The OSC beam search eliminates this while-loop by vectorizing multiple hypotheses. This vectorization is nontrivial because, in the original RNN-T beam search, the hypotheses can be expanded a different number of times from one another. However, we found that in most cases a hypothesis is expanded only once at each decoding step; thus, we constrain the maximum number of expansions to one, which allows the hypotheses to be vectorized. For further acceleration, we assign constraints to the prefixes of the hypotheses to prune the redundant search space. In addition, OSC beam search checks for duplicate hypotheses during decoding, as duplicates can undesirably shrink the search space. We achieved a significant speedup over other RNN-T beam search methods, along with lower phoneme and word error rates.
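
The one-expansion constraint and the duplication check described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the names `osc_step`, `log_probs_fn`, and `BLANK` are hypothetical, and plain Python loops stand in for the batched tensor operations a real decoder would use. It also omits the prefix constraints. Merging duplicates by log-sum-exp is an assumption, since the abstract does not say how duplicates are resolved.

```python
import math
from typing import Callable, Dict, Sequence, Tuple

BLANK = 0  # assumed index of the blank label (hypothetical choice)

Prefix = Tuple[int, ...]


def logaddexp(a: float, b: float) -> float:
    """Numerically stable log(exp(a) + exp(b))."""
    if a == -math.inf:
        return b
    if b == -math.inf:
        return a
    m = max(a, b)
    return m + math.log(math.exp(a - m) + math.exp(b - m))


def osc_step(
    beam: Dict[Prefix, float],
    log_probs_fn: Callable[[Prefix], Sequence[float]],
    beam_size: int,
) -> Dict[Prefix, float]:
    """Advance every hypothesis by one frame with at most one expansion.

    Each hypothesis either emits blank (prefix unchanged) or is extended
    by exactly one non-blank label. No inner while-loop is needed because
    all hypotheses take the same number of steps per frame.
    """
    candidates: Dict[Prefix, float] = {}
    for prefix, score in beam.items():
        # log_probs_fn stands in for the joint network output at this frame.
        for label, lp in enumerate(log_probs_fn(prefix)):
            new_prefix = prefix if label == BLANK else prefix + (label,)
            new_score = score + lp
            # Duplication check: identical prefixes are merged (assumed
            # log-sum-exp) so they do not occupy several beam slots.
            if new_prefix in candidates:
                candidates[new_prefix] = logaddexp(candidates[new_prefix], new_score)
            else:
                candidates[new_prefix] = new_score
    # Keep the beam_size best candidates.
    best = sorted(candidates.items(), key=lambda kv: kv[1], reverse=True)
    return dict(best[:beam_size])
```

Calling `osc_step` once per encoder frame, with `log_probs_fn` returning a log-distribution over the vocabulary including blank, gives a decoding loop whose per-frame cost is fixed. That fixed cost is what makes batching across hypotheses straightforward.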
