Paper Title

Non-Autoregressive Predictive Coding for Learning Speech Representations from Local Dependencies

Authors

Alexander H. Liu, Yu-An Chung, James Glass

Abstract

Self-supervised speech representations have been shown to be effective in a variety of speech applications. However, existing representation learning methods generally rely on the autoregressive model and/or observed global dependencies while generating the representation. In this work, we propose Non-Autoregressive Predictive Coding (NPC), a self-supervised method, to learn a speech representation in a non-autoregressive manner by relying only on local dependencies of speech. NPC has a conceptually simple objective and can be implemented easily with the introduced Masked Convolution Blocks. NPC offers a significant speedup for inference since it is parallelizable in time and has a fixed inference time for each time step regardless of the input sequence length. We discuss and verify the effectiveness of NPC by theoretically and empirically comparing it with other methods. We show that the NPC representation is comparable to other methods in speech experiments on phonetic and speaker classification while being more efficient.
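The abstract does not spell out how a Masked Convolution Block is built, so the following is only a minimal sketch of the general idea under stated assumptions: a 1-D convolution whose central kernel taps are zeroed, so the output at step t is computed from nearby frames but never from frame t itself. The function name `masked_conv1d`, the `mask_radius` parameter, and the zero padding at the edges are illustrative choices, not the paper's implementation.

```python
def masked_conv1d(x, kernel, mask_radius):
    """Zero-padded 1-D convolution with the central kernel taps masked out.

    x           -- list of floats, one value per time step
    kernel      -- list of floats with odd length
    mask_radius -- taps with |offset| <= mask_radius are skipped, so the
                   output at step t ignores x[t - mask_radius .. t + mask_radius]
    (All names here are hypothetical, chosen for illustration.)
    """
    if len(kernel) % 2 == 0:
        raise ValueError("kernel length must be odd")
    half = len(kernel) // 2
    out = []
    for t in range(len(x)):
        acc = 0.0
        for k, w in enumerate(kernel):
            offset = k - half
            if abs(offset) <= mask_radius:
                continue  # masked tap: frame t (and close neighbors) is hidden
            j = t + offset
            if 0 <= j < len(x):  # positions outside the sequence count as zero
                acc += w * x[j]
        out.append(acc)
    return out


x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = masked_conv1d(x, [1.0] * 5, mask_radius=0)
```

Each output reads a fixed-size window, so the work per time step does not grow with sequence length, and no output depends on another output, so all steps could be computed in parallel. Both properties match the efficiency claims in the abstract, though only for this simplified stand-in.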
