Paper Title

A Brief Overview of Unsupervised Neural Speech Representation Learning

Authors

Lasse Borgholt, Jakob Drachmann Havtorn, Joakim Edin, Lars Maaløe, Christian Igel

Abstract

Unsupervised representation learning for speech processing has matured greatly in the last few years. Work in computer vision and natural language processing has paved the way, but speech data offers unique challenges. As a result, methods from other domains rarely translate directly. We review the development of unsupervised representation learning for speech over the last decade. We identify two primary model categories: self-supervised methods and probabilistic latent variable models. We describe the models and develop a comprehensive taxonomy. Finally, we discuss and compare models from the two categories.
