A Hierarchical Subspace Model for Language-Attuned Acoustic Unit Discovery

Paper Title

A Hierarchical Subspace Model for Language-Attuned Acoustic Unit Discovery

Authors

Bolaji Yusuf, Lucas Ondel, Lukas Burget, Jan Cernocky, Murat Saraclar

Abstract

In this work, we propose a hierarchical subspace model for acoustic unit discovery. In this approach, we frame the task as one of learning embeddings on a low-dimensional phonetic subspace, and simultaneously specify the subspace itself as an embedding on a hyper-subspace. We train the hyper-subspace on a set of transcribed languages and transfer it to the target language. In the target language, we infer both the language and unit embeddings in an unsupervised manner, and in so doing, we simultaneously learn a subspace of units specific to that language and the units that dwell on it. We conduct our experiments on TIMIT and two low-resource languages: Mboshi and Yoruba. Results show that our model outperforms major acoustic unit discovery techniques, both in terms of clustering quality and segmentation accuracy.
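
The two-level structure described in the abstract can be illustrated with a small numerical sketch. This is not the authors' implementation: it assumes, purely for illustration, that each level is an affine map, so that a language embedding selects a language-specific subspace as a weighted combination of shared bases, and a unit embedding selects a point on that subspace. All names, dimensions, and the random stand-in values are hypothetical; the actual parameterization and the inference procedure are given in the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes, chosen only for illustration.
PARAM_DIM = 12  # size of one unit's acoustic-model parameter vector
UNIT_DIM = 4    # dimension of a unit embedding (the phonetic subspace)
LANG_DIM = 3    # dimension of a language embedding (the hyper-subspace)

# Hyper-subspace: shared bases that would be trained on transcribed languages.
# Random values stand in for trained ones (assumption).
basis_W = rng.normal(size=(LANG_DIM, PARAM_DIM, UNIT_DIM))
basis_b = rng.normal(size=(LANG_DIM, PARAM_DIM))


def language_subspace(lang_embedding):
    """Map a language embedding to that language's affine subspace (W, b)."""
    # Weighted sum of the shared bases; result has shape (PARAM_DIM, UNIT_DIM).
    W = np.tensordot(lang_embedding, basis_W, axes=1)
    # Weighted sum of the shared offsets; result has shape (PARAM_DIM,).
    b = lang_embedding @ basis_b
    return W, b


def unit_parameters(lang_embedding, unit_embedding):
    """Map a unit embedding to parameters through the language's subspace."""
    W, b = language_subspace(lang_embedding)
    return W @ unit_embedding + b
```

In this sketch, transferring to a target language would mean keeping `basis_W` and `basis_b` fixed and estimating only the language embedding and the unit embeddings from untranscribed speech; the estimation itself is not shown here.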
