Unsupervised Speaker Diarization that is Agnostic to Language, Overlap-Aware, and Tuning Free

Paper title

Unsupervised Speaker Diarization that is Agnostic to Language, Overlap-Aware, and Tuning Free

Authors

M. Iftekhar Tanveer, Diego Casabuena, Jussi Karlgren, Rosie Jones

Abstract

Podcasts are conversational in nature and speaker changes are frequent -- requiring speaker diarization for content understanding. We propose an unsupervised technique for speaker diarization without relying on language-specific components. The algorithm is overlap-aware and does not require information about the number of speakers. Our approach shows 79% improvement on purity scores (34% on F-score) against the Google Cloud Platform solution on podcast data.
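The abstract reports purity and F-score but does not say how they are computed. As a rough illustration only, the sketch below assumes a common frame-level definition: purity is the fraction of frames whose true speaker is the majority speaker of their predicted cluster, coverage is the same measure with the roles swapped, and the F-score is their harmonic mean. The function names and the toy labels are hypothetical, and the sketch gives each frame a single label, so it does not model overlapping speech.

```python
from collections import Counter, defaultdict


def purity(reference, hypothesis):
    """Fraction of frames whose reference speaker is the majority
    speaker inside their hypothesized cluster (assumed definition)."""
    if len(reference) != len(hypothesis):
        raise ValueError("label sequences must have the same length")
    if not reference:
        return 0.0
    # Count reference speakers inside each hypothesized cluster.
    by_cluster = defaultdict(Counter)
    for ref, hyp in zip(reference, hypothesis):
        by_cluster[hyp][ref] += 1
    majority_frames = sum(max(c.values()) for c in by_cluster.values())
    return majority_frames / len(reference)


def coverage(reference, hypothesis):
    """Purity with the roles of reference and hypothesis swapped."""
    return purity(hypothesis, reference)


def f_score(reference, hypothesis):
    """Harmonic mean of purity and coverage."""
    p = purity(reference, hypothesis)
    c = coverage(reference, hypothesis)
    return 0.0 if p + c == 0 else 2 * p * c / (p + c)


# Toy example: six frames, two true speakers, two predicted clusters.
reference = ["A", "A", "A", "B", "B", "B"]
hypothesis = [0, 0, 1, 1, 1, 1]
print(purity(reference, hypothesis))
print(coverage(reference, hypothesis))
print(f_score(reference, hypothesis))
```

In the toy example, cluster 0 contains only speaker A and cluster 1 contains one frame of A and three of B, so five of the six frames match their cluster's majority speaker.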
