Paper title
DuDe: Dual-Decoder Multilingual ASR for Indian Languages using Common Label Set
Paper authors
Paper abstract
In a multilingual country like India, multilingual Automatic Speech Recognition (ASR) systems have wide applicability. Compared with monolingual ASR systems, multilingual systems offer advantages such as scalability, maintainability, and improved performance. However, building multilingual systems for Indian languages is challenging because different languages are written in different scripts. At the same time, Indian languages share many common sounds. The Common Label Set (CLS) exploits this by mapping graphemes from different languages that have similar sounds to common labels. Since Indian languages are mostly phonetic, building a parser that converts native script to CLS is easy. In this paper, we explore various approaches to building multilingual ASR models. We also propose a novel architecture, called Encoder-Decoder-Decoder, for building multilingual systems that use both CLS and native-script labels. Finally, we analyze the effectiveness of CLS-based multilingual systems combined with machine transliteration.
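The grapheme-to-common-label idea behind CLS can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the table `TOY_CLS`, the label names (`k`, `m`, `l`), and the function `to_cls` are hypothetical and are not taken from the paper or from the actual CLS inventory. The sketch also ignores inherent vowels, vowel signs, and conjuncts, all of which a real native-script parser would have to handle.

```python
# Hypothetical toy table: graphemes with similar sounds in three scripts
# share one label. The real CLS inventory is larger and its labels may differ.
TOY_CLS = {
    "क": "k", "க": "k", "క": "k",  # KA in Devanagari, Tamil, Telugu
    "म": "m", "ம": "m", "మ": "m",  # MA in Devanagari, Tamil, Telugu
    "ल": "l", "ல": "l", "ల": "l",  # LA in Devanagari, Tamil, Telugu
}


def to_cls(text, table=TOY_CLS, unknown="<unk>"):
    """Map each character to its common label.

    Characters missing from the toy table are replaced by `unknown`.
    """
    return [table.get(ch, unknown) for ch in text]


if __name__ == "__main__":
    # The same consonant sequence written in three scripts.
    for word in ("कमल", "கமல", "కమల"):
        print(word, to_cls(word))
```

Under these assumptions, the three inputs yield the same label sequence, which is what would let a single multilingual model share output units across scripts.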