Paper title
IEEE SLT 2021 Alpha-mini Speech Challenge: Open Datasets, Tracks, Rules and Baselines
Authors
Abstract
The IEEE Spoken Language Technology Workshop (SLT) 2021 Alpha-mini Speech Challenge (ASC) is intended to advance research on keyword spotting (KWS) and sound source localization (SSL) on humanoid robots. In recent years, many publications have reported significant improvements in deep-learning-based KWS and SSL on open-source datasets. Training deep learning models requires broad data coverage to make the models robust. A widely adopted approach is therefore to simulate multi-channel noisy and reverberant data from single-channel speech, noise, echo and room impulse responses (RIRs). However, this approach may introduce a mismatch between simulated data and data recorded in real application scenarios, especially for echo data. In this challenge, we open-source a sizable speech, keyword, echo and noise corpus to promote data-driven methods, particularly deep learning approaches, for KWS and SSL. We also use Alpha-mini, a humanoid robot produced by UBTECH with a built-in four-microphone array on its head, to record development and evaluation sets under actual Alpha-mini application scenarios. These recordings include noise as well as echo and mechanical noise generated by the robot itself, and are used for model evaluation. Furthermore, we describe the rules, evaluation methods and baselines so that researchers can quickly assess their results and optimize their models.
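The simulation approach mentioned in the abstract (building multi-channel noisy, reverberant data from single-channel speech, noise and RIRs) can be illustrated with a minimal sketch. This is not the challenge's official pipeline: the function names, the toy two-tap RIRs, the synthetic signals and the 10 dB SNR are all assumptions chosen for illustration, and echo from the robot's own playback is omitted.

```python
import math
import random


def convolve(signal, kernel):
    """Full linear convolution of two 1-D sequences."""
    out = [0.0] * (len(signal) + len(kernel) - 1)
    for i, s in enumerate(signal):
        for j, k in enumerate(kernel):
            out[i + j] += s * k
    return out


def power(x):
    """Mean squared amplitude of a sequence."""
    return sum(v * v for v in x) / len(x)


def simulate_multichannel(speech, rirs, noises, snr_db):
    """Convolve single-channel speech with one RIR per microphone,
    then add per-channel noise scaled to the requested SNR (in dB)."""
    channels = []
    for rir, noise in zip(rirs, noises):
        # Reverberant speech, truncated to the original length.
        reverberant = convolve(speech, rir)[: len(speech)]
        # Scale noise so that 10*log10(P_speech / P_noise) == snr_db.
        gain = math.sqrt(power(reverberant) / (power(noise) * 10 ** (snr_db / 10)))
        channels.append([r + gain * n for r, n in zip(reverberant, noise)])
    return channels


# Toy example: 4 microphones, synthetic signals (all values are illustrative).
random.seed(0)
speech = [math.sin(2 * math.pi * 440 * t / 16000) for t in range(1600)]
rirs = [[1.0] + [0.0] * d + [0.5] for d in range(4)]  # direct path plus one reflection
noises = [[random.gauss(0, 1) for _ in speech] for _ in rirs]
mixture = simulate_multichannel(speech, rirs, noises, snr_db=10)
```

In practice the RIRs would be measured or generated by a room simulator rather than hand-written, which is exactly where the mismatch with real recordings described in the abstract can arise.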