Voice Separation with an Unknown Number of Multiple Speakers

Paper title

Voice Separation with an Unknown Number of Multiple Speakers

Authors

Eliya Nachmani, Yossi Adi, Lior Wolf

Abstract

We present a new method for separating a mixed audio sequence, in which multiple voices speak simultaneously. The new method employs gated neural networks that are trained to separate the voices at multiple processing steps, while maintaining the speaker in each output channel fixed. A different model is trained for every number of possible speakers, and the model with the largest number of speakers is employed to select the actual number of speakers in a given sample. Our method greatly outperforms the current state of the art, which, as we show, is not competitive for more than two speakers.
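
The abstract's selection step (one model per speaker count, with the largest model used to pick the actual count) can be illustrated with a minimal sketch. Everything below is an assumption for illustration only: the `models` mapping, the energy-based silence test, the -40 dB threshold, and the function names are hypothetical and are not taken from the paper, which does not specify these details in the abstract.

```python
import math
from typing import Callable, Dict, List, Sequence, Tuple

Signal = Sequence[float]
# Hypothetical interface: a separator maps a mixture to a list of output channels.
Separator = Callable[[Signal], List[List[float]]]


def energy_db(channel: Signal) -> float:
    """Mean power of a channel in decibels, floored to avoid log(0)."""
    if not channel:
        return -math.inf
    power = sum(x * x for x in channel) / len(channel)
    return 10.0 * math.log10(max(power, 1e-12))


def count_active(channels: List[List[float]], silence_db: float = -40.0) -> int:
    """Count output channels whose energy exceeds the (assumed) silence threshold."""
    return sum(1 for ch in channels if energy_db(ch) > silence_db)


def separate(
    mixture: Signal,
    models: Dict[int, Separator],
    silence_db: float = -40.0,
) -> Tuple[int, List[List[float]]]:
    """Estimate the speaker count with the largest model, then separate.

    Assumption: a channel with no speaker assigned comes out near-silent,
    so the number of non-silent channels approximates the speaker count.
    """
    largest = max(models)
    n_active = count_active(models[largest](mixture), silence_db)
    # Pick the trained model whose speaker count is closest to the estimate.
    chosen = min(sorted(models), key=lambda k: (abs(k - n_active), k))
    return chosen, models[chosen](mixture)


# Toy stand-ins for trained separators (not real networks).
def toy_two(m: Signal) -> List[List[float]]:
    return [[0.5 * x for x in m], [0.5 * x for x in m]]


def toy_three(m: Signal) -> List[List[float]]:
    return [[0.6 * x for x in m], [0.4 * x for x in m], [0.0 for _ in m]]


mixture = [0.5, -0.5, 0.5, -0.5]
n_speakers, channels = separate(mixture, {2: toy_two, 3: toy_three})
```

Here the toy three-speaker model leaves one channel silent, so the sketch falls back to the two-speaker model. A real system would likely replace the energy threshold with a proper speech activity detector.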

Paper title