Deep Multi-task Network for Delay Estimation and Echo Cancellation

Paper Title

Deep Multi-task Network for Delay Estimation and Echo Cancellation

Authors

Yi Zhang, Chengyun Deng, Shiqian Ma, Yongtao Sha, Hui Song

Abstract

Echo path delay (or ref-delay) estimation is a major challenge in acoustic echo cancellation. Different devices may introduce different ref-delays in practice. Ref-delay inconsistency slows the convergence of adaptive filters and also degrades the performance of deep learning models, because they encounter ref-delays that were 'unseen' in the training set. In this paper, a multi-task network is proposed to address both ref-delay estimation and echo cancellation. The proposed architecture consists of two convolutional recurrent networks (CRNNs) that estimate the echo and the enhanced signal separately, and a fully-connected (FC) network that estimates the echo path delay. The echo signal is predicted first and is then combined with the reference signal for delay estimation. Finally, the delay-compensated reference and the microphone signal are used to predict the enhanced target signal. Experimental results suggest that the proposed method estimates the delay reliably and outperforms existing state-of-the-art solutions in inconsistent echo path delay scenarios, in terms of echo return loss enhancement (ERLE) and perceptual evaluation of speech quality (PESQ). Furthermore, a data augmentation method is studied to evaluate model performance with different portions of synthetic data that contain artificially introduced ref-delays.
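
To make two of the terms in the abstract concrete, here is a minimal Python sketch of delay compensation and of the ERLE metric. It is not the authors' implementation: the function names `compensate_delay` and `erle_db` are hypothetical, the delay is assumed to be a known whole number of samples (in the paper it is estimated by the FC network), and ERLE uses the standard definition, the ratio of microphone signal power to residual signal power in dB, measured while only the far end is talking.

```python
import math


def compensate_delay(reference, delay):
    """Shift `reference` later by `delay` samples, zero-padding the start.

    Assumption: `delay` is a non-negative integer number of samples.
    The output has the same length as the input.
    """
    n = len(reference)
    d = max(0, min(delay, n))  # clamp so the output length stays n
    return [0.0] * d + list(reference[:n - d])


def erle_db(mic, residual, eps=1e-12):
    """Echo return loss enhancement in dB.

    ERLE = 10 * log10(power of mic signal / power of residual signal).
    `eps` guards against division by zero and log of zero.
    """
    p_mic = sum(x * x for x in mic) / len(mic)
    p_res = sum(x * x for x in residual) / len(residual)
    return 10.0 * math.log10((p_mic + eps) / (p_res + eps))


if __name__ == "__main__":
    # Toy far-end-only case: the microphone picks up the reference
    # attenuated by 0.5 and delayed by 2 samples.
    reference = [1.0, -1.0, 0.5, -0.5, 0.25, -0.25]
    aligned = compensate_delay(reference, 2)
    mic = [0.5 * x for x in aligned]
    # An imperfect canceller that removes 90% of the echo amplitude.
    residual = [m - 0.9 * m for m in mic]
    print(round(erle_db(mic, residual), 2))
```

In this toy case the residual keeps about 10% of the echo amplitude, so the printed ERLE is close to 20 dB. A higher ERLE means more echo was removed.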
