Paper Title

Taylor, Can You Hear Me Now? A Taylor-Unfolding Framework for Monaural Speech Enhancement

Authors

Li, Andong; You, Shan; Yu, Guochen; Zheng, Chengshi; Li, Xiaodong

Abstract

While deep learning techniques have driven rapid progress in speech enhancement (SE), most schemes pursue performance in a black-box manner and lack adequate model interpretability. Inspired by Taylor's approximation theory, we propose an interpretable decoupling-style SE framework that disentangles complex spectrum recovery into two separate optimization problems, i.e., magnitude estimation and complex residual estimation. Specifically, a filter network serves as the 0th-order term of the Taylor series; it is devised to suppress the noise component in the magnitude domain only and to obtain a coarse spectrum. To refine the phase distribution, we estimate the sparse complex residual, which is defined as the difference between the target and coarse spectra and measures the phase gap. In this study, we formulate the residual component as a combination of high-order Taylor terms and propose a lightweight trainable module to replace the complicated derivative operator between adjacent terms. Finally, following Taylor's formula, we reconstruct the target spectrum by superimposing the 0th-order and high-order terms. Experimental results on two benchmark datasets show that our framework achieves state-of-the-art performance over previous competing baselines across various evaluation metrics. The source code is available at github.com/Andong-Lispeech/TaylorSENet.
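
The structure described in the abstract (a magnitude-only 0th-order term plus a superimposed sum of high-order terms) can be illustrated with a small sketch. This is not the authors' implementation: the names `zeroth_order_term`, `taylor_unfold`, and `surrogate` are hypothetical, the real-valued gain stands in for the output of the filter network, the `surrogate` callable stands in for the lightweight trainable module, and the 1/n! weighting is an assumption borrowed from the generic Taylor formula rather than from the paper.

```python
import cmath
import math


def zeroth_order_term(noisy_spec, gain):
    """Coarse spectrum: scale each bin's magnitude by a real gain, keep the noisy phase.

    `gain` is a hypothetical stand-in for the filter network's output.
    """
    return [cmath.rect(g * abs(x), cmath.phase(x)) for x, g in zip(noisy_spec, gain)]


def taylor_unfold(noisy_spec, gain, surrogate, order):
    """Sum the 0th-order term and `order` high-order terms.

    `surrogate(term, noisy_spec)` is a hypothetical placeholder for the trainable
    module that replaces the derivative operator between adjacent terms.
    The 1/n! weight is an assumption taken from the generic Taylor formula.
    """
    coarse = zeroth_order_term(noisy_spec, gain)
    estimate = list(coarse)
    term = list(coarse)
    for n in range(1, order + 1):
        # Derive the n-th term from the (n-1)-th term.
        term = surrogate(term, noisy_spec)
        # Superimpose the weighted term on the running estimate.
        estimate = [e + t / math.factorial(n) for e, t in zip(estimate, term)]
    return coarse, estimate


if __name__ == "__main__":
    # Toy surrogate that simply halves the previous term (illustration only).
    halve = lambda term, noisy: [0.5 * t for t in term]
    coarse, estimate = taylor_unfold([2 + 0j, 1j], [0.5, 0.5], halve, order=2)
    print(coarse, estimate)
```

With `order=0` the estimate equals the coarse spectrum, so the noisy phase is left untouched; any phase correction comes only from the complex-valued high-order terms, which mirrors the decoupling the abstract describes.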
