Speaker recognition by means of a combination of linear and nonlinear predictive models

Paper title

Speaker recognition by means of a combination of linear and nonlinear predictive models

Authors

Faundez-Zanuy, Marcos

Abstract

This paper addresses the combination of nonlinear predictive models with the classical LPCC parameterization for speaker recognition. Combining a measure defined over the LPCC coefficients with a measure defined over the residual signal of a predictive analysis improves on the classical method, which considers only the LPCC coefficients. When the residual signal is obtained from a linear prediction analysis, the improvement is 2.63% (the error rate drops from 6.31% to 3.68%); when it is computed with a model based on nonlinear predictive neural nets, the improvement is 3.68%. An efficient algorithm for reducing the computational burden is also proposed.
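The idea of combining a measure over LPCC coefficients with a measure over the prediction residual can be illustrated with a minimal sketch. It is not the paper's algorithm: the function names, the fixed predictor coefficients, the weighted-sum combination rule, and the `weight` parameter are all assumptions made for illustration, and the abstract does not say how the two measures are actually combined.

```python
def prediction_residual(signal, coeffs):
    """Residual e[n] = x[n] - sum_k coeffs[k-1] * x[n-k] of a linear predictor.

    `coeffs` are assumed to be given (e.g. from a prior LPC analysis).
    Only samples with a full history (n >= len(coeffs)) are returned.
    """
    order = len(coeffs)
    residual = []
    for n in range(order, len(signal)):
        predicted = sum(coeffs[k] * signal[n - 1 - k] for k in range(order))
        residual.append(signal[n] - predicted)
    return residual


def combine_and_identify(lpcc_distance, residual_distance, weight=1.0):
    """Pick the speaker with the smallest combined distance.

    Both arguments map speaker id -> distance (smaller is better).
    The weighted sum is a hypothetical combination rule, not the paper's.
    """
    combined = {
        speaker: lpcc_distance[speaker] + weight * residual_distance[speaker]
        for speaker in lpcc_distance
    }
    return min(combined, key=combined.get)


# Example with made-up distances for two hypothetical speakers.
lpcc = {"A": 1.0, "B": 2.0}
resid = {"A": 3.0, "B": 0.5}
best = combine_and_identify(lpcc, resid, weight=1.0)
```

With `weight=0.0` the decision reduces to the LPCC-only baseline, so the weight controls how much the residual measure is allowed to change the result.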
