Paper title
Parametric Representation for Singing Voice Synthesis: a Comparative Evaluation
Paper authors
Abstract
Various parametric representations have been proposed to model the speech signal. While the performance of such vocoders is well known in the context of speech processing, their extrapolation to singing voice synthesis might not be straightforward. The goal of this paper is twofold. First, a comparative subjective evaluation is performed across four existing techniques suitable for statistical parametric synthesis: the traditional pulse vocoder, the Deterministic plus Stochastic Model, the Harmonic plus Noise Model, and GlottHMM. The behavior of these techniques is studied as a function of the singer type (baritone, counter-tenor, and soprano). Second, the artifacts occurring in high-pitched voices are discussed, and possible approaches to overcome them are suggested.