Research and Implementation of Formant Speech Synthesis Algorithms

	Research and Implementation of Formant Speech Synthesis Algorithms
Alternative title	Algorithm and Implementation for Formant Speech Synthesis
	Zhao Liang (赵亮)
	2005-05-01
Degree type	Master of Engineering
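Abstract (translated from Chinese)	The prevailing speech synthesis systems use large-corpus waveform concatenation to obtain high-quality synthetic speech. Research on the concatenation algorithms, prosody models, and speech database construction built on this technique has attracted wide attention, and many excellent synthesis systems have been deployed commercially. However, this approach requires a large-scale speech database, which limits its use on small devices such as PDAs and mobile phones. By contrast, the traditional parametric formant synthesis model needs only a small parameter database and allows parameters to be modified directly in the frequency domain. The flexibility of its parameter control also means that speech in different speaker styles can be synthesized from a relatively small parameter database. This thesis therefore carries out research on formant synthesis. The work covers the following aspects: 1° The author studied and compared the characteristics of various voice source parameter models and, with the reliability and accuracy of the extraction algorithm as the criterion, chose the KLGLOTT88 model [1] as the source parameter model. The model parameters are obtained by minimizing the error between the voice source of natural speech and the source model, which turns the problem into a constrained convex optimization problem. An algorithm that automatically extracts source parameters from natural speech was completed. Experiments confirm that the algorithm extracts the model parameters effectively and provides a reliable source excitation for the subsequent formant synthesis. 2° Because the KLATT synthesizer has so many control parameters that setting them sensibly when synthesizing natural speech becomes very difficult, the author designed a tool that assists in generating and controlling the model parameters. 3° The advantages and disadvantages of formant synthesis are analyzed in detail, in particular the difficulty of synthesizing "light" sounds (轻音), and a scheme that mixes waveform concatenation with formant synthesis to synthesize an utterance is proposed. Listening tests confirm that this method works well. In summary, this thesis makes effective attempts and improvements in implementing source parameter extraction for parametric speech synthesis.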
English abstract	Speech synthesis is currently dominated by concatenative synthesis using large speech waveform corpora, and the associated concatenation algorithms, prosody models, and speech database construction techniques are widely studied. Many good synthesizers have been put into commercial practice. However, this technology needs a large speech database, which limits its application on PDAs, mobile phones, and similar devices. Traditional parametric synthesis methods such as formant-based synthesis can change the parameters directly in the spectral domain; meanwhile, the flexibility of modifying speech parameters means that formant synthesis needs only a small parameter database to synthesize different speaker styles. This thesis therefore focuses on formant synthesis. The main contributions of this thesis are as follows: 1° We compared two vocal source parametric models; in order to achieve a reliable and accurate algorithm, we chose the KLGLOTT88 model as the source model. The model parameters are obtained by minimizing the error between the real vocal source and the model source, so the estimation problem is formulated as a convex optimization problem. The merits of this method are computational efficiency and global optimality. Synthesis experiments demonstrate that the algorithm is valid and provides a trustworthy source excitation for the formant synthesizer. 2° Formant synthesis offers so many degrees of freedom that it is difficult to set all of the parameters in a way that yields natural-sounding speech for experiments. A tool is designed to help deal with the problems inherent in this large set of parameters in a way that is optimal for experimental control. 3° The merits and flaws of formant synthesis are discussed in detail, especially the fact that many types of sound, such as sonorants, are difficult to synthesize with a formant synthesizer. A method that integrates formant synthesis and waveform concatenation is proposed. Listening tests validate that it produces good results. In short, this thesis makes many fruitful attempts and significant progress in extracting source model parameters for parametric speech synthesis.
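The abstract does not give the thesis's actual formulation, so the following is only an illustrative sketch of why fitting a KLGLOTT88-style source can be posed as a convex problem. It assumes the commonly cited open-phase flow U(t) = a·t² − b·t³, whose derivative 2a·t − 3b·t² is linear in (a, b), so a least-squares fit with a ≥ 0 and b ≥ 0 is a small convex program. The function names, the use of sample indices for t, and the assumption that the open-phase length `n_open` is already known are all hypothetical choices made for this example.

```python
def klglott88_derivative(a, b, n_open, n_period):
    """One period of the assumed model flow derivative.

    Returns 2*a*t - 3*b*t**2 for samples in the open phase and 0 afterwards.
    """
    return [2.0 * a * t - 3.0 * b * t * t if t < n_open else 0.0
            for t in range(n_period)]


def fit_klglott88(target, n_open):
    """Least-squares fit of (a, b), with a >= 0 and b >= 0, over the open phase.

    Assumes n_open >= 3 so that the two basis functions are independent.
    """
    # The model is linear in its parameters: g(t) = a * x1(t) + b * x2(t).
    x1 = [2.0 * t for t in range(n_open)]
    x2 = [-3.0 * t * t for t in range(n_open)]
    y = target[:n_open]

    # Entries of the 2x2 normal equations.
    s11 = sum(u * u for u in x1)
    s22 = sum(v * v for v in x2)
    s12 = sum(u * v for u, v in zip(x1, x2))
    r1 = sum(u * w for u, w in zip(x1, y))
    r2 = sum(v * w for v, w in zip(x2, y))

    def cost(a, b):
        return sum((a * u + b * v - w) ** 2 for u, v, w in zip(x1, x2, y))

    det = s11 * s22 - s12 * s12
    if det > 0:
        a = (r1 * s22 - r2 * s12) / det
        b = (r2 * s11 - r1 * s12) / det
        if a >= 0 and b >= 0:
            return a, b  # unconstrained optimum is already feasible

    # Otherwise the constrained optimum lies on a boundary (a = 0 or b = 0).
    candidates = [(max(0.0, r1 / s11), 0.0), (0.0, max(0.0, r2 / s22))]
    return min(candidates, key=lambda p: cost(*p))


# Usage: recover known parameters from a noise-free synthetic period.
period = klglott88_derivative(0.018, 0.0003, n_open=60, n_period=100)
a_hat, b_hat = fit_klglott88(period, n_open=60)
```

Because the objective is a convex quadratic and the feasible set is convex, any minimum found this way is global, which matches the "global optimality" property the abstract attributes to the convex formulation. A real extractor would also have to estimate the glottal source from speech (for example by inverse filtering) and locate the open phase, which this sketch leaves out.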
Keywords	Speech Synthesis; Formant Synthesis; Source-filter Model; Parametric Synthesis
Language	Chinese
Document type	Thesis
Item identifier	http://ir.ia.ac.cn/handle/173211/6897
Collection	Graduates_Master's Theses
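Recommended citation (GB/T 7714)	Zhao Liang. Research and Implementation of Formant Speech Synthesis Algorithms[D]. Institute of Automation, Chinese Academy of Sciences. Graduate School of the Chinese Academy of Sciences, 2005.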