Research on Chinese Speech Synthesis Algorithms and Prosodic Models (汉语语音合成算法及韵律模型的研究)

	Research on Chinese Speech Synthesis Algorithms and Prosodic Models
	黄艳 (Huang Yan)
	1999-06-01
Degree type	Master of Engineering
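Abstract (translated from Chinese)	With information science and computer science developing rapidly, speech engineering is receiving unprecedented attention. Speech synthesis is an important part of speech engineering: it plays an important role in human-machine communication and is also significant for research in phonetics and speech physiology. Voice interaction is a natural mode of interaction that people strongly favor, and speech synthesis, as an important part of it, has become a focus of speech technology development. Speech synthesis has a history of more than 200 years; it has passed through mechanical, electronic and digital stages and has grown from infancy to maturity, from laboratory technology to commercial product. Chinese speech synthesis started relatively late in China but has developed quickly, and the technology is now at the critical stage of moving from the laboratory to the market. Applications place higher demands on speech synthesis: people want machine-generated speech to sound more natural and closer to human pronunciation, and current Chinese speech synthesis systems still fall well short of this. Improving naturalness is generally held to require solving two problems. The first is improving synthesizer quality: once the prosody module supplies prosodic parameters, the synthesizer must realize them with high quality. The second is improving the quality of the prosody module: concise, quantitative models are needed as the technical means of handling complex intonation rules and deeper prosodic variation such as stress, unstressed reading and emphasis. Based on this understanding of the core problems of speech synthesis, the author read extensively in the literature, followed advanced synthesis techniques worldwide, and carried out the following work under the available experimental conditions: 1. Automatic pitch-synchronous labeling of the full Chinese syllable library (1267 monosyllables), followed by manual correction. 2. Development of a new synthesis algorithm, the pitch-synchronous point-to-point synthesis algorithm (TD-PSPTP). 3. Implementation of a harmonic plus noise synthesis algorithm that uses pitch-synchronous information. 4. Analysis and quantification of prosodic rules, and construction of a rule-based prosodic model. 5. Training of a neural network that generates durations automatically. 6. On the basis of the above, completion of the output module of the LODESTAR travel information inquiry system: the speech synthesis module. 7. Integration of an experimental large-vocabulary Chinese text-to-speech system. The thesis presents this work in the order of synthesis algorithms, prosodic models and system integration.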
English abstract	With the rapid development of communication science and computer science, linguistic engineering has received unprecedented attention. Speech synthesis is an important component of linguistic engineering: it not only plays a critical role in speech communication but also contributes significantly to research in phonetics and speech physiology. Speech is a preferred, natural means of man-machine interaction, and speech synthesis, as a significant part of it, has become a focus of speech technology. Speech synthesis has a history of more than 200 years. It has passed through mechanical, electronic and digital phases and has grown from its beginnings to maturity, from basic laboratory research to fully fledged technology on the market. Research on speech synthesis started comparatively late in China but has progressed rapidly, and it is now at the crucial stage of moving from the laboratory to the market. From the standpoint of practical use, people demand more natural, more human-like synthetic speech, yet there is still a large gap between this requirement and what has been achieved. Improving the naturalness of synthetic speech is generally held to involve two aspects. One is a good speech synthesizer that can realize prosodic modification effectively and with high quality. The other is a better prosodic model: a simple and effective model is needed as a tool for handling the complex rules of high-level prosody such as intonation, accent and stress. Based on this understanding of the core elements of speech synthesis, the author surveyed the literature to follow the world's advanced technologies in the field and carried out the following work over the last two years for the thesis: 1. Automatic pitch-synchronous annotation of the full monosyllable library, with manual correction of annotation errors. 2. Development of a novel synthesis algorithm, the Time-Domain Pitch-Synchronous Point-to-Point model. 3. Implementation of a synthesizer based on the harmonic plus noise model with pitch-synchronization information. 4. Construction of a rule-based prosodic model. 5. Training of an artificial neural network (ANN) for duration generation. 6. Construction of the output module, speech synthesis, for the dialog system LODESTAR. 7. Integration of a large-vocabulary Mandarin text-to-speech synthesis system. The thesis is organized in the order of synthesis algorithms, prosodic models and system integration.
Language	Chinese
Document type	Thesis
Item identifier	http://ir.ia.ac.cn/handle/173211/7274
Collection	Graduates_Master's Theses
Recommended citation (GB/T 7714)	黄艳. 汉语语音合成算法及韵律模型的研究[D]. 中国科学院自动化研究所, 1999.