English Abstract | With the rapid development of communication science and computer science, linguistic engineering has received unprecedented recognition and attention. Speech synthesis is an important component of linguistic engineering. It not only plays a critical role in speech communication but also contributes significantly to research in phonetics and phonetic physiology. Speech is a natural and preferred means of man-machine interaction, and speech synthesis, as a significant part of it, has become a hot spot in speech technology. Speech synthesis has a history of more than 200 years. It has passed through mechanical, electronic and digital phases, developing from its beginnings to maturity and from basic laboratory research to a mature technology on the market. Domestically, research on speech synthesis started comparatively late but has progressed rapidly, and it has now reached the crucial stage of leaving the laboratory and entering the market. From the standpoint of practical use, people demand more natural, human-like synthetic speech, but there is still a large gap between these requirements and what has been achieved. In general, improving the naturalness of synthetic speech involves two aspects: one is to find a good speech synthesizer that can carry out prosodic modification effectively and with high quality; the other is to improve the quality of the prosodic model. A simple and effective model is needed as a tool for handling the complex rules of high-level prosody, such as intonation, accent and stress. Based on this understanding of the core elements of speech synthesis, the author surveyed a large body of papers and literature to follow the advanced technologies in this field worldwide. The following work was carried out for the thesis over the last two years: (1) Performed automatic pitch-synchronous annotation of the omni-monosyllable library and corrected erroneous annotations by hand.
(2) Developed a novel synthesis algorithm, named the Time-Domain Pitch-Synchronous Point-to-Point Model. (3) Implemented a synthesizer based on the Harmonic plus Noise Model with pitch-synchronization information. (4) Set up a rule-based prosodic model. (5) Trained an artificial neural network (ANN) for duration generation. (6) Built the speech synthesis output module for a dialog system named LODESTAR. (7) Integrated a large-vocabulary Mandarin text-to-speech synthesis system. Accordingly, the thesis is organized in the order of synthesis algorithm, prosodic model and system introduction. |