Currently, most text-to-speech (TTS) systems can synthesize speech only in a reading style, which greatly limits the application of TTS. To improve the expressiveness of TTS and broaden its application, this paper focuses on question speech and exclamatory speech in spoken language, and also investigates prosody evaluation. Based on an analysis of question speech and of exclamatory speech with modal tags, a unit-selection-based method is proposed that generates exclamatory and question speech by constructing new prosody templates and a new target cost function. In addition, a prosody-conversion-based method is proposed to simulate exclamatory speech and question speech. Prosody evaluation is an important part of speech evaluation. Based on an analysis of prosodic variability, a method is proposed to evaluate prosody quality automatically. The main contributions of this paper are as follows.

Analysis of exclamatory and question speech. Exclamatory speech with modal tags and four types of question speech are analyzed using parallel speech data. The results show that exclamatory speech with modal tags contains a few strong stresses. In addition, question tag words and modal exclamations raise the F0 of their neighbors and shorten the neighbors' duration.

Generating exclamatory speech and question speech based on unit selection. Based on the analysis of question speech and exclamatory speech with modal tags, a new target cost function is constructed and new features are integrated into the prosody templates. Experimental results show that the synthesized speech is of high quality and strongly conveys the intended mood.

Synthesizing exclamatory speech and question speech based on prosody conversion. Based on the prosodic analysis of question and exclamatory speech, a new prosody conversion model is built with a CART model to simulate the prosodic features of question speech and of exclamatory speech with modal words.
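To make the idea of a target cost function in unit selection more concrete, the following is a minimal sketch in Python. It is not the cost function constructed in this paper: the feature set (`f0_mean`, `duration`, `near_modal_tag`, `sentence_type`), the weights, and the function names are all hypothetical, chosen only to illustrate how prosodic features related to question tags and modal words might enter a weighted mismatch score. A real system would also use a concatenation cost and a search over whole unit sequences, which are omitted here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Prosody:
    """Hypothetical prosodic description of one unit (illustrative only)."""
    f0_mean: float        # mean F0 in Hz
    duration: float       # duration in seconds
    near_modal_tag: bool  # unit is adjacent to a modal/question tag word
    sentence_type: str    # "statement", "question", or "exclamation"


# Assumed weights; in practice these would be tuned on data.
WEIGHTS = {"f0": 1.0, "duration": 1.0, "near_modal_tag": 0.5, "sentence_type": 2.0}


def target_cost(target: Prosody, candidate: Prosody, weights=WEIGHTS) -> float:
    """Weighted sum of mismatches between the target and a candidate unit."""
    # Continuous features: relative absolute difference.
    f0_cost = abs(target.f0_mean - candidate.f0_mean) / target.f0_mean
    dur_cost = abs(target.duration - candidate.duration) / target.duration
    # Categorical features: 0 if they match, 1 otherwise.
    tag_cost = 0.0 if target.near_modal_tag == candidate.near_modal_tag else 1.0
    type_cost = 0.0 if target.sentence_type == candidate.sentence_type else 1.0
    return (weights["f0"] * f0_cost
            + weights["duration"] * dur_cost
            + weights["near_modal_tag"] * tag_cost
            + weights["sentence_type"] * type_cost)


def select_unit(target: Prosody, candidates: list) -> Prosody:
    """Pick the candidate with the lowest target cost (no concatenation cost)."""
    return min(candidates, key=lambda c: target_cost(target, c))


# Usage: a question-final target next to a question tag word.
target = Prosody(f0_mean=220.0, duration=0.20, near_modal_tag=True,
                 sentence_type="question")
reading_style = Prosody(200.0, 0.25, False, "statement")
question_style = Prosody(215.0, 0.21, True, "question")
best = select_unit(target, [reading_style, question_style])
```

In this sketch the candidate recorded in a question context is preferred because it matches the categorical features and is closer in F0 and duration; the relative weighting of categorical against continuous features is a design choice that the sketch does not attempt to justify.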
Perception and comparison experiments show that the proposed conversion models can synthesize high-quality speech and that the method is valid.

Prosody evaluation is an essential part of speech evaluation. The paper analyzes prosodic variability across speakers based on a speech database containing eight repetitions of each sentence. For Mandarin, prosodic variability can be analyzed in terms of rhythm, intonation, and tone variation, which makes automatic evaluation of prosody quality very difficult. Based on these variations, the comparison between the tested an...