Investigating Deep Neural Network Adaptation for Generating Exclamatory and Interrogative Speech in Mandarin
Zheng, Yibin1,3; Li, Ya1; Wen, Zhengqi1; Liu, Bin1; Tao, Jianhua1,2,3
Source Publication: JOURNAL OF SIGNAL PROCESSING SYSTEMS FOR SIGNAL IMAGE AND VIDEO TECHNOLOGY
2018-07-01
Volume: 90, Issue: 7, Pages: 1039-1052
Subtype: Article
Abstract

Currently, most speech synthesis systems generate speech only in a reading style, which greatly limits the expressiveness of the synthesized speech. To improve expressiveness, this paper focuses on the generation of exclamatory and interrogative speech for spoken Mandarin. A multi-style (exclamatory and interrogative) deep neural network-based acoustic model with a style-specific layer (which can comprise multiple layers) and several shared hidden layers is proposed. The style-specific layer models the distinct style-specific patterns, while the shared layers allow maximum knowledge sharing between the declarative and multi-style speech. We investigate five major aspects of multi-style adaptation: neural network type and topology, the number of layers in the style-specific layer, the initial model, the adaptation parameters, and the adaptation corpus size. Both objective and subjective evaluations are carried out to evaluate the proposed method. Experimental results show that the proposed multi-style BLSTM with the top layer adapted is superior to our prior work (which is trained by combining constrained maximum likelihood linear regression and structural maximum a posteriori) and achieves the best performance. We also find that adapting both the spectral and excitation parameters is more effective than adapting only the excitation parameters.
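The shared-plus-style-specific layout described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it uses small feed-forward layers in place of a BLSTM, treats the style-specific layer as a single linear output layer, leaves the shared layers randomly initialized instead of pre-training them on declarative speech, and uses toy dimensions and data. All names (`Layer`, `MultiStyleNet`, `adapt`) are hypothetical. The sketch only shows the idea of freezing the shared layers and updating one style's layer during adaptation.

```python
import copy
import math
import random


class Layer:
    """Fully connected layer; tanh activation unless linear=True."""

    def __init__(self, n_in, n_out, rng, linear=False):
        self.w = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]
        self.b = [0.0] * n_out
        self.linear = linear

    def forward(self, x):
        z = [sum(wi * xi for wi, xi in zip(row, x)) + bj
             for row, bj in zip(self.w, self.b)]
        return z if self.linear else [math.tanh(v) for v in z]


class MultiStyleNet:
    """Shared hidden layers followed by one style-specific layer per style."""

    def __init__(self, n_in, n_hidden, n_out, n_shared, styles, seed=0):
        rng = random.Random(seed)
        sizes = [n_in] + [n_hidden] * n_shared
        # Assumption: in practice these would be pre-trained on declarative speech.
        self.shared = [Layer(a, b, rng) for a, b in zip(sizes, sizes[1:])]
        base = Layer(n_hidden, n_out, rng, linear=True)
        # Every style starts from a copy of the same initial top layer.
        self.heads = {s: copy.deepcopy(base) for s in styles}

    def hidden(self, x):
        for layer in self.shared:
            x = layer.forward(x)
        return x

    def forward(self, x, style):
        return self.heads[style].forward(self.hidden(x))

    def loss(self, data, style):
        """Mean squared error over (input, target) pairs for one style."""
        total = 0.0
        for x, t in data:
            y = self.forward(x, style)
            total += sum((yj - tj) ** 2 for yj, tj in zip(y, t))
        return total / len(data)

    def adapt(self, data, style, lr=0.05, epochs=200):
        """Gradient descent on one style's layer; shared layers stay frozen."""
        head = self.heads[style]
        for _ in range(epochs):
            for x, t in data:
                h = self.hidden(x)
                y = head.forward(h)
                for j, (yj, tj) in enumerate(zip(y, t)):
                    g = 2.0 * (yj - tj)  # derivative of squared error w.r.t. y_j
                    for i, hi in enumerate(h):
                        head.w[j][i] -= lr * g * hi
                    head.b[j] -= lr * g


# Toy adaptation data standing in for interrogative-style acoustic targets.
toy_data = [([0.0, 1.0, 0.5], [1.0, -1.0]),
            ([1.0, 0.0, -0.5], [-1.0, 0.5]),
            ([0.5, 0.5, 1.0], [0.3, 0.8])]
net = MultiStyleNet(n_in=3, n_hidden=4, n_out=2, n_shared=2,
                    styles=["exclamatory", "interrogative"])
net.adapt(toy_data, "interrogative")
```

In this sketch, adapting the interrogative layer leaves both the shared layers and the exclamatory layer untouched, which is the property that lets a small style-specific corpus reuse what the shared layers have already learned.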

Keyword: Speech Synthesis ; Excitation Parameters ; Deep Neural Network Adaptation ; Exclamatory Speech ; Interrogative Speech
WOS Headings: Science & Technology ; Technology
DOI: 10.1007/s11265-017-1290-2
WOS Keyword: Short-term-memory ; Speaker Adaptation ; Synthesis System ; Emotional Expressions ; Model ; Algorithms ; Features ; Styles ; Pitch ; Hsmm
Indexed By: SCI
Language: English
Funding Organization: National High-Tech Research and Development Program of China (863 Program)(2015AA016305) ; National Natural Science Foundation of China (NSFC)(61305003 ; 61425017 ; 61403386) ; Strategic Priority Research Program of the CAS(XDB02080006) ; Major Program for the National Social Science Fund of China(13ZD189)
WOS Research Area: Computer Science ; Engineering
WOS Subject: Computer Science, Information Systems ; Engineering, Electrical & Electronic
WOS ID: WOS:000433555600008
Document Type: Journal article
Identifier: http://ir.ia.ac.cn/handle/173211/19885
Collection: National Laboratory of Pattern Recognition_Speech Interaction
Corresponding Author: Jianhua Tao
Affiliation:
1. Chinese Acad Sci Recognit, Inst Automat, Natl Lab Pattern Recognit, Beijing, Peoples R China
2. Chinese Acad Sci, Inst Automat, CAS Ctr Excellence Brain Sci & Intelligence Techn, Beijing, Peoples R China
3. Univ Chinese Acad Sci, Sch Artificial Intelligence, Beijing, Peoples R China
Recommended Citation
GB/T 7714
Zheng, Yibin, Li, Ya, Wen, Zhengqi, et al. Investigating Deep Neural Network Adaptation for Generating Exclamatory and Interrogative Speech in Mandarin[J]. JOURNAL OF SIGNAL PROCESSING SYSTEMS FOR SIGNAL IMAGE AND VIDEO TECHNOLOGY, 2018, 90(7): 1039-1052.
APA Zheng, Yibin, Li, Ya, Wen, Zhengqi, Liu, Bin, & Tao, Jianhua. (2018). Investigating Deep Neural Network Adaptation for Generating Exclamatory and Interrogative Speech in Mandarin. JOURNAL OF SIGNAL PROCESSING SYSTEMS FOR SIGNAL IMAGE AND VIDEO TECHNOLOGY, 90(7), 1039-1052.
MLA Zheng, Yibin, et al. "Investigating Deep Neural Network Adaptation for Generating Exclamatory and Interrogative Speech in Mandarin." JOURNAL OF SIGNAL PROCESSING SYSTEMS FOR SIGNAL IMAGE AND VIDEO TECHNOLOGY 90.7 (2018): 1039-1052.
Files in This Item:
File Name/Size: (SCI)Investigating D (1339KB) ; DocType: Journal article ; Version: Author's accepted manuscript ; Access: Open access ; License: CC BY-NC-SA
File name: (SCI)Investigating Deep Neural Network Adaptation for Generating Exclamatory and Interrogative Speech in Mandarin.pdf
Format: Adobe PDF
Items in the repository are protected by copyright, with all rights reserved, unless otherwise indicated.