Investigating Deep Neural Network Adaptation for Generating Exclamatory and Interrogative Speech in Mandarin
Zheng, Yibin (1,3); Li, Ya (1); Wen, Zhengqi (1); Liu, Bin (1); Tao, Jianhua (1,2,3); Jianhua Tao
2018-07-01
Journal: JOURNAL OF SIGNAL PROCESSING SYSTEMS FOR SIGNAL IMAGE AND VIDEO TECHNOLOGY
Volume: 90; Issue: 7; Pages: 1039-1052
Article type: Article
Abstract

Currently, most speech synthesis systems generate speech only in a reading style, which greatly limits the expressiveness of the synthesized speech. To improve that expressiveness, this paper focuses on the generation of exclamatory and interrogative speech for spoken Mandarin. A multi-style (exclamatory and interrogative) deep neural network-based acoustic model with a style-specific layer (which can itself consist of multiple layers) and several shared hidden layers is proposed. The style-specific layer models the distinct style-specific patterns. The shared layers allow maximum knowledge sharing between the declarative and multi-style speech. We investigate five major aspects of the multi-style adaptation: neural network type and topology, the number of layers in the style-specific layer, the initial model, the adaptation parameters, and the adaptation corpus size. Both objective and subjective evaluations are carried out to evaluate the proposed method. Experimental results show that the proposed multi-style BLSTM with only the top layer adapted outperforms our prior work (which was trained with a combination of constrained maximum likelihood linear regression and structural maximum a posteriori) and achieves the best performance. We also find that adapting both the spectral and the excitation parameters is more effective than adapting the excitation parameters alone.
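
The shared-layer and style-specific-layer idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes small feed-forward tanh layers in place of the BLSTM, a single style-specific top layer per style, and a plain squared-error loss. All names (`MultiStyleNet`, `adapt_step`), layer sizes, and the learning rate are hypothetical. The sketch only shows that adaptation can update the style-specific layer while the shared hidden layers stay fixed.

```python
import math
import random


def make_layer(n_in, n_out, rng):
    """Return a dense layer as (weights, biases) with small random weights."""
    w = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_out)]
    b = [0.0] * n_out
    return w, b


def dense(layer, x, activate=True):
    """Apply a dense layer to vector x, optionally followed by tanh."""
    w, b = layer
    out = [sum(wi * xi for wi, xi in zip(row, x)) + bi for row, bi in zip(w, b)]
    return [math.tanh(v) for v in out] if activate else out


class MultiStyleNet:
    """Shared hidden layers plus one style-specific output layer per style.

    Hypothetical illustration only; the paper's model is a BLSTM.
    """

    def __init__(self, n_in, n_hidden, n_out, styles, n_shared=2, seed=0):
        rng = random.Random(seed)
        sizes = [n_in] + [n_hidden] * n_shared
        # Hidden layers shared by all speaking styles.
        self.shared = [make_layer(a, b, rng) for a, b in zip(sizes, sizes[1:])]
        # One style-specific top layer per style.
        self.heads = {s: make_layer(n_hidden, n_out, rng) for s in styles}

    def hidden(self, x):
        """Run the input through the shared hidden layers."""
        for layer in self.shared:
            x = dense(layer, x)
        return x

    def forward(self, x, style):
        """Predict output features for the given style (linear output)."""
        return dense(self.heads[style], self.hidden(x), activate=False)

    def adapt_step(self, x, target, style, lr=0.05):
        """One gradient step on 0.5 * squared error.

        Only the selected style-specific layer is updated; the shared
        layers and the other styles' layers are left untouched.
        """
        h = self.hidden(x)
        w, b = self.heads[style]
        y = dense((w, b), h, activate=False)
        for j, (yj, tj) in enumerate(zip(y, target)):
            err = yj - tj
            for i, hi in enumerate(h):
                w[j][i] -= lr * err * hi
            b[j] -= lr * err


if __name__ == "__main__":
    net = MultiStyleNet(n_in=3, n_hidden=4, n_out=2,
                        styles=["declarative", "exclamatory", "interrogative"])
    x, target = [0.2, -0.1, 0.4], [1.0, -1.0]  # made-up toy data
    for _ in range(50):
        net.adapt_step(x, target, "interrogative")
    print(net.forward(x, "interrogative"))
```

Keeping the shared layers frozen during adaptation is one simple way to reuse what was learned from the larger declarative corpus. The record does not say which parameters the authors hold fixed beyond adapting the top layer.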

Keywords: Speech Synthesis Excitation Parameters Deep Neural Network Adaptation Exclamatory Speech Interrogative Speech
WOS headings: Science & Technology ; Technology
DOI: 10.1007/s11265-017-1290-2
Keywords [WOS]: Short-term-memory ; Speaker Adaptation ; Synthesis System ; Emotional Expressions ; Model ; Algorithms ; Features ; Styles ; Pitch ; Hsmm
Indexed by: SCI
Language: English
Funding: National High-Tech Research and Development Program of China (863 Program) (2015AA016305) ; National Natural Science Foundation of China (NSFC) (61305003 ; 61425017 ; 61403386) ; Strategic Priority Research Program of the CAS (XDB02080006) ; Major Program for the National Social Science Fund of China (13ZD189)
WOS research areas: Computer Science ; Engineering
WOS categories: Computer Science, Information Systems ; Engineering, Electrical & Electronic
WOS accession number: WOS:000433555600008
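Document type: Journal article
Item identifier: http://ir.ia.ac.cn/handle/173211/19885
Collection: National Laboratory of Pattern Recognition_Speech Interaction
Corresponding author: Jianhua Tao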
Affiliations:
1. Chinese Acad Sci, Inst Automat, Natl Lab Pattern Recognit, Beijing, Peoples R China
2. Chinese Acad Sci, Inst Automat, CAS Ctr Excellence Brain Sci & Intelligence Techn, Beijing, Peoples R China
3. Univ Chinese Acad Sci, Sch Artificial Intelligence, Beijing, Peoples R China
Recommended citation
GB/T 7714: Zheng, Yibin, Li, Ya, Wen, Zhengqi, et al. Investigating Deep Neural Network Adaptation for Generating Exclamatory and Interrogative Speech in Mandarin[J]. JOURNAL OF SIGNAL PROCESSING SYSTEMS FOR SIGNAL IMAGE AND VIDEO TECHNOLOGY, 2018, 90(7): 1039-1052.
APA: Zheng, Yibin, Li, Ya, Wen, Zhengqi, Liu, Bin, Tao, Jianhua, & Jianhua Tao. (2018). Investigating Deep Neural Network Adaptation for Generating Exclamatory and Interrogative Speech in Mandarin. JOURNAL OF SIGNAL PROCESSING SYSTEMS FOR SIGNAL IMAGE AND VIDEO TECHNOLOGY, 90(7), 1039-1052.
MLA: Zheng, Yibin, et al. "Investigating Deep Neural Network Adaptation for Generating Exclamatory and Interrogative Speech in Mandarin". JOURNAL OF SIGNAL PROCESSING SYSTEMS FOR SIGNAL IMAGE AND VIDEO TECHNOLOGY 90.7 (2018): 1039-1052.
Files in this item
File name/size: (SCI)Investigating D (1339KB)
Document type: Journal article
Version: Author accepted manuscript
Access: Open access
License: CC BY-NC-SA
File name: (SCI)Investigating Deep Neural Network Adaptation for Generating Exclamatory and Interrogative Speech in Mandarin.pdf
Format: Adobe PDF