COMBINING UNIDIRECTIONAL LONG SHORT-TERM MEMORY WITH CONVOLUTIONAL OUTPUT LAYER FOR HIGH-PERFORMANCE SPEECH SYNTHESIS
Wang, Wenfu; Xu, Bo
2017-03
Conference Name: International Conference on Acoustics, Speech and Signal Processing
Pages: 5500-5504
Conference Date: 2017-3-5
Conference Place: New Orleans, USA
Abstract: In this paper, we target improving the accuracy of acoustic modelling for statistical parametric speech synthesis (SPSS) and introduce the convolutional neural network (CNN) due to its powerful capacity in locality modelling. A novel model architecture combining unidirectional long short-term memory (LSTM) and a time-domain convolutional output layer (COL) is proposed and applied to acoustic modelling. The two components complement each other and result in a high-performance synthesis system. Specifically, the unidirectional LSTM can learn expressive feature representations from history context, and the COL ingeniously absorbs some of these representations within a look-ahead window to advance predictions. This complementary mechanism significantly improves the predictive accuracy and the quality of synthetic speech. In addition, the unique operation mechanism of convolution makes the COL a fine parameter trajectory smoother between consecutive frames. Subjective preference tests show that the proposed architecture can synthesize natural-sounding speech without dynamic features.
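The look-ahead idea in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the abstract does not give the layer's exact formulation, so the function name `conv_output_layer`, the zero-padding at the end of the utterance, and the toy weights below are all assumptions made for illustration. The sketch takes a sequence of per-frame hidden vectors (standing in for unidirectional LSTM outputs) and computes each output frame as a weighted sum over the current frame and the next K-1 frames.

```python
def conv_output_layer(hidden, weights, bias):
    """Look-ahead convolution over time (illustrative sketch, not the paper's code).

    hidden:  list of T vectors of length D (stand-ins for LSTM outputs per frame)
    weights: list of K matrices of shape O x D; weights[k] is applied to frame t + k
    bias:    list of O values
    Returns a list of T output vectors of length O.
    """
    num_frames = len(hidden)
    out_dim = len(bias)
    outputs = []
    for t in range(num_frames):
        y = list(bias)
        for k, w_k in enumerate(weights):
            if t + k >= num_frames:
                break  # assumption: frames past the end count as zeros
            h = hidden[t + k]
            for o in range(out_dim):
                y[o] += sum(w * x for w, x in zip(w_k[o], h))
        outputs.append(y)
    return outputs


# Toy example: 4 frames, 1-dimensional hidden state, look-ahead window of 3 frames.
hidden = [[1.0], [2.0], [3.0], [4.0]]
weights = [[[0.5]], [[0.25]], [[0.25]]]
bias = [0.0]
result = conv_output_layer(hidden, weights, bias)
```

Because each output frame mixes neighbouring hidden frames with shared weights, consecutive outputs overlap in what they see, which is one way to read the abstract's remark that the COL acts as a trajectory smoother.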
Keywords: Statistical Parametric Speech Synthesis; LSTM; Convolutional Output Layer; High-performance; Trajectory Smoother
Indexed By: EI
Language: English
Document Type: Conference paper
Identifier: http://ir.ia.ac.cn/handle/173211/19660
Collection: Research Center for Digital Content Technology and Services_Auditory Models and Cognitive Computing
Affiliation: Institute of Automation, Chinese Academy of Sciences, Beijing, China
Recommended Citation
GB/T 7714
Wang, Wenfu, Xu, Bo. COMBINING UNIDIRECTIONAL LONG SHORT-TERM MEMORY WITH CONVOLUTIONAL OUTPUT LAYER FOR HIGH-PERFORMANCE SPEECH SYNTHESIS[C], 2017: 5500-5504.
Files in This Item:
File Name/Size: icassp2017_wang.pdf (228KB)
DocType: Conference paper
Access: Open access
License: CC BY-NC-SA

Items in the repository are protected by copyright, with all rights reserved, unless otherwise indicated.