Word-level Permutation and Improved Lower Frame Rate for RNN-Based Acoustic Modeling
Yuanyuan Zhao1,2; Shiyu Zhou1,2; Shuang Xu1; Bo Xu1
2017
Conference Name: ICONIP 2017
Source Publication: ICONIP 2017
Pages: 859-869
Conference Date: November 14-18, 2017
Conference Place: Guangzhou, China
Abstract: Recently, RNN-based acoustic models have shown promising performance. However, their ability to generalize across multiple scenarios is limited for two reasons. First, an RNN-based acoustic model encodes inter-word dependencies, which conflicts with the principle that an acoustic model should model only the pronunciation of words. Second, because it traces the intra-word acoustic trajectory frame by frame, it is too precise to tolerate small distortions. In this work, we propose two variants to address these two problems. The first is word-level permutation: the order of the input features and their corresponding labels is shuffled, with a suitable probability, according to word boundaries, with the aim of eliminating inter-word dependencies. The second is the improved lower frame rate (iLFR) model, which splits the original sentence equidistantly into N utterances so that no data is discarded, as happens in the original LFR model. Results with an LSTM RNN show a 7% relative performance improvement when word-level permutation and iLFR are combined.
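The two data manipulations described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `permute_words` and `split_equidistant`, the `(start, end)` boundary format, and the choice to apply the shuffle probability once per utterance are all assumptions made here for illustration, and the reading of "equidistantly splits" as taking every N-th frame from each of N starting offsets is likewise an interpretation of the abstract.

```python
import random


def permute_words(frames, labels, word_bounds, prob, rng=random):
    """Shuffle the word segments of one utterance with probability `prob`.

    `word_bounds` is a hypothetical format: a list of (start, end) frame
    index pairs, end exclusive, assumed to cover the utterance contiguously.
    Frames and labels are moved together, so their pairing is preserved.
    """
    # Assumption: the probability is applied once per utterance.
    if rng.random() >= prob:
        return list(frames), list(labels)
    order = list(range(len(word_bounds)))
    rng.shuffle(order)
    new_frames, new_labels = [], []
    for i in order:
        start, end = word_bounds[i]
        new_frames.extend(frames[start:end])
        new_labels.extend(labels[start:end])
    return new_frames, new_labels


def split_equidistant(frames, labels, n):
    """Split one utterance into n sub-utterances using stride n.

    Sub-utterance k holds frames k, k+n, k+2n, ... so every original
    frame lands in exactly one sub-utterance and nothing is discarded.
    """
    return [(frames[k::n], labels[k::n]) for k in range(n)]


if __name__ == "__main__":
    frames = list(range(10))                    # stand-ins for feature vectors
    labels = ["a"] * 3 + ["b"] * 4 + ["c"] * 3  # one label per frame
    bounds = [(0, 3), (3, 7), (7, 10)]          # three "words"
    print(permute_words(frames, labels, bounds, prob=1.0))
    print(split_equidistant(frames, labels, n=3))
```

In this sketch the frames inside each word keep their original order, so only the order of words changes, and the N sub-utterances together contain every frame of the original sentence.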
Keywords: RNN-based acoustic model; acoustic trajectory; lower frame rate; word-level permutation
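Indexed By: EI
Language: English
Document Type: Conference paper
Identifier: http://ir.ia.ac.cn/handle/173211/15429
Collection: Research Center for Digital Content Technology and Services_Auditory Models and Cognitive Computing
Corresponding Author: Yuanyuan Zhao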
Affiliation: 1. Institute of Automation, Chinese Academy of Sciences
2. University of Chinese Academy of Sciences
First Author Affiliation: Institute of Automation, Chinese Academy of Sciences
Corresponding Author Affiliation: Institute of Automation, Chinese Academy of Sciences
Recommended Citation
GB/T 7714
Yuanyuan Zhao,Shiyu Zhou,Shuang Xu,et al. Word-level Permutation and Improved Lower Frame Rate for RNN-Based Acoustic Modeling[C],2017:859-869.
Files in This Item:
File Name/Size: Word-Level Permutati(513KB)
DocType: Conference paper
Access: Open access
License: CC BY-NC-SA
File name: Word-Level Permutation and Improved Lower Frame Rate for RNN-Based Acoustic Modeling.pdf
Format: Adobe PDF
Items in the repository are protected by copyright, with all rights reserved, unless otherwise indicated.