Word-level Permutation and Improved Lower Frame Rate for RNN-Based Acoustic Modeling
Yuanyuan Zhao1,2; Shiyu Zhou1,2; Shuang Xu1; Bo Xu1
2017
Conference Name: ICONIP 2017
Source Publication: ICONIP 2017
Pages: 859-869
Conference Date: November 14-18, 2017
Conference Place: Guangzhou, China
Abstract: Recently, RNN-based acoustic models have shown promising performance. However, their ability to generalize across multiple scenarios is limited for two reasons. First, an RNN encodes inter-word dependencies, which conflicts with the principle that an acoustic model should model only the pronunciation of words. Second, an RNN-based acoustic model traces the intra-word acoustic trajectory frame by frame, which is too precise to tolerate small distortions. In this work, we propose two techniques to address these two problems. The first is word-level permutation: the order of the input features and their corresponding labels is shuffled, with a suitable probability, according to word boundaries, with the aim of eliminating inter-word dependencies. The second is the improved lower frame rate (iLFR) model, which splits the original sentence equidistantly into N utterances so that no data are discarded, as happens in the LFR model. Results with an LSTM RNN show a 7% relative performance improvement when word-level permutation and iLFR are combined.
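The two techniques named in the abstract can be illustrated with a minimal sketch. This is one possible reading of the abstract only, not the authors' implementation: the function names `word_level_permute` and `ilfr_split`, the `prob` parameter, and the representation of word boundaries as (start, end) frame index pairs are all assumptions made for illustration. In particular, the sketch assumes that plain LFR keeps every N-th frame and drops the rest, and that iLFR keeps all N offset subsequences as separate utterances.

```python
import random


def word_level_permute(frames, labels, word_bounds, prob=0.5, rng=None):
    """Shuffle word-sized segments of one utterance with probability `prob`.

    Assumption: `word_bounds` is a list of (start, end) frame indices,
    end exclusive, that covers the utterance in order.
    """
    if rng is None:
        rng = random.Random()
    # Leave the utterance untouched with probability 1 - prob.
    if rng.random() >= prob:
        return list(frames), list(labels)
    order = list(range(len(word_bounds)))
    rng.shuffle(order)
    out_frames, out_labels = [], []
    for i in order:
        start, end = word_bounds[i]
        # Frames and labels move together, so the alignment inside
        # each word is preserved while the word order is broken.
        out_frames.extend(frames[start:end])
        out_labels.extend(labels[start:end])
    return out_frames, out_labels


def ilfr_split(frames, labels, n):
    """Split one utterance into n equidistant sub-utterances.

    Assumption: sub-utterance k takes frames k, k + n, k + 2n, ...
    so every frame is used exactly once instead of being discarded.
    """
    return [(frames[k::n], labels[k::n]) for k in range(n)]
```

Under these assumptions, a 10-frame utterance with N = 3 yields three sub-utterances holding frames 0, 3, 6, 9, then 1, 4, 7, then 2, 5, 8. How the paper chooses the permutation probability and N is not stated in the abstract.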
Keywords: RNN-based acoustic model; acoustic trajectory; lower frame rate; word-level permutation
Indexed By: EI
Language: English
Document Type: Conference paper
Identifier: http://ir.ia.ac.cn/handle/173211/15429
Collection: Research Center for Digital Content Technology and Services, Auditory Models and Cognitive Computing
Corresponding Author: Yuanyuan Zhao
Affiliation: 1. Institute of Automation, Chinese Academy of Sciences
2. University of Chinese Academy of Sciences
Recommended Citation
GB/T 7714
Yuanyuan Zhao,Shiyu Zhou,Shuang Xu,et al. Word-level Permutation and Improved Lower Frame Rate for RNN-Based Acoustic Modeling[C],2017:859-869.
Files in This Item:
File Name/Size: Word-Level Permutati (513KB); DocType: Conference paper; Access: Open access; License: CC BY-NC-SA
File name: Word-Level Permutation and Improved Lower Frame Rate for RNN-Based Acoustic Modeling.pdf
Format: Adobe PDF