CTC Regularized Model Adaptation for Improving LSTM RNN Based Multi-Accent Mandarin Speech Recognition
Yi, Jiangyan1,2; Wen, Zhengqi1; Tao, Jianhua1,2,3; Ni, Hao1,2; Liu, Bin1; Wen ZQ(温正棋)
2018-07-01
Journal: JOURNAL OF SIGNAL PROCESSING SYSTEMS FOR SIGNAL IMAGE AND VIDEO TECHNOLOGY
Volume: 90, Issue: 7, Pages: 985-997
Article type: Article
Abstract: This paper proposes a novel regularized adaptation method to improve multi-accent Mandarin speech recognition. The acoustic model is a long short-term memory recurrent neural network trained with a connectionist temporal classification loss function (LSTM-RNN-CTC). Directly adjusting the network parameters with a small adaptation set may lead to over-fitting. To avoid this problem, a regularization term is added to the original training criterion; it forces the conditional probability distribution estimated by the adapted model to stay close to that of the accent-independent model. Moreover, only the accent-specific output layer needs to be fine-tuned with this adaptation method. Experiments are conducted on the RASC863 and CASIA regional accented speech corpora. The results show that the proposed method clearly improves on the LSTM-RNN-CTC baseline model and also outperforms other adaptation methods.
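The abstract does not give the exact form of the regularization term, so the following is only a minimal sketch of the general idea, under stated assumptions: the regularizer is taken to be a per-frame KL divergence between the accent-independent and adapted output distributions, it is combined with the CTC loss by linear interpolation, and the CTC loss itself is passed in as a precomputed number. The names `regularized_loss`, `frame_kl`, and the weight `rho` are hypothetical and are not taken from the paper.

```python
import math

def softmax(logits):
    """Numerically stable softmax over one frame's output-layer activations."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def frame_kl(p_ref, p_new, eps=1e-12):
    """KL(p_ref || p_new) for one frame; eps guards against log(0)."""
    return sum(p * math.log((p + eps) / (q + eps)) for p, q in zip(p_ref, p_new))

def regularized_loss(ctc_loss, ai_logits, adapted_logits, rho=0.5):
    """Interpolate a precomputed CTC loss with the mean per-frame KL divergence
    between the accent-independent (ai) and adapted output distributions.

    rho is an assumed interpolation weight: rho = 0 reduces to plain
    fine-tuning, larger rho pulls the adapted model toward the
    accent-independent one.
    """
    kls = [frame_kl(softmax(a), softmax(b))
           for a, b in zip(ai_logits, adapted_logits)]
    mean_kl = sum(kls) / len(kls)
    return (1.0 - rho) * ctc_loss + rho * mean_kl

# Two frames, three output symbols (toy numbers, not real data).
ai = [[0.1, 0.2, 0.7], [1.0, 0.0, -1.0]]
adapted = [[0.3, 0.1, 0.6], [0.8, 0.2, -1.0]]
loss = regularized_loss(2.0, ai, adapted, rho=0.5)
```

In a setup like the one the abstract describes, only the accent-specific output layer would receive gradients from such a loss, while the shared layers stay fixed; that part is not shown here.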
Keywords: Multi-accent; Mandarin speech recognition; LSTM-RNN-CTC; Model adaptation; CTC regularization
WOS headings: Science & Technology; Technology
DOI: 10.1007/s11265-017-1291-1
Indexed by: SCI
Language: English
Funding: National High-Tech Research and Development Program of China (863 Program) (2015AA016305); National Natural Science Foundation of China (NSFC) (61425017, 61403386, 61305003); Strategic Priority Research Program of the CAS (XDB02080006); Major Program for the National Social Science Fund of China (13ZD189)
WOS research areas: Computer Science; Engineering
WOS categories: Computer Science, Information Systems; Engineering, Electrical & Electronic
WOS accession number: WOS:000433555600004
Document type: Journal article
Item identifier: http://ir.ia.ac.cn/handle/173211/19882
Collection: National Laboratory of Pattern Recognition_Speech Interaction
Corresponding author: Wen ZQ (温正棋)
Author affiliations:
1. Chinese Acad Sci, Inst Automat, Natl Lab Pattern Recognit, Beijing, Peoples R China
2. Univ Chinese Acad Sci, Sch Comp & Control Engn, Beijing, Peoples R China
3. Chinese Acad Sci, Inst Automat, CAS Ctr Excellence Brain Sci & Intelligence Techn, Beijing, Peoples R China
Recommended citation
GB/T 7714: Yi, Jiangyan, Wen, Zhengqi, Tao, Jianhua, et al. CTC Regularized Model Adaptation for Improving LSTM RNN Based Multi-Accent Mandarin Speech Recognition[J]. JOURNAL OF SIGNAL PROCESSING SYSTEMS FOR SIGNAL IMAGE AND VIDEO TECHNOLOGY, 2018, 90(7): 985-997.
APA: Yi, Jiangyan, Wen, Zhengqi, Tao, Jianhua, Ni, Hao, Liu, Bin, & 温正棋. (2018). CTC Regularized Model Adaptation for Improving LSTM RNN Based Multi-Accent Mandarin Speech Recognition. JOURNAL OF SIGNAL PROCESSING SYSTEMS FOR SIGNAL IMAGE AND VIDEO TECHNOLOGY, 90(7), 985-997.
MLA: Yi, Jiangyan, et al. "CTC Regularized Model Adaptation for Improving LSTM RNN Based Multi-Accent Mandarin Speech Recognition". JOURNAL OF SIGNAL PROCESSING SYSTEMS FOR SIGNAL IMAGE AND VIDEO TECHNOLOGY 90.7 (2018): 985-997.
Files in this item
File name/size: 10.1007_s11265-017-1 (1416KB)
Document type: Journal article
Version type: Author accepted manuscript
Access type: Open access
License: CC BY-NC-SA
File name: 10.1007_s11265-017-1291-1.pdf
Format: Adobe PDF