Research on Key Techniques of Children's Speech Recognition
Original title: 儿童语音识别中的关键技术研究
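Author: Ma Ruitang (马瑞堂)
Degree type: Master of Engineering
Advisor: Li Chengrong (李成荣)
Date: 2007-06-10
Degree-granting institution: Graduate University of the Chinese Academy of Sciences
Place of conferral: Institute of Automation, Chinese Academy of Sciences
Major: Computer Application Technology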
Keywords: Children's Speech Recognition; Children's Speech Analysis; Vocal Tract Length Normalization; Speech Interaction
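Abstract: After decades of painstaking research, speech recognition technology has advanced greatly and is gradually being applied in daily life. Some problems remain, however, and children's speech recognition in particular has become a major obstacle to wider adoption. In our deployed system, 84% of the speech data came from children, yet when a system trained on adult speech is used to recognize children's speech, recognition performance drops sharply.
This thesis focuses on key techniques for children's speech recognition. The work covers the following areas:
1. Analysis of the characteristics of children's speech. Using an existing children's speech database, we extracted the fundamental frequency and formants of children's speech, analyzed how children's speech changes with age, and identified the differences between children's and adults' speech.
2. Adaptation techniques for children's speech. We compared the performance of models trained separately on male, female, and mixed speech, and applied speaker adaptation based on vocal tract length normalization to children's speech recognition. Building on this, we proposed a method based on dynamic adjustment of a proportional threshold, which further improved the recognition rate.
3. Human-machine speech interaction technology and modules. We describe the DSP platform, recognition system optimization, dialogue management, and related techniques, as well as applications of the interaction module.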
Alternative abstract: Speech recognition has approached maturity after many years of study and is now used in daily life. Problems still remain in practical applications, however, especially in recognizing children's speech. We found that 84% of the data recorded by the robot came from children, yet recognition experiments using acoustic models trained on adult speech and tested on children's speech show a clear degradation in performance.
This thesis focuses on key techniques for children's speech recognition. The work covers several aspects:
1. Children's speech analysis. Using the children's speech database, we measured pitch and formant frequencies, analyzed the effect of age on children's speech, and identified the differences between children's and adults' speech.
2. Children's speech adaptation techniques. Recognition experiments were carried out with several acoustic models: one trained on children's speech, one on boys' speech, and one on girls' speech. To improve the performance of children's speech recognition, a new approach based on vocal tract length normalization with a dynamically adjusted scale threshold is introduced.
3. Speech interaction technology and module. The thesis describes an embedded implementation on DSP, optimization techniques, and dialogue management, along with applications of the module.
Call number: XWLW1095
Other identifier: 200428014628011
Language: Chinese
Document type: Thesis
Item identifier: http://ir.ia.ac.cn/handle/173211/7409
Collection: Graduates_Master's Theses
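Recommended citation
GB/T 7714
Ma Ruitang (马瑞堂). 儿童语音识别中的关键技术研究 (Research on Key Techniques of Children's Speech Recognition) [D]. Institute of Automation, Chinese Academy of Sciences. Graduate University of the Chinese Academy of Sciences, 2007.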
Files in this item
File name/size: CASIA_20042801462801 (1004KB)
Access type: Not yet open
License: CC BY-NC-SA