Research on New Algorithms for HMM-Based Speech Recognition (HMM-Based语音识别新算法的研究)
Li Gongjun (李功俊)
1996-04-01
Degree type: Doctor of Engineering
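Chinese abstract: This thesis gives a comprehensive and systematic account of the history and current state of speech recognition research in China and abroad. Drawing on related disciplines, it analyzes the acoustic, phonetic, and phonological features of speech and questions of speech perception. It also analyzes and summarizes the current research priorities in the theory and implementation of speech recognition, together with the challenges the field faces now and is likely to face for a long time. During the doctoral program, the author's research covered several important topics in speech recognition: auditory models, speech feature analysis and feature selection, acoustic-phonetic modeling, model training criteria and training algorithms, and recognition algorithms. To address problems in several of these topics, the author proposes a series of original algorithms that are effective in improving the recognition rate, and develops and refines the related theory and methods.

Acoustic-phonetic modeling is a key stage of speech recognition. On this topic, the author:

1. Systematically analyzes and summarizes the two main existing approaches to acoustic-phonetic modeling: modeling at the level of unit HMMs, and modeling at the phonetic level at which units are constructed. The author proposes new concepts such as the acoustic-phonetic symbol system in order to change the traditional pattern of acoustic-phonetic modeling. Through the analysis of a concrete acoustic-phonetic modeling scheme, the author explains why such a symbol system is necessary and what it offers: it organically combines the transformations between acoustic-phonetic symbol systems with the mathematical models that evaluate those transformations, which improves the accuracy of acoustic-phonetic modeling.

2. Points out that in traditional acoustic-phonetic modeling, a word template is usually formed by simply concatenating unit templates. In Viterbi decoding, an initial (consonant) unit has a short duration, so it contributes less than the final (vowel) unit to the overall matching score of the syllable. Moreover, in transition segments with strong coarticulation, the phonetic features of the initial and the final spread into and overlap each other rather than simply concatenating. The author therefore proposes a new model, the unit-template-coupled model: in a transition segment, every observation output is the joint result of the two adjacent unit templates. Coupling lets the unit models share the transition segment, so the duration of the initial unit is relatively increased and the model describes coarticulation more accurately. The Baum-Welch algorithm can be used to estimate the parameters of this model without major modification. Experiments show that the unit-template-coupled model clearly improves the recognition rate.

Model training and parameter estimation are core problems in implementing speech recognition algorithms. On this topic, the author:

1. Analyzes the effect of differing utterance durations in the training corpus: with the Baum auxiliary function, longer training utterances have smaller output-probability likelihoods, so training utterances of different durations contribute unequally to model training. The author proposes three schemes to address this problem:
(i) A geometric-mean algorithm for the output-probability likelihood: the likelihood of each training utterance is geometrically averaged over its own duration, and the result is used as the objective function for parameter estimation.
(ii) Based on path …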
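Scheme (i) can be illustrated with a short sketch. For an utterance O of T frames, the geometric mean of its likelihood is P(O)^(1/T), which is log P(O) / T in the log domain. The code below computes this for a discrete-observation HMM with the standard forward algorithm. The function names, the discrete emission model, and the choice to sum the normalized log-likelihoods over utterances are assumptions made for illustration; the record does not give the thesis's actual formulation.

```python
import math


def log_sum_exp(values):
    """Numerically stable log(sum(exp(v))) for a list of log-domain values."""
    m = max(values)
    if m == -math.inf:
        return -math.inf
    return m + math.log(sum(math.exp(v - m) for v in values))


def forward_log_likelihood(obs, log_pi, log_A, log_B):
    """log P(obs | model) for a discrete HMM, via the forward algorithm.

    log_pi[i]   : log initial probability of state i
    log_A[i][j] : log transition probability from state i to state j
    log_B[j][k] : log probability that state j emits symbol k
    """
    if not obs:
        raise ValueError("observation sequence must not be empty")
    n = len(log_pi)
    alpha = [log_pi[i] + log_B[i][obs[0]] for i in range(n)]
    for symbol in obs[1:]:
        alpha = [
            log_sum_exp([alpha[i] + log_A[i][j] for i in range(n)])
            + log_B[j][symbol]
            for j in range(n)
        ]
    return log_sum_exp(alpha)


def duration_normalized_log_likelihood(obs, log_pi, log_A, log_B):
    """Log of the geometric mean P(obs)^(1/T), i.e. log P(obs) / T."""
    return forward_log_likelihood(obs, log_pi, log_A, log_B) / len(obs)


def geometric_mean_objective(utterances, log_pi, log_A, log_B):
    """Assumed objective: sum of duration-normalized log-likelihoods."""
    return sum(
        duration_normalized_log_likelihood(u, log_pi, log_A, log_B)
        for u in utterances
    )
```

With the unnormalized likelihood, a longer utterance has a lower log P(O) simply because more per-frame terms are accumulated. Dividing by T puts utterances of different durations on a per-frame scale, which is the balancing effect the abstract describes.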
English abstract: The history and current state of speech recognition research are presented comprehensively and systematically. We examine acoustic features, phonetic features, phonological features, and auditory perception in light of related theories and disciplines. We also analyze and summarize the current theoretical and technical focus of speech recognition and consider the challenges it faces now and will face for a long time. During my Ph.D. studies, my research covered several important subjects in speech recognition: auditory models, speech feature analysis and selection, acoustic-phonetic modeling, paradigms for HMM training and the corresponding algorithms, and new approaches to speech recognition. To address the problems identified in these subjects, we present original approaches that are effective in improving the recognition rate and carry out related theoretical research.

On acoustic-phonetic modeling, a key stage in speech recognition, we:

1. Systematically analyze and summarize two kinds of approach, acoustic-phonetic modeling based on the structure of unit HMMs and modeling based on phonetic units, and then propose new terms such as the acoustic-phonetic symbol system to move beyond the conventional framework. A specific scheme based on the system illustrates the importance and necessity of the proposed acoustic-phonetic symbol system in speech recognition. The analysis also shows that the system makes it possible to organically merge symbol-based transforms among the intermediate sets of the system, acoustic events, and linguistic units, and to further improve the accuracy of acoustic-phonetic modeling.

2. Point out that in conventional acoustic-phonetic modeling approaches, a word template is obtained by simply cascading unit templates. In Viterbi decoding, consonant units contribute less to the overall matching score than vowel units because of their short duration. In transition segments with strong coarticulation, the phonetic features of consonants and vowels extend and overlap. To address these problems, we propose a new model, the unit-template-coupled model. In a speech transition segment, every observation is generated by coupled unit templates. The model lets the unit templates share the transition segment, so the duration of consonants is relatively lengthened. The model also describes coarticulation more accurately. No major revision of the Baum-Welch algorithm is required to estimate the parameters. The experimental results show that the error rate decreases markedly in coupled-model-based speech recognition.

On HMM training and parameter estimation, a core problem in speech recognition, we:

1. Analyze: due to the different durations of the speech materials used for training, the Baum auxiliary function results …
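The remark about Viterbi decoding can be made concrete with a small sketch. Along a fixed alignment, the total log score is a sum of per-frame terms, so a unit that occupies fewer frames supplies fewer terms. The frame counts and per-frame scores below are invented purely for illustration and are not data from the thesis; `per_unit_score` is a hypothetical helper, not a function from the original work.

```python
from collections import defaultdict


def per_unit_score(frame_log_scores, alignment):
    """Group per-frame log scores by the unit each frame is aligned to."""
    if len(frame_log_scores) != len(alignment):
        raise ValueError("scores and alignment must have the same length")
    totals = defaultdict(float)
    for score, unit in zip(frame_log_scores, alignment):
        totals[unit] += score
    return dict(totals)


# Hypothetical syllable: 3 consonant frames followed by 12 vowel frames,
# every frame given the same illustrative log score.
alignment = ["consonant"] * 3 + ["vowel"] * 12
frame_log_scores = [-2.0] * len(alignment)

contribution = per_unit_score(frame_log_scores, alignment)
total = sum(contribution.values())
share = {unit: score / total for unit, score in contribution.items()}
```

Even with identical per-frame scores, the consonant accounts for only 3 of the 15 terms in the total. A model that lets adjacent units share the transition frames, as the coupled model is described as doing, raises the number of frames attributed to the shorter unit.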
Language: Chinese
Document type: Thesis
Item identifier: http://ir.ia.ac.cn/handle/173211/5660
Collection: Graduates_Doctoral Theses
Recommended citation
GB/T 7714
Li Gongjun (李功俊). Research on New Algorithms for HMM-Based Speech Recognition [D]. Institute of Automation, Chinese Academy of Sciences, 1996.