Research on New Algorithms for HMM-Based Speech Recognition

	Research on New Algorithms for HMM-Based Speech Recognition
	Li Gongjun (李功俊)
	1996-04-01
Degree type	Doctor of Engineering
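Chinese abstract	This thesis gives a comprehensive and systematic account of the history and current state of speech recognition research in China and abroad. Drawing on related disciplines, it analyzes the acoustic, phonetic and phonological features of speech and issues of speech perception, and it summarizes the current research priorities in the theory and technical implementation of speech recognition, together with the challenges the field faces now and is likely to face for a considerable time. During the doctoral studies, the author's research covered several important topics in speech recognition: auditory models, speech feature analysis and feature selection, acoustic-phonetic modeling, model training criteria and training algorithms, and recognition algorithms. To address problems in several of these topics, the author proposes a series of original algorithms that are effective in improving the recognition rate, and develops and refines the related theory and methods. Acoustic-phonetic modeling is a key stage of speech recognition. On this topic, the author: 1. Systematically analyzes and summarizes the two main existing approaches to acoustic-phonetic modeling, namely modeling at the level of the unit HMM and modeling at the phonetic level at which units are constructed, and proposes new concepts such as the acoustic-phonetic symbol system in order to change the traditional pattern of acoustic-phonetic modeling. Through the analysis of a concrete acoustic-phonetic modeling scheme, the thesis explains why an acoustic-phonetic symbol system is needed and what its advantages are: it integrates the transformations between acoustic-phonetic symbol systems with the mathematical models that evaluate those transformations, improving the accuracy of acoustic-phonetic modeling. 2. Points out that in traditional acoustic-phonetic modeling, a word template is usually formed by simply concatenating unit templates. In Viterbi decoding, because an initial (consonant) unit has a short dwell time, it contributes less to the overall matching score of the syllable than the final (vowel) unit. Moreover, in transition segments with strong coarticulation, the phonetic features of initials and finals extend into and overlap each other rather than being simply concatenated. The author therefore proposes a new model, the unit-template-coupled model: in a speech transition segment, every observation is the output of the two adjacent unit templates acting jointly. Coupling lets the unit models share the transition segment, so the dwell time of the initial unit increases in relative terms and the model describes coarticulation more accurately. The Baum-Welch algorithm can be used to estimate the parameters of this model without major modification. Experimental results show that the unit-template-coupled model clearly improves the recognition rate. Model training and parameter estimation are core problems in implementing speech recognition algorithms. On this topic, the author: 1. Observes that, because training utterances differ in duration, the introduction of the Baum auxiliary function gives longer utterances a smaller output probability likelihood, so that utterances of different durations contribute unequally to model training. To address this, the author proposes three schemes: (i) Geometric averaging of the output probability likelihood: the output probability likelihood of each training utterance is geometrically averaged over its own duration, and the result is used as the objective function for parameter estimation. (ii) Based on path ...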
English abstract	The history and current state of research in speech recognition are reviewed comprehensively and systematically. Drawing on related theory and disciplines, we examine acoustic features, phonetic features, phonological features and auditory perception. We then summarize and analyze the current theoretical and technical focus of speech recognition and discuss the challenges the field faces now and will likely face for a long time. During my Ph.D. studies, my research covered several important subjects in speech recognition: auditory models, speech feature analysis and selection, acoustic-phonetic modeling, criteria for HMM training and the corresponding algorithms, and new approaches to speech recognition. To address the problems identified in these subjects, we present original approaches that are effective in improving the recognition rate and carry out related theoretical research. On acoustic-phonetic modeling, a key link in speech recognition, we: 1. Systematically analyze and summarize two kinds of approaches, acoustic-phonetic modeling based on the structure of unit HMMs and modeling based on phonetic units, and then propose new terms such as the acoustic-phonetic symbol system to break through the conventional framework. A specific conception based on the system illustrates the importance and necessity of the proposed acoustic-phonetic symbol system in speech recognition. The analysis also shows that this system makes it possible to integrate the symbol-based transforms among the intermediate sets of the system, acoustic events and linguistic units, and thereby to further improve the accuracy of acoustic-phonetic modeling. 2. Point out that in conventional acoustic-phonetic modeling approaches, a word template is obtained by simply cascading unit templates. In the Viterbi decoding process, consonant units contribute less to the overall matching score than vowel units because of their short duration. In transition segments with strong co-articulation, the phonetic features of consonants and vowels extend and overlap. To address these problems, we propose a new model, the unit-template-coupled model. In a speech transition segment, every observation is generated by the coupled unit templates. The unit-template-coupled model lets the unit templates share the transition segment, so the duration of consonants is relatively lengthened. The model can also describe co-articulation more accurately. No major revision of the Baum-Welch algorithm is required to estimate the desired parameters. The experimental results show that the error rate decreases markedly in coupled-model-based speech recognition. On HMM training and parameter estimation, a core problem in speech recognition, we: 1. Analyze: due to the different durations of the training speech materials, the Baum auxiliary function results
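Scheme (i) in the Chinese abstract, taking the geometric mean of each training utterance's likelihood over its own duration, can be illustrated with a minimal sketch. This is not the thesis's implementation: the function names, the toy data, and the choice to sum the per-utterance terms into one objective are all assumptions made for illustration. The only fact relied on is that the geometric mean of a likelihood P over T frames is P^(1/T), so its logarithm is log(P)/T.

```python
import math


def per_frame_log_likelihood(log_likelihood, num_frames):
    """Log of the geometric mean of an utterance likelihood over its duration.

    If P is the likelihood of an utterance with T frames, the geometric mean
    is P ** (1 / T), whose logarithm is log(P) / T.
    """
    if num_frames <= 0:
        raise ValueError("num_frames must be positive")
    return log_likelihood / num_frames


def duration_normalized_objective(utterances):
    """Sum per-frame log-likelihoods over (log_likelihood, num_frames) pairs.

    Summing the per-utterance terms is an assumption of this sketch, not a
    detail given in the abstract.
    """
    return sum(per_frame_log_likelihood(ll, t) for ll, t in utterances)


# Toy data (made up): every frame has probability 0.5 under the model.
short_utt = (10 * math.log(0.5), 10)    # 10 frames
long_utt = (100 * math.log(0.5), 100)   # 100 frames

# Raw log-likelihoods differ by a factor of ten in magnitude...
raw_short, raw_long = short_utt[0], long_utt[0]

# ...but after duration normalization both utterances score the same.
norm_short = per_frame_log_likelihood(*short_utt)
norm_long = per_frame_log_likelihood(*long_utt)
```

In this toy case the longer utterance has a much smaller raw log-likelihood only because it has more frames; dividing by the frame count puts the two utterances on the same scale, which is the imbalance the abstract describes.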
Language	Chinese
Document type	Thesis
Item identifier	http://ir.ia.ac.cn/handle/173211/5660
Collection	Graduates_Doctoral Dissertations
Recommended citation (GB/T 7714)	Li Gongjun. Research on New Algorithms for HMM-Based Speech Recognition [D]. Institute of Automation, Chinese Academy of Sciences, 1996.