面向限定领域语音识别的汉语语言模型研究 (Chinese Language Modeling for Speech Recognition in a Limited Domain)

	面向限定领域语音识别的汉语语言模型研究
Alternative title	Chinese Language Modeling for Speech Recognition in a Limited Domain
	吕振宇
	2010-05-31
Degree type	Master of Engineering
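Abstract (translated from Chinese)	Language models play an irreplaceable role in speech recognition systems. As speech recognition technology continues to develop and move into practical use, recognition tasks cover an ever wider range of domains with more complex backgrounds, which calls for high-performance language models that reduce the search space and thereby improve recognition speed and accuracy. In limited domains, and especially in spoken-language speech recognition, training corpora are often hard to obtain, so the language model must be trained according to the characteristics of the available corpus. The main work of this thesis is as follows. (1) Based on a broad survey of the state of the art in language modeling, various language modeling methods are studied and their performance is compared. (2) An initial Chinese language model is built: the corpus is first preprocessed, and a new-word detection algorithm extracts new words from the corpus and adds them to the existing lexicon; the corpus is then segmented into words, and Chinese language models are built with various language modeling algorithms and compared. (3) For smoothing, a language model smoothing algorithm based on hierarchical class back-off interpolation is proposed. The method uses a word-class tree and, when estimating N-gram probabilities, combines vertical back-off with horizontal interpolation, giving priority to the more specific context while still making effective use of lower-order N-gram information. Experiments show that it reduces perplexity by 8.9% relative to a class-based language model. (4) A new language model adaptation framework is proposed, in which the final adapted model is obtained by changing the structure of the background corpus and the structure of the background model. Experiments show that within the new framework both the Linear Interpolation (LI) and Minimum Discrimination Information (MDI) methods outperform their counterparts in the general framework, reducing perplexity by 5.2% and 36.8% respectively. (5) Language modeling is attempted for a small, limited-domain spoken corpus with no distinctive characteristics: the corpus is first expanded, and the LI adaptation method is then applied within the proposed framework, reducing the speech recognition error rate by 0.7% (absolute) relative to LI adaptation in the general framework.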
English abstract	Language models play an irreplaceable role in speech recognition systems. As speech recognition technology evolves and becomes practical, it must serve increasingly broad domains and complex backgrounds, which requires high-performance language models that reduce the search space and thereby improve recognition speed and accuracy. In some limited domains, however, it is difficult to obtain enough data to train the language model, so the model must be trained according to the characteristics of the available corpus. The work of this thesis is as follows. We surveyed the current state of language modeling and studied several language modeling algorithms. We then built Chinese language models: the corpus was first preprocessed, including special-character processing, Chinese word segmentation, and new-word detection, and the models were then built with several algorithms. We proposed a novel interpolated language model that combines interpolation with backing off along a class hierarchy. A cluster tree is used to balance the generalization ability of classes against word specificity when estimating the likelihood of an n-gram event, and interpolation is used to exploit lower-order n-gram information. We presented a new framework for language model adaptation based on modifying the structures of the background corpus and the background language model. Two widely used adaptation approaches, Linear Interpolation (LI) and Minimum Discrimination Information (MDI), are used to modify the structure of the trained background language model in the new framework, while the Maximum A Posteriori (MAP) approach is used to modify the structure of the background corpus. Experiments showed that both techniques in the new framework yield significant perplexity reductions over the LI and MDI methods in the general adaptation framework, of about 5.2% and 36.8% respectively.
We attempted to build a language model for a small, domain-dependent Chinese corpus with no distinctive characteristics. First, we extended the corpus artificially, and then applied the LI method within the proposed adaptation framework. The word error rate of speech recognition was reduced by 0.7% (absolute).
Keywords	Language Model; Smoothing; Speech Recognition; Spoken Language; Adaptation; Limited Domain
Language	Chinese
Document type	Thesis
Item identifier	http://ir.ia.ac.cn/handle/173211/7514
Collection	Graduates_Master's Theses
Recommended citation (GB/T 7714)	吕振宇. 面向限定领域语音识别的汉语语言模型研究[D]. 中国科学院自动化研究所. 中国科学院研究生院,2010.
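
The abstract reports perplexity reductions for Linear Interpolation (LI) adaptation. As a rough illustration only, and not the thesis's framework or experimental setup, the sketch below shows textbook LI between a background model and an in-domain model, together with the perplexity calculation. The function names, the toy corpora, the unigram models, the add-one smoothing, and the weight 0.5 are all assumptions made for this example.

```python
import math
from collections import Counter


def train_unigram(tokens, vocab):
    """Add-one smoothed unigram model over a fixed vocabulary (toy assumption)."""
    counts = Counter(tokens)
    total = len(tokens) + len(vocab)
    return {w: (counts[w] + 1) / total for w in vocab}


def interpolate(p_background, p_domain, lam):
    """Linear interpolation: P(w) = lam * P_domain(w) + (1 - lam) * P_background(w)."""
    return {w: lam * p_domain[w] + (1 - lam) * p_background[w] for w in p_background}


def perplexity(model, tokens):
    """Perplexity is exp of the average negative log-probability per token."""
    log_prob = sum(math.log(model[w]) for w in tokens)
    return math.exp(-log_prob / len(tokens))


# Hypothetical toy data: a general background corpus and a small in-domain one.
background = "the cat sat on the mat the dog sat on the rug".split()
domain = "book a flight book a hotel".split()
test = "book a flight on the mat".split()

vocab = sorted(set(background) | set(domain) | set(test))
p_bg = train_unigram(background, vocab)
p_dom = train_unigram(domain, vocab)
p_li = interpolate(p_bg, p_dom, lam=0.5)  # 0.5 is an arbitrary illustrative weight

ppl_bg = perplexity(p_bg, test)
ppl_dom = perplexity(p_dom, test)
ppl_li = perplexity(p_li, test)
print(f"background: {ppl_bg:.2f}  domain: {ppl_dom:.2f}  interpolated: {ppl_li:.2f}")
```

Because the logarithm is concave, the interpolated model's perplexity on any test set can never exceed that of the worse of its two components. In practice the weight is usually tuned on held-out in-domain data rather than fixed in advance.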

Files in this item
File name/size	Document type	Version type	Access type	License
CASIA_20072801462806 (807KB)			Restricted access	CC BY-NC-SA