Research on Offline Handwritten Chinese Text Recognition Methods (脱机手写中文文本识别方法研究)


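	Research on Offline Handwritten Chinese Text Recognition Methods (脱机手写中文文本识别方法研究)
Alternative title	A Study on Handwritten Chinese Text Recognition
	王秋锋 (Wang Qiufeng)
	2012-05-31
Degree type	Doctor of Engineering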
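Abstract (translated from Chinese)	With the rapid development of personal computers and the emergence of various digitization devices, more and more paper documents need to be converted into electronic form for easier search, editing, and transmission. Although handwritten Chinese character recognition has been studied for nearly half a century and has achieved great success, this research has mostly focused on isolated character recognition or on neatly written text. Only recently has the recognition of freely written handwritten Chinese text begun to receive attention, and judging from existing work, the methods are still immature and recognition performance is very low. This thesis therefore studies offline handwritten Chinese text recognition in depth, covering the path evaluation criterion, the path search algorithm, language model adaptation, and the integration of common sense knowledge. The path evaluation criterion and the path search algorithm are the key factors in designing a text recognition system; language model adaptation and the integration of common sense knowledge both aim to improve the accuracy of the linguistic context model in the path evaluation criterion, and thereby the performance of the text recognition system. The main contributions are as follows: (1) Based on Bayesian decision theory, a statistical path evaluation criterion is given. The criterion fuses multiple kinds of contextual information, including character classifier information, geometric context, and linguistic context, and the classifier outputs are converted to approximate posterior probabilities by confidence transformation; in addition, several correction methods are proposed to overcome the influence of path length; finally, to better balance the multiple context models, a maximum character accuracy learning criterion is given to learn the fusion weights among the models automatically. (2) To search quickly and accurately for the optimal segmentation-recognition path, a refined beam search algorithm is proposed. The algorithm splits its pruning strategy into two steps, which effectively improves the accuracy of the retained paths; meanwhile, during the search, character classification confusion information and linguistic context are used dynamically to augment the candidate character classes, further improving the correctness of the search. (3) To improve the domain adaptability of the language model in the path evaluation criterion, a dynamic unsupervised language model adaptation method is implemented: using a prepared set of language models covering multiple domains, a two-pass recognition strategy selects the best-matching language model, further improving the domain relevance of the path evaluation criterion. In addition, to overcome the storage problem of the language models, a multi-strategy language model compression method is proposed, including compression of each individual language model and joint compression of multiple related language models using principal component analysis (PCA), yielding a language model set with a small storage footprint. (4) To enable computers to recognize documents as correctly as humans do, this thesis attempts to integrate common sense knowledge into the document recognition system in three ways: an embedded model, an independent model, and a combined model. The embedded model embeds common sense knowledge into the probability estimation of an ordinary language model to make its estimates more reliable, especially for N-grams that do not occur in the corpus; the independent model integrates common sense knowledge as a separate linguistic context model to compensate for the short-range context limitation of ordinary N-gram language models; the combined model combines these two approaches.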
English abstract	With the development of computing technology and the emergence of a variety of digitization devices, a growing number of paper documents need to be converted to electronic form to facilitate search, editing, and transmission. Handwritten Chinese character recognition has been studied for nearly half a century and has achieved tremendous advances. However, most work has focused on isolated character recognition or constrained handwriting recognition. Only in recent years has the analysis of unconstrained handwritten documents received intensive effort, and the reported performance is very low. This thesis presents a systematic study of handwritten Chinese text recognition, considering four major issues: the path evaluation criterion, the path search algorithm, language model adaptation, and the integration of common sense knowledge. The path evaluation and search techniques are crucial for handwritten text recognition. To achieve higher recognition performance, we introduce language model adaptation and integrate common sense knowledge. The main contributions of this work are as follows: 1. From the Bayesian decision view, we propose a statistical path evaluation function. This function integrates multiple contexts, including character classification, geometric context, and linguistic context, and we convert the classifier outputs to posterior probabilities via confidence transformation. To overcome the bias of path length, we present several heuristically modified path evaluation criteria. Moreover, to balance the effect of the different context models, the combining weights are optimized by a supervised parameter learning method using a maximum character accuracy criterion. 2. To find the path with the maximum path evaluation score efficiently, we present a refined beam search algorithm that splits the pruning strategy into two stages. Meanwhile, we improve the completeness of the candidate paths by augmenting candidate characters based on character confusion information and the linguistic context. 3. To overcome the mismatch between the given language model and the text being recognized, we present a dynamic and unsupervised language model adaptation method. Using a two-pass recognition strategy, we choose the best-matched language model from a pre-defined language model set to further improve recognition performance. Meanwhile, to overcome the storage problem of the pre-defined language model set, we compress each language model to a moderate size using two metho...
Keywords	Handwritten Text Recognition; Path Evaluation Criterion; Path Search Method; Language Model Adaptation; Common Sense Knowledge
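Language	Chinese
Document type	Thesis
Item identifier	http://ir.ia.ac.cn/handle/173211/6465
Collection	Graduates_Doctoral Dissertations
Recommended citation (GB/T 7714)	王秋锋 (Wang Qiufeng). Research on Offline Handwritten Chinese Text Recognition Methods [D]. Institute of Automation, Chinese Academy of Sciences. Graduate University of Chinese Academy of Sciences, 2012.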
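
The abstract describes scoring segmentation-recognition paths by combining classifier and linguistic context, then searching for the best path with a beam search. The following is only a minimal sketch of that general idea, not the thesis's method: it is a plain beam search with a single pruning step (not the two-stage refined beam search), it uses only a classifier score and a toy bigram score (no geometric context, no confidence transformation, no path-length correction), and every name and number in it (`beam_search`, `candidates`, `toy_bigram`, `lm_weight`, the probabilities) is a hypothetical chosen for illustration.

```python
import math
from typing import Callable, Dict, List, Tuple

# candidates[(start, end)] lists (character, classifier log-probability) pairs
# for the candidate pattern formed by primitive segments start..end-1.
Candidates = Dict[Tuple[int, int], List[Tuple[str, float]]]


def beam_search(
    n_segments: int,
    candidates: Candidates,
    lm_logprob: Callable[[str, str], float],
    lm_weight: float = 1.0,
    beam_width: int = 3,
) -> Tuple[float, str]:
    """Return (score, text) of the best path found over the candidate lattice."""
    # beams[i] holds partial paths that consume exactly the first i segments.
    beams: List[List[Tuple[float, str]]] = [[] for _ in range(n_segments + 1)]
    beams[0] = [(0.0, "")]
    for end in range(1, n_segments + 1):
        expanded: List[Tuple[float, str]] = []
        for (start, stop), options in candidates.items():
            if stop != end:
                continue
            for score, text in beams[start]:
                prev = text[-1] if text else "<s>"
                for char, cls_lp in options:
                    # Path score: classifier term plus weighted bigram term.
                    total = score + cls_lp + lm_weight * lm_logprob(prev, char)
                    expanded.append((total, text + char))
        # Pruning: keep only the beam_width best partial paths ending here.
        expanded.sort(key=lambda path: path[0], reverse=True)
        beams[end] = expanded[:beam_width]
    if not beams[n_segments]:
        return (float("-inf"), "")
    return beams[n_segments][0]


# Toy lattice (hypothetical numbers): segments 0 and 1 may be read as the two
# characters "女" + "子" or merged into the single character "好".
candidates: Candidates = {
    (0, 1): [("女", math.log(0.6)), ("好", math.log(0.1))],
    (1, 2): [("子", math.log(0.7))],
    (0, 2): [("好", math.log(0.8))],
    (2, 3): [("人", math.log(0.9)), ("入", math.log(0.1))],
}

# Toy bigram probabilities; unseen pairs fall back to a small floor value.
toy_bigram = {
    ("<s>", "好"): 0.5,
    ("<s>", "女"): 0.2,
    ("好", "人"): 0.6,
    ("女", "子"): 0.3,
    ("子", "人"): 0.1,
}


def lm_logprob(prev: str, char: str) -> float:
    return math.log(toy_bigram.get((prev, char), 0.01))


best_score, best_text = beam_search(3, candidates, lm_logprob)
print(best_text, best_score)
```

In this toy lattice the merged reading wins because both the classifier and the bigram term favor it. A fuller system along the lines of the abstract would add the remaining context terms and learn the combining weights rather than fixing `lm_weight` by hand.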

Files in this item
File name/size	Document type	Version type	Access type	License
CASIA_20081801462806 (13725KB)			Not yet open	CC BY-NC-SA