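An HMM-Based Recognition System for Off-line Unconstrained Handwritten English Words (基于HMM的脱机自由手写英文单词识别系统)
Alternative title: An HMM-Based Off-line Cursive English Word Recognition System
Liang Jiayu (梁佳玉)
Degree type: Master of Engineering
Advisor: Liu Yingjian (刘迎建)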
2004-07-01
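Degree-granting institution: Graduate School of the Chinese Academy of Sciences
Degree-granting location: Institute of Automation, Chinese Academy of Sciences
Major: Pattern Recognition and Intelligent Systems
Keywords: HMM; handwritten word recognition; LVQ; Hidden Markov Model; off-line word recognition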
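Abstract: Off-line recognition of unconstrained handwritten English words using HMMs (Hidden Markov Models) remains an active research topic in character recognition. It drives progress in both HMM theory and HMM applications. Off-line English word recognition also has broad application prospects in finance, postal services, and the conversion and transmission of information, and has therefore attracted many researchers.
This thesis first reviews recent applications of HMMs to off-line handwritten English word recognition. On that basis, it designs and implements a recognition system for off-line unconstrained handwritten English words. The system is HMM-based and uses two-stage recognition plus verification. The main work of the thesis is as follows:
1. In the preprocessing stage, the image is binarized, denoised, and skew-corrected, and the reference lines are extracted. For reference line extraction, the thesis combines the histogram with the variation, along the vertical direction, of the horizontal crossing count, which locates the reference lines relatively accurately.
2. Two groups of features are used, each extracted with a sliding window. Because a fixed-width sliding window depends heavily on writing style, the window width is determined dynamically from the horizontal crossing count. This avoids choosing the width empirically and, to some extent, overcomes the effect of differences in writing style.
3. In the HMM recognition stage, fuzzy (implicit) segmentation is used, and each word model is a linear concatenation of letter models. Because letters differ in width, the letter models do not all have the same number of states.
4. The system uses two-stage recognition plus verification. The first stage performs HMM recognition and, at the same time, backtracks with the Viterbi algorithm to obtain the best segmentation of the top three candidates. The results are passed to the second stage, where they are matched against reference points generated by the LVQ algorithm for verification. The two recognition results are then combined to produce the final output.
To verify the effectiveness of the system, it was tested on two sample sets, NIST and Cambridge, covering the two typical cases of multiple writers and a single writer. The results in both cases are fairly satisfactory.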
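The two quantities named in items 1 and 2 of the abstract, the row histogram and the horizontal crossing count, can be sketched as follows. The abstract does not state how the thesis combines them, so the threshold rule in `reference_lines`, the `frac` parameter, and the width rule in `window_width` are hypothetical choices made only for illustration, not the author's method.

```python
from typing import List, Tuple

Image = List[List[int]]  # binary word image: 1 = ink, 0 = background


def row_profile(img: Image) -> List[int]:
    """Horizontal projection histogram: number of ink pixels in each row."""
    return [sum(row) for row in img]


def row_crossings(img: Image) -> List[int]:
    """Horizontal crossing count: background-to-ink transitions in each row."""
    counts = []
    for row in img:
        prev, n = 0, 0
        for px in row:
            if px == 1 and prev == 0:
                n += 1
            prev = px
        counts.append(n)
    return counts


def reference_lines(img: Image, frac: float = 0.5) -> Tuple[int, int]:
    """Hypothetical rule (an assumption, not taken from the thesis): a row
    belongs to the core zone when both its profile and its crossing count
    reach `frac` of their maxima. The first and last such rows are returned
    as the upper and lower reference lines."""
    if not img:
        raise ValueError("empty image")
    prof, cross = row_profile(img), row_crossings(img)
    p_max, c_max = max(prof), max(cross)
    core = [i for i, (p, c) in enumerate(zip(prof, cross))
            if p >= frac * p_max and c >= frac * c_max]
    if not core:
        raise ValueError("no row satisfies both criteria")
    return core[0], core[-1]


def window_width(img: Image, upper: int, lower: int) -> int:
    """Hypothetical rule (an assumption): image width divided by the mean
    crossing count of the core rows, so denser writing gets a narrower
    sliding window."""
    cols = len(img[0])
    core = row_crossings(img)[upper:lower + 1]
    total = sum(core)
    if total == 0:
        return cols
    return max(1, round(cols * len(core) / total))
```

Calling `reference_lines(img)` and then `window_width(img, upper, lower)` on a binarized word image gives a core zone and a per-image window width. Any real system would tune or replace both rules.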
Alternative abstract: HMM-based off-line recognition of cursive English words, which advances both HMM theory and its applications, remains a focus of the OCR field. The technique also has many applications, such as postal address recognition, bank check reading, and generic content recognition. For these two reasons, cursive English word recognition has attracted researchers all over the world.
First, we introduce and analyze the applications of HMMs in off-line cursive English word recognition. Second, based on this analysis, we design and implement an HMM-based recognition system with a two-pass structure of recognition plus verification. The main contributions of this thesis are as follows:
1. In the preprocessing step, image binarization, noise removal, normalization, and reference line detection are applied. Since it is difficult to find the reference lines correctly from the profile alone, we combine transition counts with the profile. As a result, the reference lines are detected relatively accurately.
2. We use two groups of features, each extracted with a sliding window. The width of a sliding window is usually fixed or set to an average over the current samples, so writing style has a strong effect on the features. We therefore set the width dynamically, which overcomes the effect of writing style to a certain extent.
3. In the HMM-based recognition step, we use an implicit segmentation method. Each word model is built by linearly connecting character models. Because characters differ from one another, the number of states is not the same in every model.
4. The system adopts a two-pass structure. The first pass is HMM-based recognition. During recognition, the best segmentations of the top three candidates are traced back with the Viterbi algorithm, and the result is sent to the second pass. Verification is performed by matching against reference points found by LVQ. Finally, we combine the two results to obtain the final recognition output.
To verify the system, two tests were run on samples from NIST and Cambridge respectively. The first sample set was written by 500 people and the second by a single writer. The results are essentially satisfactory.
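The first pass described above (word models built by linearly connecting character models, with a Viterbi traceback that yields the character segmentation of the top candidates) can be illustrated with a minimal sketch. The discrete symbols, the toy emission tables in `LETTERS`, the shared self-loop probability `STAY`, and all function names are assumptions made for this example. The abstract does not give the thesis's features, model parameters, or LVQ verification, and none of them are reproduced here.

```python
import math
from typing import Dict, List, Tuple

NEG_INF = float("-inf")

# Hypothetical toy character models: one emission table per state.
# Models have different numbers of states, as the abstract describes.
LETTERS: Dict[str, List[Dict[str, float]]] = {
    "i": [{"|": 0.8, "o": 0.1, "-": 0.1}],
    "o": [{"o": 0.8, "|": 0.1, "-": 0.1},
          {"o": 0.7, "-": 0.2, "|": 0.1}],
}
STAY = 0.5  # assumed self-loop probability; 1 - STAY moves to the next state


def build_word_model(word: str):
    """Linearly connect character models; owner[s] is the character
    position in the word that state s belongs to."""
    states, owner = [], []
    for pos, ch in enumerate(word):
        for emis in LETTERS[ch]:
            states.append(emis)
            owner.append(pos)
    return states, owner


def viterbi_segment(word: str, obs: List[str]):
    """Return (log score, [(char, first_frame, last_frame), ...]) for the
    best left-to-right path that starts in the first state and ends in
    the last state of the word model."""
    if not obs:
        raise ValueError("empty observation sequence")
    states, owner = build_word_model(word)
    n, t_len = len(states), len(obs)
    log_stay, log_next = math.log(STAY), math.log(1.0 - STAY)

    def emit(s: int, o: str) -> float:
        p = states[s].get(o, 0.0)
        return math.log(p) if p > 0 else NEG_INF

    delta = [[NEG_INF] * n for _ in range(t_len)]
    back = [[0] * n for _ in range(t_len)]
    delta[0][0] = emit(0, obs[0])
    for t in range(1, t_len):
        for s in range(n):
            stay = delta[t - 1][s] + log_stay
            move = delta[t - 1][s - 1] + log_next if s > 0 else NEG_INF
            if move > stay:
                best, back[t][s] = move, s - 1
            else:
                best, back[t][s] = stay, s
            delta[t][s] = best + emit(s, obs[t])

    score = delta[-1][n - 1]
    if score == NEG_INF:
        return score, []
    # Trace the best state path back from the final state.
    path = [n - 1]
    for t in range(t_len - 1, 0, -1):
        path.append(back[t][path[-1]])
    path.reverse()
    # Cut the path wherever the owning character changes.
    segments: List[Tuple[str, int, int]] = []
    start = 0
    for t in range(1, t_len + 1):
        if t == t_len or owner[path[t]] != owner[path[t - 1]]:
            segments.append((word[owner[path[t - 1]]], start, t - 1))
            start = t
    return score, segments


def top_candidates(lexicon: List[str], obs: List[str], k: int = 3):
    """Score every lexicon word and keep the k best with their segmentations."""
    scored = [(w, *viterbi_segment(w, obs)) for w in lexicon]
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:k]
```

In a two-pass design like the one described, the segments returned by `top_candidates` would be what a second, verification stage receives. Here they are simply returned to the caller.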
Call number: XWLW784
Other identifier: 784
Language: Chinese
Document type: Thesis
Item identifier: http://ir.ia.ac.cn/handle/173211/6807
Collection: Graduates_Master's Theses
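Recommended citation
GB/T 7714
Liang Jiayu. An HMM-Based Off-line Cursive English Word Recognition System [D]. Institute of Automation, Chinese Academy of Sciences. Graduate School of the Chinese Academy of Sciences, 2004.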