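Text Detection and Extraction in Digital Images and Degraded Character Recognition (数字图像中文字检测抽取与退化字符识别)
Alternative title: Text Detection and Extraction in Digital Image & Degraded Character Recognition
Author: Liu Chunmei (刘春梅)
Date: 2006-05-29
Degree type: Doctor of Engineering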
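Chinese abstract (translated): Character recognition research has made important progress in recent years. Current methods handle clean, sharp character images well, but there is still no satisfactory solution for recognizing degraded characters in low-quality images with complex backgrounds. Two problems are currently the main research difficulties and application bottlenecks. The first is character recognition against complex backgrounds, which requires an automatic text detection and extraction system that detects and extracts the text from the complex background and passes it to an OCR system for recognition. The second is degraded character recognition in low-quality images: blurred writing, touching strokes, broken strokes, low resolution, and similar degradations greatly increase the difficulty of recognition and call for effective solutions in both theory and technique. This dissertation addresses text detection in images and the recognition of low-quality degraded characters. The main contributions are:
1. Guided by metasynthesis methodology, the dissertation proposes an integrated multi-feature method for detecting text in images and builds a corresponding automatic text detection system. The method fuses several features of text (color, edge, and texture features, plus some features of the text itself) so that it adapts to text detection in a wide range of complex images and improves the performance of the detection system. In addition, a multi-stage text detector is designed according to the text features and their computational cost. The stages are connected so that each one selects suitable features based on the result of the previous stage and applies the corresponding processing method, refining and sharpening the detection result stage by stage. This guards against missed and false detections and improves the performance and stability of the system.
2. For degraded character recognition, the dissertation proposes a method for judging the resolution quality of character images and builds a corresponding judgment system. A gray-level distribution feature is proposed for character images of different resolution quality, and this feature is used to assign each character image to a resolution quality level. The method is computationally simple and requires no comparison against a sharp reference image. It only needs to learn resolution quality from training samples in order to judge the resolution quality of an input character image.
3. The dissertation applies the resolution quality judgment method to multi-resolution degraded character recognition, proposes a multi-resolution adaptive recognition method for degraded characters, and builds a corresponding recognition system. Image quality information is incorporated into the recognition process, and integrated pattern recognition techniques are used to build a network of multiple classifiers that compensates for the low recognition rate and poor stability of a single classifier. An adaptive recognition and classification algorithm is proposed that raises the recognition rate to a relatively high level and gives an initial solution to several theoretical and technical problems of degraded character recognition in low-resolution images.
Built on intelligence theory and the idea of metasynthesis, this research on multi-resolution degraded character recognition is innovative within China and at the frontier internationally. The work is only a small exploratory step and remains at the exploratory stage of research.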
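The reference-free quality judgment described in item 2 can be illustrated with a minimal sketch. The abstract does not define the gray-level distribution feature or the learning procedure, so everything here is an assumption made for illustration: the feature is a plain normalized gray-level histogram, the learner is a nearest-centroid rule, and the names `gray_histogram`, `train_quality_centroids`, and `judge_quality` are hypothetical, not taken from the dissertation.

```python
from collections import defaultdict


def gray_histogram(image, bins=8):
    """Normalized gray-level histogram of a 2-D list of 0-255 pixel values.

    Assumption: stands in for the dissertation's gray-level distribution
    feature, whose exact definition the abstract does not give.
    """
    counts = [0] * bins
    total = 0
    for row in image:
        for value in row:
            counts[min(value * bins // 256, bins - 1)] += 1
            total += 1
    return [c / total for c in counts]


def train_quality_centroids(samples, bins=8):
    """Learn one mean histogram per quality level.

    samples: iterable of (image, quality_level) training pairs.
    Returns a dict mapping quality_level -> mean histogram.
    """
    sums = {}
    counts = defaultdict(int)
    for image, level in samples:
        hist = gray_histogram(image, bins)
        previous = sums.get(level, [0.0] * bins)
        sums[level] = [s + h for s, h in zip(previous, hist)]
        counts[level] += 1
    return {
        level: [s / counts[level] for s in total]
        for level, total in sums.items()
    }


def judge_quality(image, centroids):
    """Assign the quality level whose centroid is nearest to the image's
    histogram (squared Euclidean distance). No reference image is needed."""
    bins = len(next(iter(centroids.values())))
    hist = gray_histogram(image, bins)

    def distance(level):
        return sum((h - c) ** 2 for h, c in zip(hist, centroids[level]))

    return min(centroids, key=distance)


# Toy training data: a crisp image is bimodal, a blurred one is mid-gray.
sharp = [[0, 255, 0, 255], [255, 0, 255, 0]]
blurry = [[100, 140, 120, 130], [110, 150, 125, 135]]
centroids = train_quality_centroids([(sharp, "high"), (blurry, "low")])
```

Once trained, `judge_quality(new_image, centroids)` needs only the input image, which matches the abstract's point that no sharp reference image is required at judgment time.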
English abstract: In the field of character recognition, researchers have achieved great success in recent years. Current optical character recognition (OCR) systems can only handle text that has been separated from the background and converted to a binary image. When faced with complex backgrounds or low-quality images, they usually perform poorly. Two challenging problems remain bottlenecks for improving the character recognition rate. The first is character recognition against complex backgrounds, which requires an automatic text detection system to detect and extract text from the image so that it can be recognized by OCR. The second is degraded character recognition in low-quality images, where blurring, touching strokes, broken strokes, and low resolution make recognition very difficult. Good solutions to both problems are needed. To this end, this dissertation covers the following aspects: (1) Guided by Metasynthesis theory, this dissertation proposes a method for text detection in images based on the integration of multiple features, combining them according to the multiple characteristics of text, and designs an automatic system for text detection in images. (2) To address character recognition in low-quality images, this dissertation proposes a method for evaluating the resolution of character images based on a gray-level distribution feature, and builds a system for resolution evaluation of character images. Experimental results demonstrate that the proposed approach can effectively evaluate the resolution quality of character images. (3) This dissertation applies the resolution information of character images to multi-resolution degraded character recognition. Using integrated pattern recognition techniques, the proposed method builds an integrated network of multiple classifiers to compensate for the low recognition rate and poor stability of a single classifier. The dissertation proposes an adaptive character recognition method for multi-resolution degraded characters and designs a corresponding recognition system.
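The idea of feeding image quality information into a multi-classifier network can be sketched as a quality-weighted vote. This is only an illustration under stated assumptions: the abstract does not say how the classifiers are combined, so the weighted-sum rule, the weight table, the two toy classifiers, and the name `adaptive_ensemble` are all hypothetical.

```python
def adaptive_ensemble(classifiers, weights_by_quality, image, quality_level):
    """Combine per-classifier label scores, weighting each classifier by how
    much it is trusted at the given quality level.

    classifiers: list of callables, each mapping image -> {label: score}.
    weights_by_quality: dict mapping quality_level -> list of weights,
        one weight per classifier (hypothetical values, set by the caller).
    Returns the label with the highest combined score.
    """
    weights = weights_by_quality[quality_level]
    combined = {}
    for classifier, weight in zip(classifiers, weights):
        for label, score in classifier(image).items():
            combined[label] = combined.get(label, 0.0) + weight * score
    return max(combined, key=combined.get)


# Toy stand-ins for real recognizers: fixed scores, for illustration only.
def fine_detail_classifier(image):
    return {"A": 0.9, "B": 0.1}


def coarse_shape_classifier(image):
    return {"A": 0.3, "B": 0.7}


classifiers = [fine_detail_classifier, coarse_shape_classifier]
# Assumed trust table: fine detail is trusted on high-quality images,
# coarse shape on low-quality ones.
weights_by_quality = {"high": [0.8, 0.2], "low": [0.2, 0.8]}

label_high = adaptive_ensemble(classifiers, weights_by_quality, None, "high")
label_low = adaptive_ensemble(classifiers, weights_by_quality, None, "low")
```

With these toy scores the same input is labeled differently depending on its judged quality level, which is the adaptive behavior the abstract describes. A real system would learn the weights from validation data at each quality level.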
Keywords: Text Detection in Image; Degraded Character Recognition; Image Quality Evaluation; Metasynthesis
Language: Chinese
Document type: Dissertation
Identifier: http://ir.ia.ac.cn/handle/173211/5920
Collection: Graduates_Doctoral Dissertations
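Recommended citation
GB/T 7714
Liu Chunmei. Text Detection and Extraction in Digital Images and Degraded Character Recognition [D]. Institute of Automation, Chinese Academy of Sciences. Graduate School of the Chinese Academy of Sciences, 2006.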
Files in this item
File name/size: CASIA_20031801460301 (12772 KB)
Access: not currently open
License: CC BY-NC-SA