Research on Text Recognition Methods for Low-Quality Images
Xu Mingchao (许铭潮)
2020-12
Pages: 74
Degree type: Master's
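Chinese Abstract (English translation)

Text recognition (recognizing text in images and converting it to digital codes) has wide application demand. In recent years, with the rise and development of deep learning, text recognition algorithms have improved markedly in novelty, practicality, and efficiency. However, most of these algorithms target high-quality text images. In practice, uneven illumination, differences in camera focal length, and camera shake cause varying degrees of image distortion and blur. Such low-quality images reduce recognition accuracy, which then fails to meet practical needs. This thesis therefore studies text recognition methods for low-quality images, mainly using super-resolution algorithms to restore low-quality text images and thereby improve recognizer performance. The main work falls into two parts:

1. Several super-resolution algorithms are evaluated and improved for text recognition. First, 10 state-of-the-art super-resolution algorithms are compared on TextZoom, a low-quality scene text image dataset, and three recognition algorithms (ASTER, MORAN, CRNN) are used to test recognition accuracy on the generated images. On this basis, a spatial transformer network and a gradient profile loss are introduced to improve the output of each super-resolution algorithm. Second, the thesis proposes an optimization algorithm for low-quality text image generation. The algorithm uses the gradients back-propagated from the recognizer to guide the generator's learning and so improve recognition. By fixing the recognizer's parameters and introducing a recognition loss, it further improves recognition accuracy and effectively alleviates the difficulty of recognizing text in low-quality images.

2. A text recognition framework based on super-resolution and generative adversarial networks, SRR-GAN, is proposed. The framework improves on the traditional cascade scheme (in which image super-resolution and text recognition are performed as separate steps) by integrating the text recognition task and the super-resolution task within an adversarial learning framework. By jointly training the recognition model and the super-resolution model, the framework lets the neural network learn more general features from images of different resolutions, and so maintain high recognition accuracy across those resolutions.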
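
The abstract names a gradient profile loss but does not give its formula. The following is a minimal sketch under a stated assumption: the loss is taken to be the mean absolute difference between the gradient magnitudes of the super-resolved image and the high-resolution target, which is one common way such a loss is written. The function names `gradient_magnitude` and `gradient_profile_loss` are illustrative and are not taken from the thesis.

```python
from math import hypot
from typing import List

Image = List[List[float]]  # grayscale image as rows of pixel intensities


def gradient_magnitude(img: Image) -> Image:
    """Forward-difference gradient magnitude; border pixels use a zero difference."""
    h, w = len(img), len(img[0])
    out: Image = []
    for y in range(h):
        row = []
        for x in range(w):
            dx = img[y][x + 1] - img[y][x] if x + 1 < w else 0.0
            dy = img[y + 1][x] - img[y][x] if y + 1 < h else 0.0
            row.append(hypot(dx, dy))
        out.append(row)
    return out


def gradient_profile_loss(sr: Image, hr: Image) -> float:
    """Assumed form: mean |grad(sr) - grad(hr)| over all pixels of same-sized images."""
    if len(sr) != len(hr) or len(sr[0]) != len(hr[0]):
        raise ValueError("images must have the same size")
    g_sr, g_hr = gradient_magnitude(sr), gradient_magnitude(hr)
    total = sum(abs(a - b) for ra, rb in zip(g_sr, g_hr) for a, b in zip(ra, rb))
    return total / (len(hr) * len(hr[0]))


# A sharp vertical edge versus a blurred version of the same edge.
sharp = [[0.0, 0.0, 1.0, 1.0], [0.0, 0.0, 1.0, 1.0]]
blurred = [[0.0, 0.25, 0.75, 1.0], [0.0, 0.25, 0.75, 1.0]]
print(gradient_profile_loss(blurred, sharp))  # 0.25
```

The loss is zero when the two images have identical edge profiles and grows as edges in the restored image become softer than those in the target, which is why a term of this kind is used to push a super-resolution model toward sharper character strokes.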

English Abstract

Text carries rich and accurate semantic information, which is very important in many visual application scenarios. Text recognition has therefore long been an active research topic in computer vision and pattern recognition. In recent years, with the development of deep learning, numerous text recognition algorithms have shown novelty, practicality and efficiency. However, these algorithms mainly focus on high-quality text images. In many application scenarios text images are distorted: nonuniform illumination, camera defocus, motion blur from camera shake, and low resolution all produce low-quality text images, which degrade recognition accuracy. This thesis studies recognition methods for low-quality text images, using super-resolution algorithms to restore the images and thereby improve recognition performance. The main contributions lie in the following two aspects:

1. Several super-resolution algorithms are evaluated and improved for text recognition. First, we compare 10 classical super-resolution algorithms on TextZoom, a low-quality scene text image dataset, and use three recognition algorithms (ASTER, MORAN, CRNN) to test the restored images. We further use the Spatial Transformer Network (STN) and a gradient profile loss to improve the quality of the images restored by each super-resolution algorithm. We also propose to improve the recognition result using the gradient of the recognition loss: by fixing the parameters of the recognizer and introducing the recognition loss, the recognition accuracy can be improved further. The effectiveness of these techniques is verified in experiments.

2. A novel framework named SRR-GAN (Super-Resolution based Recognition with Generative Adversarial Networks), which is based on super-resolution and adversarial learning, is proposed. The framework improves on existing methods that adopt the cascade scheme by integrating text recognition with super-resolution via adversarial learning. Through joint training of the recognition and super-resolution models, it learns more general features from images of varying quality, and so yields higher recognition performance on both high-resolution and low-resolution images.
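
The cascade scheme referred to above (restore the image first, then pass it to a fixed recognizer and measure accuracy) can be sketched as follows. This is an illustration under stated assumptions, not the thesis's evaluation code: `word_accuracy`, `toy_recognizer` and `toy_restore` are hypothetical names, the case-insensitive string match is an assumed metric, and the toy models stand in for real ones such as a super-resolution network followed by ASTER, MORAN or CRNN.

```python
from typing import Callable, Optional, Sequence, Tuple

ToyImage = Tuple[str, bool]  # (text content, is_sharp): a stand-in for real pixels


def word_accuracy(
    recognizer: Callable[[ToyImage], str],
    images: Sequence[ToyImage],
    labels: Sequence[str],
    restore: Optional[Callable[[ToyImage], ToyImage]] = None,
) -> float:
    """Fraction of images whose predicted string equals the label (case-insensitive)."""
    if not images or len(images) != len(labels):
        raise ValueError("images and labels must be non-empty and of equal length")
    correct = 0
    for image, label in zip(images, labels):
        # Cascade: optionally restore the image before recognition.
        restored = restore(image) if restore is not None else image
        if recognizer(restored).lower() == label.lower():
            correct += 1
    return correct / len(images)


def toy_recognizer(image: ToyImage) -> str:
    """Hypothetical recognizer that only reads sharp images."""
    text, is_sharp = image
    return text if is_sharp else ""


def toy_restore(image: ToyImage) -> ToyImage:
    """Hypothetical super-resolution step that makes every image sharp."""
    text, _ = image
    return (text, True)


images = [("shop", True), ("exit", False), ("cafe", False), ("open", True)]
labels = ["shop", "exit", "cafe", "open"]
baseline = word_accuracy(toy_recognizer, images, labels)               # 0.5
cascaded = word_accuracy(toy_recognizer, images, labels, toy_restore)  # 1.0
```

In a cascade the restoration model and the recognizer are trained separately and only composed at inference time, which is the arrangement the joint training in SRR-GAN is described as replacing.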

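Keywords: low-quality image text recognition, super-resolution, spatial transformer network, gradient profile loss, adversarial learning
Language: Chinese
Document type: Thesis
Item identifier: http://ir.ia.ac.cn/handle/173211/43339
Collection: National Laboratory of Pattern Recognition (模式识别国家重点实验室)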
Recommended citation
GB/T 7714
Xu Mingchao. Research on Text Recognition Methods for Low-Quality Images [D]. Institute of Automation, Chinese Academy of Sciences, 2020.
Files in this item
File name/size | Document type | Version type | Access type | License
许铭潮毕业论文V9.pdf (4602KB) | Thesis | | Open access | CC BY-NC-SA