Research on Text Recognition Methods for Low-Quality Images


	Research on Text Recognition Methods for Low-Quality Images
	Xu Mingchao (许铭潮)
	2020-12
Pages	74
Degree type	Master's
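Chinese abstract (in translation)	Text recognition (recognizing text in images and converting it into digital codes) is in wide demand across applications. In recent years, with the rise and development of deep learning, text recognition algorithms have improved markedly in novelty, practicality, and efficiency. However, most of these algorithms target high-quality text images. In real applications, uneven illumination, differences in camera focal length, and camera shake all cause varying degrees of image distortion and blur. Such low-quality images reduce recognition accuracy, so the results fail to meet practical requirements. This thesis therefore studies text recognition methods for low-quality images, mainly using super-resolution algorithms to restore low-quality text images and thereby improve recognizer performance. The work consists of two parts: 1. Several super-resolution algorithms are evaluated and improved with text recognition as the target task. First, 10 state-of-the-art super-resolution algorithms are compared on TextZoom, a dataset of low-quality scene text images, and three recognition algorithms (ASTER, MORAN, CRNN) are used to measure recognition accuracy on the generated images. On this basis, a spatial transformer network and a gradient profile loss are introduced to improve the output of each super-resolution algorithm. Second, the thesis proposes an optimization algorithm for restoring low-quality text images. The algorithm uses the gradients backpropagated from the recognizer to guide the learning of the generator. By fixing the recognizer's parameters and introducing a recognition loss, it further improves recognition accuracy and effectively alleviates the difficulty of recognizing text in low-quality images. 2. A text recognition framework based on super-resolution and generative adversarial networks, SRR-GAN, is proposed. The framework improves on the traditional cascade scheme (image super-resolution and text recognition performed as separate steps) by integrating the recognition task and the super-resolution task within an adversarial learning framework. By jointly training the recognition model and the super-resolution model, the framework lets the network learn more general features from images of different resolutions, and thus maintain high recognition accuracy across resolutions.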
English abstract	Text carries rich and accurate semantic information, which is very important in many visual application scenarios. Therefore, text recognition has always been an active research topic in the field of computer vision and pattern recognition. In recent years, with the development of deep learning, numerous text recognition algorithms have shown novelty, practicality and efficiency. However, these algorithms mainly focus on high-quality text images. Text images can be distorted in many application scenarios. Nonuniform illumination, camera defocus, motion blur from camera shake and low resolution can all lead to low-quality text images, which degrade recognition accuracy. This thesis studies recognition methods for low-quality text images, using super-resolution algorithms for image restoration so as to improve recognition performance. The main contributions lie in the following two aspects: 1. Several super-resolution algorithms for text recognition are evaluated and improved. Firstly, we compare 10 classical super-resolution algorithms on TextZoom, a low-quality scene text image dataset, and three recognition algorithms (ASTER, MORAN, CRNN) are used to test the restored images. We further use the Spatial Transformer Network (STN) and a gradient profile loss to improve the restored image quality of the super-resolution algorithms. Also, we propose to improve the recognition result based on the gradient of the recognition loss. By fixing the parameters of the recognizer and introducing the recognition loss, the recognition accuracy can be improved further. The effectiveness of these techniques is verified in experiments. 2. A novel framework named SRR-GAN (Super-Resolution based Recognition with Generative Adversarial Networks), which is based on super-resolution and adversarial learning, is proposed. The proposed framework improves on existing methods that adopt the cascade scheme by integrating text recognition with super-resolution via adversarial learning. Through the joint training of the recognition and super-resolution models, we can learn more general features of images of varying quality, so as to yield higher recognition performance for both high-resolution and low-resolution images.
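Keywords	low-quality image text recognition, super-resolution, spatial transformer network, gradient profile loss, adversarial learning
Language	Chinese
Document type	Thesis
Item identifier	http://ir.ia.ac.cn/handle/173211/43339
Collection	Graduates_Master's theses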
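Recommended citation (GB/T 7714)	Xu Mingchao (许铭潮). Research on Text Recognition Methods for Low-Quality Images [D]. Institute of Automation, Chinese Academy of Sciences, 2020.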

Files in this item
File name/size	Document type	Version type	Access type	License
许铭潮毕业论文V9.pdf (4602KB)	Thesis		Restricted access	CC BY-NC-SA
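
The abstract describes guiding the restoration model with gradients backpropagated from a recognizer whose parameters are fixed. The following is only a toy sketch of that general idea, not the thesis's implementation. Everything in it is an assumption made for illustration: the 4-pixel "images", the logistic classifier standing in for a recognizer such as CRNN, the affine `restore` function standing in for a super-resolution network, and all weights and hyperparameters.

```python
import math

# Frozen toy "recognizer": a logistic classifier with fixed weights.
# Hypothetical stand-in for a real recognizer; it is never updated below.
REC_W = [1.5, -2.0, 0.5, 1.0]
REC_B = -0.2

# Hypothetical data: a low-contrast input, its clean target, and its class label.
LR_IMG = [0.45, 0.05, 0.40, 0.35]
HR_IMG = [0.90, 0.10, 0.80, 0.70]
LABEL = 1


def recognizer_prob(img):
    """Probability that the frozen recognizer assigns to class 1."""
    z = sum(w * p for w, p in zip(REC_W, img)) + REC_B
    return 1.0 / (1.0 + math.exp(-z))


def restore(lr_img, gain, bias):
    """Toy restorer: one learnable gain and bias applied to every pixel."""
    return [gain * p + bias for p in lr_img]


def total_loss(lr_img, hr_img, label, gain, bias, lam):
    """Pixel MSE plus lam times the recognizer's cross-entropy loss."""
    sr = restore(lr_img, gain, bias)
    pixel = sum((s - h) ** 2 for s, h in zip(sr, hr_img)) / len(sr)
    p = recognizer_prob(sr)
    rec = -(label * math.log(p) + (1 - label) * math.log(1 - p))
    return pixel + lam * rec


def train_restorer(lr_img, hr_img, label, lam=0.5, step=0.3, iters=500):
    """Gradient descent on the restorer only; recognizer weights stay fixed."""
    gain, bias = 1.0, 0.0
    n = len(lr_img)
    for _ in range(iters):
        sr = restore(lr_img, gain, bias)
        p = recognizer_prob(sr)
        # Gradient of the total loss w.r.t. each restored pixel:
        # the MSE term plus the recognition term passed back through
        # the frozen recognizer (d cross-entropy / d logit = p - label).
        grad_sr = [
            2.0 * (s - h) / n + lam * (p - label) * w
            for s, h, w in zip(sr, hr_img, REC_W)
        ]
        # Chain rule through sr_i = gain * lr_i + bias.
        grad_gain = sum(g * x for g, x in zip(grad_sr, lr_img))
        grad_bias = sum(grad_sr)
        gain -= step * grad_gain
        bias -= step * grad_bias
    return gain, bias


if __name__ == "__main__":
    gain, bias = train_restorer(LR_IMG, HR_IMG, LABEL)
    print("learned gain, bias:", gain, bias)
    print("recognizer probability before:", recognizer_prob(LR_IMG))
    print("recognizer probability after:", recognizer_prob(restore(LR_IMG, gain, bias)))
```

Only `gain` and `bias` are updated, so the recognizer acts purely as a source of gradient signal. A joint-training scheme such as the one the abstract attributes to SRR-GAN would also update the recognizer and add an adversarial term, which this sketch does not attempt.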