Research on Text Detection Methods in Natural Scene Images

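Title	自然场景图像中的文本检测方法研究 (Research on Text Detection Methods in Natural Scene Images)
Alternative Title	Text Detection in Natural Scene Images
Author	Pan Yifeng (潘屹峰)
Date	2010-06-01
Degree Type	Doctor of Engineering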
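Chinese Abstract (translated)	With the widespread use of digital image capture devices in daily life, such as digital cameras, camera-equipped smartphones, and personal digital assistants (PDAs), content-based image analysis has received growing attention. Of all the content an image can contain, text has attracted particular interest from academia and industry because it is readily understood by both humans and computers, and it is widely used in related applications. In a complete image text information analysis system, the front-end text detection and localization module is critical to overall system performance. Against this background, this thesis draws on techniques from pattern recognition, machine learning, and related fields to study text detection and localization in natural scene images in depth, and proposes three concrete methods. Experimental results indicate that, compared with existing methods, the proposed methods have certain advantages in precision, recall, and speed. (1) Considering the diversity of texture features, this thesis proposes a robust text detection and localization method. In the text detection stage, local image regions are classified using a multi-feature set and a cascade AdaBoost algorithm. Experiments on a public dataset show that the method is comparable to existing methods in both accuracy and speed. (2) To make combined use of local region information and connected-component features, this thesis proposes a hierarchical, adaptive method for scene text detection and localization. Experimental results show that the method outperforms existing methods in precision and recall, and experiments on a multilingual database demonstrate that it generalizes well. (3) To speed up text detection toward practical use, this thesis proposes a coarse-to-fine text detection method that combines region filtering with sub-block verification. Experimental results show that, compared with existing methods, it substantially increases localization speed without reducing localization accuracy.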
English Abstract	With the increasing use of digital image capture devices such as digital cameras, mobile phones, and PDAs, content-based image analysis techniques have received intensive attention in recent years. Among all the content in images, text information has attracted great interest, since it can be easily understood by both humans and computers and has wide applications. A complete Text Information Extraction (TIE) system contains four parts: text detection, text localization, text extraction, and OCR. Of these, the first two, text detection and localization, are very important for system performance. Against this background, this thesis presents a thorough study of scene text detection and localization methods using pattern recognition, image processing, and machine learning techniques. It presents three new text detection and localization methods, and the experimental results show their advantages over existing state-of-the-art methods. (1) Considering the textural variety of text regions and the system speed requirement, we present a robust system to accurately detect and localize text in natural scene images. Experiments on a public dataset show that our system is comparable to the best existing methods in both accuracy and speed. (2) To take advantage of both region-based and component-based information, we present a hybrid approach for robust scene text detection and localization. Experimental results show that our approach achieves better precision and recall than state-of-the-art methods. We also evaluated our approach on a multilingual image database, with promising results. (3) To speed up text detection for practical use, we propose a fast scene text localization method that combines learning-based region filtering and verification in a coarse-to-fine strategy. Experimental results show that the proposed method provides competitive localization performance at high speed.
Keywords	Text Detection; Text Localization; Cascade AdaBoost Classifier; Conditional Random Field (CRF); "Coarse-to-Fine" Strategy
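Language	Chinese
Document Type	Thesis
Identifier	http://ir.ia.ac.cn/handle/173211/6281
Collection	Graduates_Doctoral Dissertations
Recommended Citation (GB/T 7714)	Pan Yifeng. Research on Text Detection Methods in Natural Scene Images[D]. Institute of Automation, Chinese Academy of Sciences. Graduate University of Chinese Academy of Sciences, 2010.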

Files in This Item
File Name/Size	Document Type	Version Type	Access Type	License
CASIA_20061801462806 (4472 KB)			Restricted access	CC BY-NC-SA