Scene Text Detection and Extraction: Methods and Applications


	Scene Text Detection and Extraction: Methods and Applications
Alternative title	Research and Application of Scene Text Detection and Extraction
	胡仅龙
	2014-05-29
Degree type	Master of Engineering
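Abstract (translated from Chinese)	With the wide use of image acquisition devices such as digital cameras, digital camcorders, webcams, and ultra-high-speed scanners, multimedia information dominated by digital images and video is rapidly becoming the mainstream of information exchange and services. How to make computers automatically understand and use the content of multimedia documents such as images and video has become a research hotspot in image processing and multimedia. Among the content contained in images, text has attracted great attention from academia and industry because it is more easily understood by both humans and computers. Related technologies have been widely applied to license plate recognition, traffic sign translation, content-based web image search, and other areas. A complete image text analysis system comprises three parts: text detection, extraction, and recognition. The front-end text detection and extraction modules are critical to the performance of the whole system. Because of the complexity of image backgrounds and the variation in text position, size, font, color, polarity, and arrangement, text detection and extraction is a challenging problem. Against this background, this thesis combines techniques from image processing and pattern recognition to study text detection and extraction in natural scene images in depth, and proposes a concrete method for detecting and extracting text in natural scene images. Experimental results show that, compared with existing methods, the proposed method has certain advantages in precision, recall, and speed. The main work of the thesis is summarized as follows. First, the thesis proposes an interactive method for text detection in natural scenes. The method adapts to text images with complex backgrounds and uses a "coarse-to-fine" detection technique. Coarse detection uses an algorithm kept as simple as possible to ensure speed and high recall, while fine detection focuses on improving the precision of the detection results. The overall algorithm therefore achieves high recall and precision while remaining fast. Experiments on a dataset collected in the laboratory show that the method can accurately and quickly detect and localize text regions in scene images. Second, for text extraction from complex backgrounds, the thesis proposes a text extraction algorithm that uses local information. The method treats text extraction as a noise-filtering process. Edge enhancement is combined with local gray-level information for binarization, which removes noise in the text region and separates background from noise as far as possible; connected-component analysis based on character properties then removes complex background noise. Tests on experimental images and a real dataset verify the effectiveness of the proposed text extraction algorithm. Third, building on the text detection and extraction algorithms, a prototype image recognition system is designed and implemented. The system is a smartphone application whose main function is to let users photograph a region of interest and supply some guidance information interactively; the data is then sent over the network for recognition, and the resulting recognition information is returned to the user. Experimental results show that the prototype system has good recognition performance and running speed.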
English abstract	The number of digital images and videos is growing explosively as technology develops. These multimedia documents contain a great deal of information that is valuable for many applications, such as information retrieval, image classification, and data mining. However, it is still very difficult for computers to understand the contents of images and videos. Among the contents of images, text has attracted great interest because it can be easily understood by both humans and computers, and it leads to wide applications such as license plate reading, sign detection and translation, mobile text recognition, and content-based Web image search. A complete Text Information Extraction system consists of three parts: text detection, text extraction, and OCR. The first two parts, text detection and extraction, are critically important for system performance. Text detection and extraction from images is a challenging problem because of the complexity of the background and the variability of text position, size, font, color, polarity, and line orientation. Against this background, this dissertation presents an in-depth study of scene text detection and extraction that combines techniques from image processing and pattern recognition. Specifically, we propose a new text detection and extraction method, and the experimental results indicate advantages over existing methods in precision, recall, and speed. The contributions of this dissertation are summarized below. First, this dissertation proposes an interactive method to detect texts of interest in natural scene images. We first draw a line to label a region that contains the texts we want to detect. Then a coarse-to-fine strategy is adopted to detect the texts in this labeled region. For coarse detection, we use an algorithm that is as concise as possible to ensure high speed and a high recall rate. For fine detection, we focus on improving the accuracy of the detection results. The overall algorithm thus ensures both a high recall rate and high accuracy while remaining fast. Experimental results demonstrate very promising performance in detecting texts in complex natural scenes. Second, this dissertation proposes a text extraction algorithm incorporating local information. We regard text extraction as segmenting the image and removing noise, and propose a robust text extraction method that incorporates local information. First, we get the gray image from the o...
Keywords	Text Detection; Text Extraction; "Coarse-to-Fine" Strategy; Local Information
Language	Chinese
Document type	Thesis
Item identifier	http://ir.ia.ac.cn/handle/173211/7720
Collection	Graduates_Master's Theses
Recommended citation (GB/T 7714)	胡仅龙. 场景文本检测与抽取方法及应用[D]. 中国科学院自动化研究所. 中国科学院大学, 2014.

Files in this item
File name/size	Document type	Version type	Access type	License
CASIA_20112801462803 (3956KB)			Not yet open	CC BY-NC-SA
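
The abstract describes text extraction as noise filtering: binarize using local gray-level information, then remove remaining noise with connected-component analysis based on character properties. The thesis full text is not available here, so the following is only a minimal sketch of that general idea, not the author's algorithm. It swaps in a plain local-mean threshold for the binarization step, omits the edge-enhancement step, and assumes dark text on a lighter background. The function names, the `radius`, `offset`, `min_area`, and `max_aspect` parameters, and the area/aspect-ratio rule are all hypothetical choices made for illustration.

```python
from collections import deque


def local_mean_binarize(gray, radius=1, offset=0):
    """Return a 0/1 mask; a pixel is foreground (1) when it is darker than
    the mean of its (2*radius+1)-wide square neighbourhood minus `offset`.
    Assumption for this sketch: text is darker than its surroundings."""
    h, w = len(gray), len(gray[0])
    mask = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            ys = range(max(0, y - radius), min(h, y + radius + 1))
            xs = range(max(0, x - radius), min(w, x + radius + 1))
            window = [gray[j][i] for j in ys for i in xs]
            if gray[y][x] < sum(window) / len(window) - offset:
                mask[y][x] = 1
    return mask


def connected_components(mask):
    """Group foreground pixels into 4-connected components (lists of (y, x))."""
    h, w = len(mask), len(mask[0])
    seen = [[False] * w for _ in range(h)]
    components = []
    for y in range(h):
        for x in range(w):
            if not mask[y][x] or seen[y][x]:
                continue
            seen[y][x] = True
            queue = deque([(y, x)])
            pixels = []
            while queue:
                cy, cx = queue.popleft()
                pixels.append((cy, cx))
                for ny, nx in ((cy - 1, cx), (cy + 1, cx), (cy, cx - 1), (cy, cx + 1)):
                    if 0 <= ny < h and 0 <= nx < w and mask[ny][nx] and not seen[ny][nx]:
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            components.append(pixels)
    return components


def filter_components(mask, min_area=3, max_aspect=5.0):
    """Keep only components whose pixel count and bounding-box aspect ratio
    look character-like (a hypothetical rule, not taken from the thesis)."""
    h, w = len(mask), len(mask[0])
    cleaned = [[0] * w for _ in range(h)]
    for pixels in connected_components(mask):
        ys = [p[0] for p in pixels]
        xs = [p[1] for p in pixels]
        box_h = max(ys) - min(ys) + 1
        box_w = max(xs) - min(xs) + 1
        aspect = max(box_h, box_w) / min(box_h, box_w)
        if len(pixels) >= min_area and aspect <= max_aspect:
            for y, x in pixels:
                cleaned[y][x] = 1
    return cleaned
```

A caller would pass a grayscale image as a list of rows, for example `filter_components(local_mean_binarize(gray, radius=1, offset=10))`. A local-mean threshold only marks strokes thinner than the window, so `radius` has to be chosen relative to the expected stroke width; a production system would more likely rely on an image-processing library than on pure-Python loops.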