CASIA OpenIR > Graduates > Master's Theses
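Alternative Title: Text Localization on Complex Background
Thesis Advisor: 刘昌平 (Liu Changping)
Degree Grantor: Graduate School of the Chinese Academy of Sciences (中国科学院研究生院)
Place of Conferral: Institute of Automation, Chinese Academy of Sciences (中国科学院自动化研究所)
Degree Discipline: Pattern Recognition and Intelligent Systems
Keyword: Text Localization; Complex Background; Gradient Density (边缘密度, edge density); Connected Component Analysis; Support Vector Machine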
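Abstract: Localizing text in images and video with complex backgrounds is a difficult and challenging research problem. The background of the image or video containing the text is highly complex; some footage is shot indoors and some outdoors, and lighting conditions vary widely. The color, brightness, font, size, spacing, contrast, orientation, and background texture of the text also differ greatly. For scene text in particular (text that is part of the background environment), the camera's projective transformation means that characters in the image may have undergone translation, rotation, scaling, shearing, and other deformations. Nevertheless, text localization on complex backgrounds has broad application prospects: it can be used for image retrieval, video retrieval, Web search, and image annotation, as well as for image understanding. Most existing text localization algorithms target artificial text (text added to the image by hand); few address scene text, and they also make simplifying assumptions about the text. Most of the existing literature on text localization in complex backgrounds presents, as its final result, rectangles drawn around the text regions in test images and video, without giving a general evaluation index that allows comparison. The main reason is that the field still lacks a common evaluation standard and a standard evaluation sample set.
This paper proposes two methods for text localization on complex backgrounds together with a scheme for integrating them, and then describes a method for evaluating such algorithms. The first is a knowledge-based method: it computes the edge density of the image, binarizes the result and performs connected component analysis, and then uses rules (knowledge) to verify and merge the connected components, yielding the positions of the text regions. The second is a learning-based method that uses an SVM classifier to classify window sub-images as text or non-text. In the course of this research we found that no single algorithm can handle every kind of input, so we tried to combine the strengths of the two algorithms into a new one: the edge-density-based algorithm produces connected components, and the SVM localization method then scans them, so that the SVM serves both for fine localization and for verification. Finally, we present an evaluation scheme for text localization algorithms and give the evaluation results of the methods above.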
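The knowledge-based method summarized in the abstract (edge density, binarization, connected component analysis, rule-based verification) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the thesis implementation: the function names, the window size `win`, the threshold `thresh`, and the geometric rules (`min_w`, `min_h`, `min_aspect`) are all hypothetical, the gradient is a plain horizontal difference, and the merging of connected components is omitted. Images are plain lists of grey-level rows so the sketch has no dependencies.

```python
from collections import deque

def edge_density(img, win=3):
    """Mean absolute horizontal gradient in a (2*win+1) square window,
    clipped at the image border. All parameter values are assumptions."""
    h, w = len(img), len(img[0])
    # Horizontal gradient magnitude; the last column has no right neighbour.
    grad = [[abs(row[x + 1] - row[x]) if x + 1 < w else 0 for x in range(w)]
            for row in img]
    dens = [[0.0] * w for _ in range(h)]
    for y in range(h):
        ys = range(max(0, y - win), min(h, y + win + 1))
        for x in range(w):
            xs = range(max(0, x - win), min(w, x + win + 1))
            total = sum(grad[j][i] for j in ys for i in xs)
            dens[y][x] = total / (len(ys) * len(xs))
    return dens

def binarize(dens, thresh):
    """Keep pixels whose edge density reaches the (assumed) threshold."""
    return [[v >= thresh for v in row] for row in dens]

def connected_components(mask):
    """Bounding boxes (x0, y0, x1, y1) of 4-connected foreground regions."""
    h, w = len(mask), len(mask[0])
    seen = [[False] * w for _ in range(h)]
    boxes = []
    for sy in range(h):
        for sx in range(w):
            if not mask[sy][sx] or seen[sy][sx]:
                continue
            seen[sy][sx] = True
            queue = deque([(sx, sy)])
            x0 = x1 = sx
            y0 = y1 = sy
            while queue:  # breadth-first flood fill
                x, y = queue.popleft()
                x0, x1 = min(x0, x), max(x1, x)
                y0, y1 = min(y0, y), max(y1, y)
                for nx, ny in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
                    if (0 <= nx < w and 0 <= ny < h
                            and mask[ny][nx] and not seen[ny][nx]):
                        seen[ny][nx] = True
                        queue.append((nx, ny))
            boxes.append((x0, y0, x1, y1))
    return boxes

def looks_like_text(box, min_w=8, min_h=4, min_aspect=1.5):
    """Hypothetical rules: large enough and wider than tall."""
    x0, y0, x1, y1 = box
    bw, bh = x1 - x0 + 1, y1 - y0 + 1
    return bw >= min_w and bh >= min_h and bw / bh >= min_aspect

def localize_text(img, thresh=40.0):
    """Edge density -> binarization -> components -> rule filter."""
    mask = binarize(edge_density(img), thresh)
    return [b for b in connected_components(mask) if looks_like_text(b)]
```

On a flat grey image containing one block of dense vertical strokes, `localize_text` should return a single box around that block; a uniformly grey image yields no boxes.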
Other Abstract: Localizing and extracting artificial and scene text from images and video with complex backgrounds is an important and challenging research problem in computer vision. Variation in character font, size, style, orientation, alignment, texture, and color, together with the complex background, makes text localization very difficult. The scene content is unconstrained: it may be indoor or outdoor, under any lighting or contrast conditions. In particular, scene text in images and video frames may also be distorted by perspective projection. Nevertheless, text localization is significant for applications such as image retrieval, video retrieval, image annotation, Web search, and image understanding. The current state of the art in text localization for images and video with complex backgrounds either makes simplistic assumptions about the nature of the text to be found or restricts itself to a subclass of the wide variety of text that can occur in images and video. Most published methods work only on artificial text that is superimposed on the image or video frame. Many published methods also report their results only as bounding boxes drawn around text regions in frame images and lack a thorough evaluation on general-purpose video data. This paper presents two text localization algorithms and a scheme for fusing them. One is a knowledge-based method, which uses a gradient density measure and connected component analysis. The other is a learning-based method, which uses an SVM to classify each sub-window of an image or video frame as text or non-text. In developing these methods, it was observed that no single algorithm could detect all forms of text. The fusion strategy is therefore to use the SVM to scan the regions generated by the gradient density measure and connected component analysis.
This paper also presents a thorough evaluation of methods for localizing text in images and video frames with complex backgrounds.
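The fusion strategy, in which a window classifier scans candidate regions to both verify and refine them, might look roughly like the sketch below. It is an illustration under assumptions only: `scan_region`, the window size `win`, and the stride `step` are hypothetical, and `contrast_classifier` is a trivial stand-in for a trained SVM, not the classifier used in the thesis.

```python
def scan_region(img, box, classify, win=8, step=4):
    """Slide a win x win window over a candidate box (x0, y0, x1, y1).

    Returns the bounds of all windows the classifier accepts, clipped to
    the candidate box, or None when no window is accepted (the candidate
    is rejected). `classify` is any callable mapping a patch to True/False.
    """
    x0, y0, x1, y1 = box
    hits = []
    # The max(...) guards keep at least one window when the box is
    # smaller than the window itself.
    for y in range(y0, max(y0, y1 - win + 1) + 1, step):
        for x in range(x0, max(x0, x1 - win + 1) + 1, step):
            patch = [row[x:x + win] for row in img[y:y + win]]
            if classify(patch):
                hits.append((x, y, min(x + win - 1, x1), min(y + win - 1, y1)))
    if not hits:
        return None  # verification failed
    return (min(h[0] for h in hits), min(h[1] for h in hits),
            max(h[2] for h in hits), max(h[3] for h in hits))

def contrast_classifier(patch):
    """Stand-in for a trained SVM (assumption): accept high-contrast patches."""
    flat = [v for row in patch for v in row]
    return max(flat) - min(flat) > 100

# Usage: a loose candidate box around a block of dense strokes.
image = [[128] * 80 for _ in range(40)]
for yy in range(16, 24):
    for xx in range(24, 56):
        image[yy][xx] = 255 if xx % 2 else 0
refined = scan_region(image, (16, 12, 63, 27), contrast_classifier)
```

The refinement is only as fine as the stride: a smaller `step` gives tighter bounds at the cost of more classifier calls.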
Other Identifier: 679
Document Type: Thesis
Recommended Citation
GB/T 7714
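朱军民 (Zhu Junmin). 复杂背景文本定位 (Text Localization on Complex Background) [D]. Institute of Automation, Chinese Academy of Sciences. Graduate School of the Chinese Academy of Sciences, 2003.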