Research on Text Segmentation and Text Line Recognition Methods for Natural Scenes (自然场景文字切分和文本行识别方法研究)
贺欣
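Subtype: Master of Engineering
Thesis Advisor: 刘成林
2016-06
Degree Grantor: University of Chinese Academy of Sciences (中国科学院大学)
Place of Conferral: Beijing
Keyword: scene text recognition; over-segmentation; recurrent neural network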
Abstract
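Text recognition is one of the core branches of pattern recognition. In recent years, the subproblem of scene text recognition has attracted wide attention from researchers. Compared with traditional printed and handwritten document recognition, text recognition in scene images has its own distinctive problems: the background is often complex, and image quality is strongly affected by illumination, resolution, and similar factors. These characteristics make scene text recognition highly challenging. Taking English word recognition and digit string recognition in scene images as its tasks, this thesis studies segmentation and text line recognition methods for natural scene text. The main work falls into two parts: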
 
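1. A scene text over-segmentation method based on the multi-layer perceptron is proposed. The method exploits the strong discriminative ability of a neural network classifier to locate the gaps between characters in a text line in a sliding-window manner, and achieves higher recall and precision of segmentation points than traditional heuristic-based over-segmentation. A scene text recognition system built on this method performs better than existing methods on several standard datasets.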
 
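2. A text line recognition method based on the recurrent neural network (RNN) is proposed. Starting from a standard RNN, the thesis replaces the hidden-layer nodes with long short-term memory (LSTM) blocks and extends the standard RNN to a bidirectional network to better capture contextual information in the text line. Combined with serialized histogram of oriented gradients features, the method achieves good results on digit string recognition in scene images.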
Other Abstract
Text recognition is one of the core branches of pattern recognition. In recent years, the subproblem of scene text recognition has drawn great attention from researchers and has been studied intensively. Text recognition in scene images faces unique challenges compared with printed and handwritten document recognition: the background in scene images is more complicated, and the image quality is often affected by illumination and resolution. Focusing on English word recognition and numeric string recognition in scene images, this thesis studies character over-segmentation and text line recognition methods. Our contributions are divided into two parts:
 
1. We propose a multi-layer perceptron (MLP) based over-segmentation method. We use the strong discriminative ability of neural networks to detect segmentation points between characters in a sliding-window manner. This method substantially improves the precision and recall of segmentation points, and yields higher scene text recognition accuracy than existing methods on some benchmark datasets.
 
2. We propose a recurrent neural network (RNN) based method to recognize text lines in scene images. Specifically, we replace the hidden neurons of a standard RNN with long short-term memory (LSTM) blocks and extend the network to a bidirectional model. Further, we combine the RNN with serialized histogram of oriented gradients (HOG) features and achieve promising recognition results on numeric strings.
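
The sliding-window idea in the first contribution can be illustrated with a minimal sketch. It is not the thesis implementation: the function names, the window width, the threshold, and the toy image are all assumptions made for illustration, and `gap_score` is a hypothetical hand-written stand-in for the trained MLP classifier, which this record does not describe in detail.

```python
import numpy as np

def gap_score(window):
    """Hypothetical stand-in for a trained MLP: returns a score in [0, 1]
    that is high when the window's centre column contains little ink."""
    centre = window[:, window.shape[1] // 2]
    return 1.0 - float(centre.mean())

def candidate_cuts(line_img, win_width=5, stride=1, threshold=0.9, scorer=gap_score):
    """Slide a fixed-width window over a text-line image (H x W, ink = 1.0)
    and return x-positions whose score reaches the threshold, keeping the
    middle position of each run of consecutive hits."""
    _, w = line_img.shape
    half = win_width // 2
    hits = []
    for x in range(half, w - half, stride):
        window = line_img[:, x - half : x + half + 1]
        if scorer(window) >= threshold:
            hits.append(x)
    # Merge runs of adjacent hits into a single candidate cut each.
    cuts, run = [], []
    for x in hits:
        if run and x - run[-1] > stride:
            cuts.append(run[len(run) // 2])
            run = []
        run.append(x)
    if run:
        cuts.append(run[len(run) // 2])
    return cuts

# Toy text line (assumed data): two ink blobs separated by a blank gap.
img = np.zeros((8, 20))
img[:, 2:8] = 1.0
img[:, 12:18] = 1.0
cuts = candidate_cuts(img)
```

In a real system the scorer would be a classifier trained on labelled gap and non-gap windows, and the threshold would be set low enough to over-segment, leaving the recognizer to discard spurious cuts.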
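Language: Chinese
Document Type: Thesis
Identifier: http://ir.ia.ac.cn/handle/173211/11765
Collection: Graduates_Master's Theses (毕业生_硕士学位论文)
Affiliation: Institute of Automation, Chinese Academy of Sciences (中国科学院自动化研究所)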
Recommended Citation
GB/T 7714
贺欣. 自然场景文字切分和文本行识别方法研究[D]. 北京. 中国科学院大学,2016.
Files in This Item:
File Name: 自然场景文字切分和文本行识别方法研究_贺
Size: 2070KB
DocType: Thesis
Access: Not yet open
License: CC BY-NC-SA