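|Place of Conferral||Institute of Automation, Chinese Academy of Sciences|
|Keyword||image quality assessment; text detection; text recognition; convolutional neural network|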
With advances in computer software and hardware and the widespread use of mobile phones, multimedia information based on digital images and video is rapidly becoming a mainstream medium of information exchange. Text in an image conveys high-level semantic information, so demand for automatic detection and recognition of such text is growing. As mobile terminals such as phones become ubiquitous, camera-captured images play an increasingly important role. Invoices are common documents in daily life, and automatically recognizing camera-captured invoice images can save considerable manual effort. However, invoices come in many types with complicated layouts, the key information differs from one type to another, and invoices bend and deform easily. Photographing also introduces blur, shadows, and reflections. These problems make text recognition on camera-captured invoice images difficult.
This thesis presents a series of studies on the recognition of camera-captured invoice images. The main contributions are as follows:
(1) We design and implement an image quality assessment algorithm for camera-captured document images based on local gradient distribution. The method selects the best-quality image from a sequence of continuously shot images and judges whether its quality is acceptable; poor-quality images are not passed on to recognition. Experiments show that the method effectively solves the problem of selecting among images captured by mobile phones.
(2) To handle the variety of invoice types, we design and implement a method for registering and classifying new invoice images. The method uses a CNN to extract features, GLVQ to learn templates, and KNN to classify. It classifies common invoice images and can also quickly support a new invoice type from only a few samples.
(3) Building on the Focal Loss used in general object detection, we propose an invoice text detection method based on Focal Loss. Experimental results show that the method effectively detects text in arbitrary orientations.
(4) We propose an adaptive end-to-end text line recognition model. The method adds deformable convolution to enlarge the receptive field, enabling the network to adaptively learn how to segment a text line.
(5) Productization: To address the large number of new and legacy invoices in the financial sector that must be entered manually, we built a smart cloud financial shared-service platform that performs structured recognition of camera-captured invoice images using text detection and recognition technology. The algorithms designed in this thesis form the core of the system, and its online deployment verifies their validity and practicality.
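As an illustration of the Focal Loss that item (3) builds on, the following is a minimal sketch of the standard binary Focal Loss from general object detection, FL(p_t) = -alpha_t (1 - p_t)^gamma log(p_t). It is not the detection method of the thesis itself, whose details are not given in this abstract. The function name `binary_focal_loss` and the defaults `alpha=0.25` and `gamma=2.0` are assumptions chosen for illustration.

```python
import math


def binary_focal_loss(p, y, alpha=0.25, gamma=2.0, eps=1e-7):
    """Standard binary focal loss for a single prediction (illustrative sketch).

    p: predicted probability of the positive class (e.g. "text"), in [0, 1]
    y: ground-truth label, 1 for positive and 0 for negative
    alpha, gamma: assumed default hyperparameters, not taken from the thesis
    """
    # Clamp the probability so that log() never receives 0.
    p = min(max(p, eps), 1.0 - eps)
    # p_t is the probability the model assigns to the true class.
    p_t = p if y == 1 else 1.0 - p
    # alpha_t balances positive and negative examples.
    alpha_t = alpha if y == 1 else 1.0 - alpha
    # The factor (1 - p_t) ** gamma shrinks the loss of well-classified examples.
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)


if __name__ == "__main__":
    easy = binary_focal_loss(0.9, 1)  # confident, correct prediction
    hard = binary_focal_loss(0.1, 1)  # confident, wrong prediction
    print(f"easy example loss: {easy:.6f}")
    print(f"hard example loss: {hard:.6f}")
```

With `gamma=0` the modulating factor disappears and the expression reduces to alpha-weighted cross-entropy. Larger values of `gamma` concentrate training on hard examples, which is useful when most pixels or anchors in an image are easy background.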
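|Wang Miao. Recognition Method and System for Camera-Captured Invoice Images [D]. Institute of Automation, Chinese Academy of Sciences; University of Chinese Academy of Sciences, 2019.|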
|Files in This Item:|
|拍照票据图像识别方法与系统.pdf (4089KB)||Thesis||Open Access||CC BY-NC-SA||Application Full Text|