CASIA OpenIR
DetectGAN: GAN-based text detector for camera-captured document images
Zhao, Jinyuan [1,2]; Wang, Yanna [1]; Xiao, Baihua [1]; Shi, Cunzhao [1]; Jia, Fuxi [1]; Wang, Chunheng [1]
Source Publication: INTERNATIONAL JOURNAL ON DOCUMENT ANALYSIS AND RECOGNITION
ISSN: 1433-2833
2020-08-10
Pages: 11
Corresponding Author: Zhao, Jinyuan (zhaojinyuan2016@ia.ac.cn)
Abstract: With the development of electronic devices, camera-based text processing has attracted growing attention. Unlike a recognition system for scene images, a recognition system for document images must organize its recognition results and store them in a structured document for subsequent data processing. In document images, however, whether characters should be merged into a text line depends largely on their semantic information rather than only on the distance between them, which causes learning confusion during training. In addition, multi-directional printed characters in document images require extra directional information to guide the subsequent recognition tasks. To avoid learning confusion and obtain recognition-friendly detection results, we propose DetectGAN, a character-level text detection framework based on conditional generative adversarial networks (cGAN). The framework removes position regression and non-maximum suppression (NMS) and directly transforms text detection into an image-to-image generation problem. Experimental results show that our method performs well on text detection in camera-captured document images and outperforms classical and state-of-the-art algorithms.
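The abstract does not say how detections are read out once the detector produces an image instead of regressed boxes. As a minimal sketch, assuming the generated output can be thresholded into a binary character mask, one box per character can be recovered by connected-component grouping, with no box regression or NMS step. The function name `boxes_from_mask` and the toy mask are hypothetical illustrations and are not taken from the paper.

```python
from collections import deque


def boxes_from_mask(mask):
    """Return one (top, left, bottom, right) box per 4-connected region of 1s.

    `mask` is a list of rows of 0/1 values. It is a hypothetical stand-in for
    a thresholded generator output, not the paper's actual output format.
    """
    rows = len(mask)
    cols = len(mask[0]) if mask else 0
    seen = [[False] * cols for _ in range(rows)]
    boxes = []
    for r in range(rows):
        for c in range(cols):
            if not mask[r][c] or seen[r][c]:
                continue
            # Breadth-first flood fill over one connected region,
            # tracking the extreme coordinates as the bounding box.
            top, left, bottom, right = r, c, r, c
            queue = deque([(r, c)])
            seen[r][c] = True
            while queue:
                y, x = queue.popleft()
                top, bottom = min(top, y), max(bottom, y)
                left, right = min(left, x), max(right, x)
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    inside = 0 <= ny < rows and 0 <= nx < cols
                    if inside and mask[ny][nx] and not seen[ny][nx]:
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            boxes.append((top, left, bottom, right))
    return boxes


# Toy mask with two separate character blobs.
toy_mask = [
    [0, 1, 1, 0, 0, 0],
    [0, 1, 1, 0, 1, 1],
    [0, 0, 0, 0, 1, 1],
]
print(boxes_from_mask(toy_mask))  # [(0, 1, 1, 2), (1, 4, 2, 5)]
```

Because each connected region yields exactly one box, there are no overlapping candidate boxes to suppress, which is one plausible reading of why an image-to-image formulation can drop NMS.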
Keyword: Text detection; Camera-captured document images; Multi-scale context features; Generative adversarial networks
DOI: 10.1007/s10032-020-00358-w
WOS Keyword: SEGMENTATION
Indexed By: SCI
Language: English
Funding Project: National Natural Science Foundation of China (NSFC) [71621002]; Key Programs of the Chinese Academy of Sciences [ZDBS-SSW-JSC003]; Key Programs of the Chinese Academy of Sciences [ZDBS-SSW-JSC004]; Key Programs of the Chinese Academy of Sciences [ZDBS-SSW-JSC005]
Funding Organization: National Natural Science Foundation of China (NSFC); Key Programs of the Chinese Academy of Sciences
WOS Research Area: Computer Science
WOS Subject: Computer Science, Artificial Intelligence
WOS ID: WOS:000558134800001
Publisher: SPRINGER HEIDELBERG
Document Type: Journal article
Identifier: http://ir.ia.ac.cn/handle/173211/40435
Collection: Institute of Automation, Chinese Academy of Sciences
Affiliation:
1. Chinese Acad Sci CASIA, Inst Automat, 95 Zhongguancun East Rd, Beijing 100190, Peoples R China
2. Univ Chinese Acad Sci UCAS, 19 A Yuquan Rd, Beijing 100049, Peoples R China
First Author Affiliation: Institute of Automation, Chinese Academy of Sciences
Corresponding Author Affiliation: Institute of Automation, Chinese Academy of Sciences
Recommended Citation
GB/T 7714: Zhao, Jinyuan, Wang, Yanna, Xiao, Baihua, et al. DetectGAN: GAN-based text detector for camera-captured document images[J]. INTERNATIONAL JOURNAL ON DOCUMENT ANALYSIS AND RECOGNITION, 2020: 11.
APA: Zhao, Jinyuan, Wang, Yanna, Xiao, Baihua, Shi, Cunzhao, Jia, Fuxi, & Wang, Chunheng. (2020). DetectGAN: GAN-based text detector for camera-captured document images. INTERNATIONAL JOURNAL ON DOCUMENT ANALYSIS AND RECOGNITION, 11.
MLA: Zhao, Jinyuan, et al. "DetectGAN: GAN-based text detector for camera-captured document images." INTERNATIONAL JOURNAL ON DOCUMENT ANALYSIS AND RECOGNITION (2020): 11.
Files in This Item:
There are no files associated with this item.
Items in the repository are protected by copyright, with all rights reserved, unless otherwise indicated.