Research on Scene Text Removal Methods Based on Generative Adversarial Networks
边学伟
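Subtype: Master's
Thesis Advisor: 叶军涛; 严冬明
2020-06
Degree Grantor: University of Chinese Academy of Sciences
Place of Conferral: Institute of Automation, Chinese Academy of Sciences
Degree Discipline: Computer Technology
Keyword: text removal; image inpainting; image segmentation; generative adversarial network; text detection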
Abstract

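Image-based scene understanding is one of the main research problems in computer vision. In many applications, text is an important carrier of scene information, so detecting, recognizing, or removing text in images has significant research value.
Text removal is a highly specialized area that has not yet attracted enough attention from researchers. Only a handful of deep-learning-based works address the task, and they perform unsatisfactorily on real data: text is removed incompletely and the results are visually poor. There are three reasons. First, these works do not adequately account for the particular characteristics of text. Second, their network designs leave room for improvement. Third, no dataset based on real data existed when these models were trained, so they could only be trained on synthetic datasets, which limited their performance on real data.
This thesis examines the existing problems in text removal in depth and makes the following main contributions:
1. This thesis proposes a text removal algorithm based on text stroke detection. The network built by this algorithm follows the basic framework of a generative adversarial network. The generator consists of a stacked text stroke detection network and text stroke removal network, and the discriminator is obtained by optimizing the discriminator of SN PatchGAN for the text removal task.
2. Because no dataset based on real data yet meets the requirements of the text removal task, this thesis constructs a real-world dataset that does. In building it, we took into account the language of the text in the image, the scene in which the text appears, and the color, font, and other characteristics of the text itself, resulting in a sufficiently diverse dataset.
Experimental results show that the dataset constructed in this thesis suits the text removal task better than existing synthetic datasets, and that the proposed text removal algorithm clearly outperforms existing algorithms.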

Other Abstract

Image-based scene understanding is one of the main research fields in computer vision. In many applications, text is an important carrier of scene information. Detecting, recognizing, or removing text in images is important in various scenarios, such as automatic translation and information hiding.
Text removal is a very specific problem that has not attracted enough attention; only a few deep-learning-based studies address it. These approaches perform unsatisfactorily on real data, leaving text incompletely removed and producing results of poor visual quality. This is partly because existing studies did not fully consider the particular characteristics of text, and partly because their network designs still need improvement. Another reason is that no real-world dataset meeting the demands of text removal existed when these studies were proposed, which limits the performance of these methods on real data.
This thesis studies the text removal problem, and proposes a novel method based on text stroke detection. The main contributions of this thesis can be summarized as follows:
1. We propose a text removal method based on text stroke detection. The network built by this method follows the basic framework of a generative adversarial network. The generator consists of a cascade of text stroke detection and text stroke removal networks. The discriminator is obtained by adapting the discriminator proposed in SN PatchGAN to the text removal task.
2. Because no dataset based on real data meets the requirements of text removal, we construct a real-world dataset that does. When constructing the dataset, we took into account the language of the text in the image, the scene in which the text appears, and the color, font, and other characteristics of the text itself, yielding a sufficiently diverse dataset.
Our results demonstrate that the proposed method outperforms existing text removal methods by a large margin, and that our real-world dataset is better suited to text removal than existing synthetic datasets.

Pages: 82
Language: Chinese
Document Type: Thesis
Identifier: http://ir.ia.ac.cn/handle/173211/39173
Collection: National Laboratory of Pattern Recognition
Recommended Citation
GB/T 7714
边学伟. 基于生成式对抗网络的场景文字消除方法研究[D]. 中国科学院自动化研究所. 中国科学院大学,2020.
Files in This Item:
File Name/Size | DocType | Version | Access | License
边学伟-基于生成式对抗网络的场景文字消除 (10619KB) | Thesis | | Open Access | CC BY-NC-SA

Items in the repository are protected by copyright, with all rights reserved, unless otherwise indicated.