Machine-Printed Symbol Recognition under Strong Noise (强噪声条件下的印刷体符号识别)
董建雄
Subtype: Master of Engineering
Thesis Advisor: 刘迎建
1999-05-01
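Degree Grantor: Institute of Automation, Chinese Academy of Sciences (中国科学院自动化研究所)
Place of Conferral: Institute of Automation, Chinese Academy of Sciences (中国科学院自动化研究所)
Degree Discipline: Pattern Recognition and Intelligent Systems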
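Abstract: Symbol recognition technology has made great progress over the past two decades. In recent years, however, society has placed higher demands on it. In specific domains such as the recognition of bank statements and tax documents, very high recognition accuracy and reliability are required, and conventional OCR technology no longer meets these requirements. Against this background, through an in-depth analysis of the symbol recognition problem, this thesis proposes a series of effective techniques, including a classifier combination scheme and a treatment of symbol recognition reliability, and applies them to an automatic processing system for value-added tax invoices. The main contents of the thesis are as follows. First, the background of research on machine-printed symbol recognition under strong noise is analyzed, the nature and difficulties of the problem are identified, the current state of symbol recognition technology is reviewed, and some effective techniques for improving the performance of symbol recognition systems are pointed out. For the BP classifier, a new training algorithm combined with the AdaBoost machine learning algorithm is proposed, which effectively improves the generalization ability of the neural network. In the comparison of features, several representative methods are compared experimentally, leading to the conclusion that combining feature vectors is an effective technique. The thesis also proposes a multi-recognizer combination scheme based on classifying samples by quality, which achieves very good results and offers a new approach to character recognition under noisy conditions. Finally, the concept of symbol recognition reliability is introduced, and a practical solution based on multi-expert decision and image context is proposed. Compared with existing techniques, the classifier combination method and the reliability scheme of this thesis are pioneering, and they clearly improve symbol recognition performance under strong noise.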
Other Abstract: Symbol recognition techniques have made great progress over the past twenty years. In recent years, however, practical needs have demanded more of symbol recognition. Some application fields, such as bank check and tax form recognition, require a very high recognition rate and high reliability. General OCR techniques do not meet these requirements. Against this background, the thesis presents effective techniques for combining multiple classifiers and for addressing reliability, based on a deep analysis of the nature of the problem. These techniques have been successfully applied to an automatic entry system for tax forms. The main contents of the thesis are as follows. First, the research background of machine-printed symbol recognition under heavy noise is analyzed. The nature of the problem and its difficulties are pointed out. The state of symbol recognition research is also surveyed, and some effective techniques for improving the performance of symbol recognition systems are pointed out. For the backpropagation classifier, a new training method combined with the AdaBoost algorithm is presented, which greatly improves the generalization of the neural network. In the feature comparison, several existing methods are implemented and compared experimentally, leading to the conclusion that combining feature vectors is an effective technique. In addition, the thesis presents a new method for combining multiple classifiers based on the quality of training samples, and a good result is obtained. It provides an original idea for character recognition under noisy conditions. Finally, the concept of symbol recognition reliability is introduced, and a practical method based on the vote of multiple experts and image context is proposed. Relative to the current state of symbol recognition research, the multi-classifier combination method and the reliability solution in this thesis are novel. The performance of symbol recognition under heavy noise is clearly improved.
Shelf Number: XWLW555
Other Identifier: 555
Language: Chinese
Document Type: Degree thesis
Identifier: http://ir.ia.ac.cn/handle/173211/7287
Collection: Graduates_Master's Theses
Recommended Citation
GB/T 7714
董建雄. 强噪声条件下的印刷体符号识别[D]. 中国科学院自动化研究所. 中国科学院自动化研究所,1999.
Items in the repository are protected by copyright, with all rights reserved, unless otherwise indicated.