English abstract | Financial Document Analysis and Recognition Systems (FDARS) have recently become an active research topic, covering form classification, image processing, character segmentation and recognition, document image coding, and related problems. This dissertation studies several key components of FDARS. In addition, a practical FDARS has been implemented and deployed in more than one hundred bank-related systems. The research work can be summarized as follows: 1. We introduce a hierarchical method for classifying financial documents using a binary decision tree with three classifiers: the first performs elastic matching of the form structure, the second recognizes the title of the document, and the third confirms the color of the document. These classifiers process a given document image hierarchically. 2. A segmentation and recognition system based on the Viterbi algorithm is proposed for touching and broken printed numeral strings. The system comprises two steps. In the second step, a segmentation method first finds nonlinear character segmentation paths by combining gray-scale and binary information using a Viterbi algorithm; a recognition method then uses a Viterbi algorithm to dynamically combine and recognize the character candidates, using the reliabilities produced by the recognizer. 3. A boosting-based feature combination strategy is introduced, in which a variant of boosting is proposed to integrate different features. Unlike standard boosting, each round of this variant builds several weak classifiers on different feature sets, each trained on one feature set. These classifiers are then combined by weighted voting into a single classifier, which is the output classifier of that round. 4. An image compression algorithm with Region-Of-Interest (ROI) coding based on JPEG2000 is proposed for financial document images. Three types of ROIs (filled-information ROIs, seal ROIs, and handwriting ROIs) are detected and extracted through document knowledge analysis and handwriting identification. An ROI mask of arbitrary shape is constructed by thresholding and merging these ROIs. Finally, the financial document image is encoded using JPEG2000 Part 1 with this ROI mask. Compared with JPEG and DjVu, the method improves visual quality while reducing storage space. 5. Based on the above techniques, we have designed and implemented a financial document analysis and recognition system, which has been applied in many finance-related systems at Chinese banks. |
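The Viterbi-style search for a nonlinear segmentation path mentioned in item 2 can be illustrated with a small dynamic-programming sketch. This is not the dissertation's implementation. The cost function `pixel_cost`, its `weight` parameter, and the rule that a path may shift by at most one column per row are assumptions made here for illustration; the abstract only states that gray-scale and binary information are combined.

```python
def pixel_cost(gray, binary, weight=0.5):
    """Hypothetical per-pixel cost: high on ink, low on background.

    gray   -- rows of gray levels in 0..255 (0 is black)
    binary -- rows of 0/1 flags (1 marks an ink pixel)
    weight -- assumed mixing factor between the two sources
    """
    return [
        [weight * b + (1.0 - weight) * (1.0 - g / 255.0)
         for g, b in zip(gray_row, binary_row)]
        for gray_row, binary_row in zip(gray, binary)
    ]


def viterbi_cut_path(cost):
    """Return one column index per row for the cheapest top-to-bottom path.

    Assumption: between consecutive rows the path moves at most one
    column left or right, which is what lets the cut bend around strokes.
    """
    rows, cols = len(cost), len(cost[0])
    total = [list(cost[0])]   # best accumulated cost ending at each pixel
    back = [[0] * cols]       # column chosen in the previous row
    for r in range(1, rows):
        total_row, back_row = [], []
        for c in range(cols):
            candidates = range(max(c - 1, 0), min(c + 1, cols - 1) + 1)
            prev = min(candidates, key=lambda p: total[r - 1][p])
            back_row.append(prev)
            total_row.append(cost[r][c] + total[r - 1][prev])
        total.append(total_row)
        back.append(back_row)
    # Backtrack from the cheapest end point in the last row.
    col = min(range(cols), key=lambda c: total[-1][c])
    path = [col]
    for r in range(rows - 1, 0, -1):
        col = back[r][col]
        path.append(col)
    return path[::-1]
```

Calling `viterbi_cut_path(pixel_cost(gray, binary))` on a cropped region between two touching numerals would return a candidate cut as a list of column indices, one per image row. A straight vertical cut is the special case in which every index is equal.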