With the development of computing technology and the emergence of a variety of digitization devices, a growing number of paper documents need to be transformed into electronic form to facilitate convenient search, editing and transmission. Handwritten Chinese character recognition has been studied for nearly half a century and has made tremendous advances. However, most work has focused on isolated character recognition or constrained handwriting recognition. Only in recent years has the analysis of unconstrained handwritten documents received intensive effort, and the reported performance is still very low. This thesis presents a systematic study of handwritten Chinese text recognition that considers four major issues: the path evaluation criterion, the path search algorithm, language model adaptation, and the integration of common sense knowledge. The path evaluation and search techniques are crucial for handwritten text recognition. To achieve higher recognition performance, we introduce language model adaptation and integrate common sense knowledge. The main contributions of this work are as follows:
1. From the Bayesian decision view, we propose a statistical path evaluation function. This function integrates multiple contexts, including character classification, geometric context and linguistic context, with the classifier outputs converted to posterior probabilities via confidence transformation. To overcome the path length bias, we present several heuristically modified path evaluation criteria. Moreover, to balance the effects of the different context models, the combining weights are optimized by a supervised parameter learning method using a maximum character accuracy criterion.
2. To efficiently find the path with the maximum evaluation score, we present a refined beam search algorithm that splits the pruning strategy into two stages. Meanwhile, we improve the completeness of the candidate paths by augmenting candidate characters based on character confusion information and the linguistic context.
3. To overcome the mismatch between the given language model and the text being recognized, we present a dynamic, unsupervised language model adaptation method. Using a two-pass recognition strategy, we choose the best-matched language model from a pre-defined language model set to further improve recognition performance. Meanwhile, to overcome the storage problem of the pre-defined language model set, we compress each language model to a moderate size using two metho...
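To illustrate the path evaluation idea in the first contribution, the following is a minimal sketch, not the thesis's actual formulation. It assumes each character hypothesis on a segmentation path carries three probabilities (classifier posterior, geometric context, linguistic context), combines them as a weighted sum of log-probabilities, and uses division by the number of characters as one possible heuristic against path length bias. The names `CharCandidate`, `path_score`, `w_geo` and `w_lang`, and all numeric values, are hypothetical.

```python
import math
from dataclasses import dataclass


@dataclass
class CharCandidate:
    """One character hypothesis on a candidate path (illustrative fields)."""
    label: str
    p_class: float  # classifier output after confidence transformation
    p_geo: float    # geometric context probability (assumed given)
    p_lang: float   # language model probability (assumed given)


def path_score(path, w_geo=1.0, w_lang=1.0, normalize=True):
    """Weighted sum of log-probabilities over the characters of a path.

    With normalize=True the sum is divided by the path length, which is
    one simple (assumed) way to reduce the bias toward shorter paths.
    """
    total = 0.0
    for c in path:
        total += (math.log(c.p_class)
                  + w_geo * math.log(c.p_geo)
                  + w_lang * math.log(c.p_lang))
    if normalize and path:
        total /= len(path)
    return total


# Two competing segmentations of the same text line (made-up numbers):
# a 2-character path with weaker evidence per character,
# and a 3-character path with stronger evidence per character.
short_path = [CharCandidate("a", 0.8, 0.8, 0.8) for _ in range(2)]
long_path = [CharCandidate("b", 0.85, 0.85, 0.85) for _ in range(3)]

for name, path in [("short", short_path), ("long", long_path)]:
    print(name,
          round(path_score(path, normalize=False), 4),
          round(path_score(path, normalize=True), 4))
```

In this toy case the unnormalized sum prefers the shorter path simply because it accumulates fewer negative log terms, while the normalized score prefers the path whose characters are individually better supported. In the thesis the combining weights are learned under a maximum character accuracy criterion; here they are fixed constants for illustration only.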