Handwritten Chinese character recognition (HCCR) has been studied by many researchers because of its theoretical significance and its potential in many applications. Recognition of constrained handwritten Chinese characters has improved greatly. However, unconstrained handwritten Chinese character recognition has received far less study, and its accuracy is still not satisfactory, which restricts many applications of character recognition. In this thesis, three algorithms are proposed to alleviate the limitations of some existing HCCR methods: learning a better character normalization method, designing a fast and effective classifier combination method, and exploring a similar character discrimination method that is better suited to unconstrained handwritten Chinese characters. The main work and contributions are as follows:

First, a visual word density (VWD) based nonlinear normalization method is proposed. In contrast to traditional nonlinear normalization methods, which only minimize the within-class variance, the proposed method minimizes the ratio of within-class variance to between-class variance, so that both variances are taken into account. Moreover, feature extraction is involved in the learning procedure of the proposed method, which couples normalization and feature extraction more closely than in traditional approaches. This is beneficial for classification. Experimental results on constrained and unconstrained handwritten Chinese character databases show that the proposed method outperforms traditional normalization methods, including dot density based and line density based nonlinear normalization.

Second, a fast self-generation voting (FSGV) based method is proposed, taking the characteristics of HCCR into account.
Combining classifiers can exploit their individual advantages to reach better overall performance than any of them achieves separately. Because of the large number of categories and the lack of training samples, directly applying most existing classifier combination methods to the HCCR problem fails to perform well. In the proposed method, a virtual testing set is first generated by the proposed fast self-generation method. Then each sample in the virtual testing set is classified by a baselin...
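
As a rough illustration of the criterion behind the VWD normalization described above, the following sketch computes a ratio of within-class variance to between-class variance on toy feature vectors. It is not the learning procedure of the thesis. The function names `variance_ratio` and `sq_dist` are hypothetical, and the scatter definitions used here (squared distances of samples to their class mean, and class-size-weighted squared distances of class means to the global mean) are assumptions chosen for illustration.

```python
from collections import defaultdict


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))


def variance_ratio(features, labels):
    """Return within-class scatter divided by between-class scatter.

    features: list of equal-length feature vectors (lists of floats)
    labels:   list of class labels, one per feature vector
    Both scatter definitions are illustrative assumptions, not the thesis's.
    """
    by_class = defaultdict(list)
    for x, y in zip(features, labels):
        by_class[y].append(x)

    dim = len(features[0])
    n = len(features)
    global_mean = [sum(x[d] for x in features) / n for d in range(dim)]

    within = 0.0
    between = 0.0
    for samples in by_class.values():
        mean = [sum(x[d] for x in samples) / len(samples) for d in range(dim)]
        # Within-class scatter: each sample's squared distance to its class mean.
        within += sum(sq_dist(x, mean) for x in samples)
        # Between-class scatter: class mean's squared distance to the global
        # mean, weighted by the number of samples in the class.
        between += len(samples) * sq_dist(mean, global_mean)

    if between == 0.0:
        raise ValueError("between-class scatter is zero; ratio is undefined")
    return within / between


# Toy 1-D features: the same four values, grouped into classes two ways.
compact = variance_ratio([[0.0], [2.0], [10.0], [12.0]], ["a", "a", "b", "b"])
mixed = variance_ratio([[0.0], [10.0], [2.0], [12.0]], ["a", "a", "b", "b"])
```

A lower ratio means the classes are compact relative to their separation. A criterion that minimizes only the numerator would ignore how far apart the class means are.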