With the development of deep learning, character string recognition methods have
advanced considerably. However, existing string recognition methods are insufficient
in locating characters and outputting reliable confidence. Therefore, this thesis
studies character string recognition methods based on sliding windows and
connectionist temporal classification (CTC), achieving better convergence and
improved recognition accuracy of the model.
The main contributions of the thesis are summarized as follows:
1. An improved CTC method based on pseudo-label distribution is proposed.
Our theoretical analysis of the CTC algorithm shows that it can be explained as
an expectation-maximization (EM) algorithm for sequence recognition. Using the
model prediction of each frame, CTC estimates the pseudo-label distribution through
the forward-backward algorithm and trains the model with cross-entropy loss. Based
on this explanation, an improved CTC method is proposed, which contains two im-
proved strategies: a regularization strategy based on pseudo-label distribution and a
voting-based decoding algorithm. Experiments on handwritten digit string recognition
and handwritten English text line recognition show that our methods can improve the
convergence and recognition accuracy of the CTC string recognition method.
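The EM reading of CTC described above can be illustrated with a minimal NumPy sketch. It is not the thesis implementation: the function names `ctc_pseudo_labels` and `pseudo_label_cross_entropy` are hypothetical, the blank class is assumed to be index 0, the per-frame probabilities are assumed strictly positive, and the computation is done in plain probability space, so it is only suitable for short sequences. The proposed regularization and voting-based decoding are not shown.

```python
import numpy as np

def ctc_pseudo_labels(probs, label, blank=0):
    """E-step sketch: per-frame pseudo-label distribution from forward-backward.

    probs: (T, C) per-frame class probabilities (e.g. softmax outputs).
    label: list of class indices without blanks; assumed feasible for T frames.
    Returns a (T, C) array whose rows sum to 1.
    """
    T, C = probs.shape
    # Extended label sequence: blank, c1, blank, c2, ..., blank
    ext = [blank]
    for c in label:
        ext += [c, blank]
    S = len(ext)

    # Forward pass: alpha[t, s] = probability of all valid prefixes ending in state s at frame t
    alpha = np.zeros((T, S))
    alpha[0, 0] = probs[0, ext[0]]
    if S > 1:
        alpha[0, 1] = probs[0, ext[1]]
    for t in range(1, T):
        for s in range(S):
            a = alpha[t - 1, s]
            if s >= 1:
                a += alpha[t - 1, s - 1]
            # Skipping a blank is allowed only between two different characters
            if s >= 2 and ext[s] != blank and ext[s] != ext[s - 2]:
                a += alpha[t - 1, s - 2]
            alpha[t, s] = a * probs[t, ext[s]]

    # Backward pass: beta[t, s] = probability of all valid suffixes starting in state s at frame t
    beta = np.zeros((T, S))
    beta[T - 1, S - 1] = probs[T - 1, ext[S - 1]]
    if S > 1:
        beta[T - 1, S - 2] = probs[T - 1, ext[S - 2]]
    for t in range(T - 2, -1, -1):
        for s in range(S):
            b = beta[t + 1, s]
            if s + 1 < S:
                b += beta[t + 1, s + 1]
            if s + 2 < S and ext[s + 2] != blank and ext[s + 2] != ext[s]:
                b += beta[t + 1, s + 2]
            beta[t, s] = b * probs[t, ext[s]]

    # alpha * beta counts probs[t, ext[s]] twice, so divide it out once
    gamma = alpha * beta / probs[:, ext]
    gamma /= gamma.sum(axis=1, keepdims=True)

    # Merge states that share a class (all blanks, repeated characters)
    pseudo = np.zeros((T, C))
    for s, c in enumerate(ext):
        pseudo[:, c] += gamma[:, s]
    return pseudo

def pseudo_label_cross_entropy(probs, pseudo):
    """M-step objective sketch: cross-entropy between pseudo-labels and predictions."""
    return -np.sum(pseudo * np.log(probs))
```

In this sketch the pseudo-labels are treated as fixed targets when the cross-entropy is minimized, which mirrors the alternation between estimating the distribution and updating the model.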
2. A string recognition method based on a convolutional prototype classifier is
proposed.
The convolutional prototype classifier has yielded high character recognition
accuracy and reliable output confidence. Therefore, we use the convolutional
prototype classifier for sliding-window classification in string recognition. In the
end-to-end training of the model, a character position estimation step is added to
improve the alignment
by concentrating on more accurately aligned frames. Experimental results on handwrit-
ten digit strings, handwritten English, and Chinese text lines show that this method can
achieve competitive recognition performance compared to state-of-the-art methods.
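As a rough illustration of how a prototype classifier can score sliding windows, the following NumPy sketch assigns each window feature to classes by its squared Euclidean distance to one prototype per class. This is a generic distance-based formulation under stated assumptions, not the thesis model: the convolutional feature extractor and the character position estimation step are omitted, and the names `prototype_posteriors` and `scale` are hypothetical.

```python
import numpy as np

def prototype_posteriors(features, prototypes, scale=1.0):
    """Per-window class probabilities from distances to class prototypes.

    features:   (T, D) one feature vector per sliding window (assumed given).
    prototypes: (C, D) one prototype per class (assumed; could be learned).
    scale:      assumed temperature on the squared distances.
    """
    # Squared Euclidean distance between every window and every prototype: (T, C)
    d2 = ((features[:, None, :] - prototypes[None, :, :]) ** 2).sum(axis=-1)
    # A closer prototype gives a larger logit; subtract the row max for stability
    logits = -scale * d2
    logits -= logits.max(axis=1, keepdims=True)
    p = np.exp(logits)
    return p / p.sum(axis=1, keepdims=True)
```

Under this formulation, a window far from every prototype yields large distances to all classes, which is one reason distance-based outputs are often used when a confidence value is needed.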