With the development of information technology, video has become an important medium for transmitting and obtaining information. Text in videos is a rich source of high-level semantics and provides useful cues for video logging and indexing. If text occurrences can be detected, segmented and recognized automatically, they can be used for content-based video search, automatic video logging, text-based video indexing and similar applications. This paper focuses on Video OCR, which aims to extract the text information in videos and covers video text localization, tracking, segmentation and recognition. The main contributions are as follows:
1) Four datasets are established and labeled: CASIA-TRAIN, CASIA-IMAGE, CASIA-TEXT and CASIA-VIDEO. They support training and testing of text texture classifiers, text localization, text segmentation and video text recognition.
2) We introduce the Text-Noise-Ratio (TNR) as a measure of the complexity of the text background, and give an approximate calculation of TNR for the case where the text region is unknown. Based on this measure, a novel text localization method is proposed. Its basic idea is to handle text on different backgrounds with different methods. Experimental results show that the method performs well across different backgrounds.
3) We propose a new texture feature for precise text detection. First, the text region is partitioned into 64 blocks; then the GSC feature is extracted to eliminate the influence of the background, and the EOH feature is extracted to express the statistical characteristics of the text in the different blocks. Comparative experiments show that the proposed feature is effective for precise text detection.
4) For text segmentation, we propose a new method based on stroke and color. First, candidate text regions are extracted with a stroke operator to build a model of the text color distribution; second, a coarse segmentation is carried out with the color model; finally, text color consistency is used to remove noise from the coarse segmentation result. Experimental results show that the method is robust to non-text noise.
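As an illustration of the block-wise feature in contribution 3), the following is a minimal sketch of a per-block orientation histogram. It assumes that EOH denotes an edge orientation histogram and that the 64 blocks form an 8 x 8 grid; the abstract states neither. The function name `block_eoh`, the number of orientation bins, the gradient operator and the normalization are all hypothetical choices made for this example and are not taken from the paper. The GSC feature is not covered.

```python
import numpy as np

def block_eoh(gray, grid=8, bins=8):
    """Per-block edge orientation histograms for a grayscale text region.

    Assumptions (not from the paper): an 8 x 8 grid gives the 64 blocks,
    orientations are unsigned and quantized into `bins` bins, and each
    block histogram is weighted by gradient magnitude and sum-normalized.
    """
    gray = np.asarray(gray, dtype=np.float64)
    gy, gx = np.gradient(gray)                # derivatives along rows, columns
    mag = np.hypot(gx, gy)                    # gradient magnitude
    ang = np.mod(np.arctan2(gy, gx), np.pi)   # unsigned orientation in [0, pi)
    # Quantize orientation; clamp guards against rounding up to `bins`.
    bin_idx = np.minimum((ang / np.pi * bins).astype(int), bins - 1)

    h, w = gray.shape
    ys = np.linspace(0, h, grid + 1).astype(int)  # block row boundaries
    xs = np.linspace(0, w, grid + 1).astype(int)  # block column boundaries

    feats = np.zeros((grid, grid, bins))
    for i in range(grid):
        for j in range(grid):
            m = mag[ys[i]:ys[i + 1], xs[j]:xs[j + 1]].ravel()
            b = bin_idx[ys[i]:ys[i + 1], xs[j]:xs[j + 1]].ravel()
            hist = np.bincount(b, weights=m, minlength=bins)
            total = hist.sum()
            if total > 0:                     # leave flat blocks as all zeros
                hist = hist / total
            feats[i, j] = hist
    return feats.reshape(-1)                  # length grid * grid * bins
```

With the default settings the function returns a vector of 64 x 8 = 512 values for any input region of at least 8 x 8 pixels. Such a vector could then be passed to a text/non-text classifier, though the classifier used in the paper is not specified in the abstract.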