Object classification and detection are among the fundamental problems in computer vision and pattern recognition. They are also critical steps that directly influence many other computer vision tasks, such as object tracking, action recognition and scene understanding. They can be applied in many areas, e.g., visual surveillance, biometrics and image retrieval, and they form an interdisciplinary topic linking computer vision with other domains such as medical imaging, neuroscience and visual psychology. Most current work on object classification and detection is based on Marr’s computational vision theory. These studies, however, ignore cognitive mechanisms. For example, they do not recognize the importance of visual saliency in feature representation. As a result, there is a large gap between computer vision systems and the human visual system in many respects, e.g., robustness and efficiency. In this thesis, we attempt to address these issues. Our contributions include: 1) We improve the well-known HMAX model and propose an enhanced biologically inspired model. Compared with the original model, ours is at least 20 times faster while also improving object classification accuracy. 2) We study the visual saliency of local features and develop salient coding for the codebook based model. We show, through geometric and numerical analysis, that salient coding outperforms other coding methods. Moreover, the computational complexity of salient coding is much lower than that of the best current coding schemes. 3) We present a codebook graph based model. This model is, to the best of our knowledge, the first to model the relations between visual words in feature space. The new framework greatly improves the accuracy of object classification. We also prove that traditional codebook models are special cases of our framework.
The proposed model achieves state-of-the-art performance on a number of popular object classification databases. 4) We integrate scene and object class information into object detection, which improves detection accuracy. The system based on this work ranked as the best-performing object detection algorithm in PASCAL VOC2010. 5) We propose a global-to-local shape description method inspired by the topological perceptual organization theory. It addresses the problem that traditional shape classification methods ignore global propert...