Object categorization is an important part of research on image understanding. With the spread of image capture devices and the development of Internet technology, enabling computers to classify images automatically by their content has promising applications. Research on object categorization also advances image understanding more broadly, and its results can be applied to many other areas, such as intelligent surveillance, intelligent transportation, and video annotation. A large body of recent literature addresses object categorization based on local features. This dissertation systematically analyzes the existing algorithms and summarizes the problems that remain to be solved. Specifically, it focuses on local feature extraction and local feature learning. The main contributions are:
(1) Constructing a novel partially contextual descriptor to improve the discriminative power of local features. A contextual descriptor of a keypoint is constructed from the distribution of the other keypoints in its local neighborhood. By further dividing that neighborhood spatially, a partially contextual descriptor is constructed that explicitly takes the relative spatial information of keypoints into account. Experimental results show that the proposed descriptor is superior to traditional local descriptors.
(2) Learning a discriminative function to measure the similarity of two keypoints instead of using a heuristic predefined distance. Multiple instance learning is used to model the appearance of keypoints that correspond to the same semantic concept in different images. Within the AdaBoost framework, the most discriminative local features are selected and combined to form the final object classifier. Experimental results show the effectiveness of the proposed method.
(3) Proposing an Image Embedding Space to obtain the discriminative combination pattern of local features for each image. Discriminative combination patterns reflect the characteristics of different sub-categories. AdaBoost implicitly combines the representative sub-categories to form the final classifier. Experiments show that learning the discriminative combination pattern of local features for each image is more effective than serially learning single discriminative local features.
(4) Furthermore, the Embedding Space can also be applied to the fusion of multiple similarities. In each embedding space, ...