英文摘要 | Image understanding, including image matching, image classification and object detection, is a basic issue and important link in computer vision and pattern recognition. It is a fundamental issue in camera calibration, image based 3D reconstruction, content based image retrieval, object tracking, action recognition and so on. It affects the research and development of intelligent visual surveillance, web image and video understanding and management, large scale visual data mining. Besides, it provides computational experiments for the cognitive sciences and help us to understand our brain better. Recently, the popular work on image understanding algorithm are usually based on image processing techniques with statistic based machine learning algorithms. These studies obtained amazing development in past ten years, and in some areas, they got successful applications. However, the results of current algorithms still have a gap between computer and human brain, in the challenge on discrimination and robustness of view, illumination, deformation, occlusion etc. These challenges are the most difficult problems in this area. Latent variable based pattern recognition is a new and promising model. The latent variable based model consider the observed data and the latent data simultaneously, modeling both the observed data and latent data, toward a more comprehensive, discriminative and robust model. This thesis mainly focuses on image recognition, especially the image matching, image classification and object recognition. We attempt to modeling the traditional problem under the latent variable model framework and design new model and learning algorithm, including: 1) We model the view and illumination change in image matching with latent variables. We consider that there is no full invariant local feature detector and descriptor. We mainly focus on the challenge of image matching when large view and illumination change. With the latent view and illumination variables, the image matching with large view and illumination will transmit to a low or no view and illumination change problem, which can be solved by traditional image matching algorithms. 2) We study the structure of image and import latent structure variable as a important issue in the traditional Bag of Visual Words model (BoVW). The traditional BoVW model ignores the spatial structure of image. In order to solve this problem, Spatial Pyramid Matching (SPM) is proposed. However, SPM model the s... |
修改评论