Image object recognition is one of the fundamental problems in computer vision. It covers several important tasks, including image classification, object detection, and image segmentation, and it must be solved before higher-level visual semantic analysis can be attempted. In recent years, image recognition based on Bag of Visual Words (BoVW) and deep learning has made great progress on many difficult image recognition datasets. However, there is still a large gap between computer vision systems and the human visual system. Large variations in scale, illumination, viewpoint, and deformation in real images, as well as severe occlusion, pose little difficulty for humans but remain major challenges for image recognition algorithms. Investigating these difficulties, improving feature representation theory on the basis of visual perception theory, and bridging the gap between human vision and computer vision are therefore of great theoretical value and in pressing practical demand. As more image recognition systems are deployed in real applications, the explosion of big data also poses new challenges and creates new requirements for computer vision algorithms. Efficient feature extraction methods and online learning frameworks suitable for parallel computing have become important research topics.

In this thesis, starting from the basic idea of hierarchical feature learning, we attempt to improve the image recognition system with the following contributions:

1. We study in depth the basic feature learning modules for image object recognition. To better understand the basic feature learning unit, we carry out the following work:
1) We propose a Local Hypersphere Coding (LHC) algorithm that performs feature encoding based on the differences between visual words. As a single-layer feature learning algorithm, LHC produces more discriminative feature representations and better image recognition performance.
2) We propose a maximum correntropy auto-encoder (MCAE), which learns more robust and discriminative representations than MSE-based models by performing computation in an infinite-dimensional kernel space.

2. We study hierarchical feature learning for image object recognition in depth. Our contributions include:
1) We exploit the power of kernels by learning a kernel embedding neural network that explicitly maps data from Euclidean space to an approximated kernel space.
2) We propose a convolutional nonlinear f...