English abstract | Vision is the primary way in which human beings perceive and understand the world. People have long dreamed of reproducing human visual abilities in artificial systems. Visual classification is one of the most attractive, essential, yet challenging visual functions, and it is of great importance for many practical applications and areas, including intelligent robots, human-machine interaction, information retrieval, and security surveillance. In the past decade, research on visual classification methods has achieved a series of milestones owing to advances in related fields such as image processing, machine learning, and pattern recognition. However, there are still large gaps between artificial visual classification systems and biological visual systems in accuracy, generalizability, stability, and learning efficiency. Meanwhile, recent neuroscience findings about visual cognitive abilities have brought new insights and new ways of building artificial systems. Following these ideas, this thesis aims at developing visual classification algorithms inspired by cognitive mechanisms in vision, resulting in the accomplishments listed below: 1. A novel piece-wise linear classifier applicable to binary visual classification problems is proposed, based on a hierarchical structure and a max-pooling mechanism, together with the corresponding training algorithm. Compared with the linear classifiers usually employed in visual tasks, the proposed method produces higher accuracy, with enhanced invariance to general intra-class variance in appearance. Compared with other non-linear classifiers, such as kernel-based classifiers, the proposed method achieves competitive accuracy while significantly improving computational efficiency during prediction.
2. Building on the preceding classifier design, the HMAX model, a biologically inspired visual model, is improved by adding new pooling layers that imitate the connections between view-tuned and view-invariant neurons found in high-level visual cortex. Accordingly, a full scheme for building the model is devised to suit the features of the new model, including methods for template selection, incremental learning for the initial construction of the model, and fine-tuning of the model. In the classification of natural images, the new model exhibits better accuracy and efficiency in comparison ... |
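
To make the idea in item 1 concrete, the following is a minimal sketch of one simple way a max-pooling step over linear units yields a piece-wise linear binary classifier. It is an illustration under stated assumptions, not the thesis's method: the abstract does not specify the hierarchical structure or the training algorithm, so the single pooling layer, the function names (`max_pooled_linear_score`, `predict`) and the toy parameters `W` and `b` are all hypothetical.

```python
import numpy as np


def max_pooled_linear_score(x, weights, biases):
    """Return max_k (w_k . x + b_k) over K linear units.

    weights: array of shape (K, D), one row per linear unit (hypothetical).
    biases:  array of shape (K,).
    The maximum of several linear functions is piece-wise linear in x.
    """
    return float(np.max(weights @ x + biases))


def predict(x, weights, biases):
    """Binary decision: +1 if the pooled score is non-negative, else -1."""
    return 1 if max_pooled_linear_score(x, weights, biases) >= 0.0 else -1


# Hypothetical toy parameters: two linear units in a 2-D feature space.
W = np.array([[1.0, 0.0],
              [0.0, 1.0]])
b = np.array([-1.0, -1.0])
```

In this sketch a sample is labeled positive as soon as any one linear unit responds strongly enough, which is one way a pooled model can tolerate several distinct appearances within a class. Prediction costs K dot products, whereas a kernel classifier evaluates one kernel per support vector; this matches the efficiency argument in the abstract only in spirit.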