English abstract | Object detection is a fundamental problem in computer vision, and its performance directly influences many other problems such as object tracking, 3D reconstruction, behavior analysis and scene understanding. Object detection also has wide application in visual surveillance systems, biometrics, human-machine interfaces, content-based multimedia retrieval, computational advertising and driverless cars. Although there is a vast literature on this topic, object detection remains a very challenging research problem. A general algorithm pipeline for object detection includes object representation, machine learning and optimization, a window sampling strategy and post-processing. Building a robust object representation is the most important of these factors. In this thesis, we address this issue from the perspective of visual structure representation and modeling. Our contributions include: 1) By reviewing the established work and summarizing existing notions of "structure", we give a precise definition of visual structure. Moreover, we discuss how to study visual structure representation and modeling and give a technical roadmap. 2) Inspired by research in signal processing, scale space theory and past successful cases, we propose the Local Structured Descriptor (LSD). At the system level, we develop a boosted, Local Structured Descriptor based topological star model. With the proposed method, we entered the PASCAL VOC2010 challenge and were awarded the winner prize. 3) To make the topological model capable of capturing more flexible structure, we propose spatial mixture modeling for part-based models. We first reduce the space and time complexity of the model with a data decomposition framework, and then discuss the proposed spatial mixture modeling. The proposed spatial mixture model is more robust to multi-view and multi-pose variation. 
In 2011, we participated in the PASCAL VOC2011 challenge and were awarded the winner prize again, which indicates the leading position of the method in object detection. 4) The previous work is all based on manually designed topological structures, whereas our goal is to learn the structure topology from data. Motivated by this, we propose a framework of data-driven automatic structure learning for object detection. The experimental results show that the developed method handles occlusion, deformation and cluttered backgrounds well. 5) Through a quantitative analysis of the previous methods, we find the recall rate ... |