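Thesis Advisor: 谭铁牛
Degree Grantor: 中国科学院研究生院 (Graduate University of the Chinese Academy of Sciences)
Place of Conferral: Beijing
Keywords: environmental constraints; image classification; gait recognition; person re-identification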




Other Abstract: With the development of society, and especially the growing dissemination of digital information and the popularity of the Internet, an increasing amount of data is available for computer vision research. However, even the best computer vision systems to date cannot match the human visual system in many applications. Moreover, a large portion of existing work is limited in scope and ignores the connection between data and the environment. As a result, many such methods can make low-level errors in practice.

This thesis focuses on exploring environmental constraints in computer vision, especially depth information, topological structure, and spatial-temporal consistency. The main content and contributions of this thesis are as follows.

1. We propose a depth-embedded multiple pooling model for image classification. This model is built on top of the traditional bag-of-words method. We first use a Markov random field to estimate the depth of each pixel and then embed the depth information into the image features, which maps the feature space to a higher-dimensional one. During pooling, features are projected onto two adjacent depth planes; the benefit is that our model can distinguish features that cannot be separated within the feature space but can be separated along the depth direction. In experiments, our model outperforms the traditional bag-of-words method, especially for scene image classification.

2. We propose a gait recognition system based on topological structure. Topology is an inherent property of shapes: no matter how a person's walking pose changes or how he or she dresses, the topology of the gait silhouettes is unchanged. This is the so-called topological invariance. However, topological invariance alone lacks the discriminative power to distinguish objects that have similar structures but belong to different categories. Therefore, we use persistent homology to track the topology of data at multiple resolutions and from multiple views, which enhances its power to describe local details. The extracted topological features are sufficient for recognition tasks in computer vision. Experiments demonstrate that the proposed topological features outperform traditional gait features, especially in cross-view and cross-pose gait recognition.

3. We propose a person re-identification method based on spatial-temporal consistency. Person re-identification methods generally involve two key steps, namely feature learning and metric learning. Most previous works focus on only one of them. In this thesis, feature learning and metric learning are incorporated into an end-to-end deep neural network. Using a temporal attention model, we measure the importance of each frame in a pedestrian video, which is useful for choosing more informative frames and improving feature learning. The spatial recurrent model is designed to explore contextual information spatially, which has been experimentally shown to be effective for metric learning.
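
The three contributions above can be illustrated with a minimal Python sketch. It is not the thesis implementation: the abstract gives no formulas, so every name here (`depth_plane_pooling`, `h0_persistence`, `temporal_attention_pool`) is hypothetical, and the specific choices (linear interpolation weights and max pooling over depth planes, 0-dimensional persistence computed on a point cloud, softmax over frame scores supplied by the caller rather than learned by a network) are assumptions made only for illustration.

```python
import math
from itertools import combinations


def depth_plane_pooling(codes, depths, num_planes=4):
    """Max-pool coded features onto depth planes (assumed scheme).

    codes: one code vector (list of floats) per local feature.
    depths: one depth in [0, 1] per feature. Requires num_planes >= 2.
    Each feature contributes to its two adjacent planes with linear weights.
    """
    dim = len(codes[0]) if codes else 0
    pooled = [[0.0] * dim for _ in range(num_planes)]
    for code, d in zip(codes, depths):
        pos = min(max(d, 0.0), 1.0) * (num_planes - 1)
        lower = min(int(pos), num_planes - 2)
        frac = pos - lower
        for plane, w in ((lower, 1.0 - frac), (lower + 1, frac)):
            for k, v in enumerate(code):
                pooled[plane][k] = max(pooled[plane][k], w * v)
    return [v for plane in pooled for v in plane]  # concatenate planes


def h0_persistence(points):
    """(birth, death) pairs of connected components as the scale grows."""
    n = len(points)
    parent = list(range(n))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    edges = sorted((math.dist(points[i], points[j]), i, j)
                   for i, j in combinations(range(n), 2))
    pairs = []
    for length, i, j in edges:
        ri, rj = find(i), find(j)
        if ri != rj:  # two components merge, so one of them dies here
            parent[ri] = rj
            pairs.append((0.0, length))
    if n:
        pairs.append((0.0, math.inf))  # one component never dies
    return pairs


def temporal_attention_pool(frame_features, scores):
    """Softmax the per-frame scores and average the frame features."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]  # shifted for stability
    total = sum(exps)
    weights = [e / total for e in exps]
    dim = len(frame_features[0])
    pooled = [sum(w * f[k] for w, f in zip(weights, frame_features))
              for k in range(dim)]
    return weights, pooled
```

In the first function, two images that contain the same codes at swapped depths produce different pooled vectors, whereas plain max pooling over the whole image would not separate them; this is the kind of depth-direction separation the abstract describes. The second function tracks only connected components, the simplest case of persistent homology. The third treats frame importance as a normalized weight, so more informative frames dominate the pooled video feature.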
Document Type: Thesis
First Author Affiliation: Institute of Automation, Chinese Academy of Sciences
Recommended Citation
GB/T 7714
周振. 基于环境约束的视觉数据分析[D]. 北京. 中国科学院研究生院,2017.
Files in This Item:
File Name/Size: Thesis-name_V2.pdf (11422 KB); DocType: Thesis; Access: not yet open; License: CC BY-NC-SA

Items in the repository are protected by copyright, with all rights reserved, unless otherwise indicated.