|Place of Conferral||Beijing|
|Keyword||Pedestrian detection; deep learning; Faster R-CNN; cascade classifier; ACF|
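1. To address the difficulty the Faster R-CNN framework has with small targets, an end-to-end multi-scale model based on Faster R-CNN is proposed. The framework does not rely on additional proposal detection algorithms or feature extraction methods, and it learns and selects feature information at different scales on its own according to the size of the candidate boxes. The network structure is simple and easy to implement and extend.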
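2. Faster R-CNN is extended with a Neural Cascade structure to address hard example selection and the excess of false positive samples. Multiple weak classifiers are built from multi-layer perceptrons and combined into a strong classifier that judges in advance which candidate boxes are usable. The network is trained on the selected hard examples, which reduces its computation while further improving detection accuracy.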
Object detection is an enduring topic in computer vision. As a typical object detection task, pedestrian detection has significant practical applications in security, autonomous driving, intelligent surveillance, and robotics. After more than ten years of exploration, pedestrian detection technology has developed greatly, and a series of classic algorithms have been proposed. In recent years especially, the introduction of deep learning methods has brought substantial improvements to pedestrian detection models. Although deep learning based models can achieve good performance without hand-crafted features, their low efficiency and high hardware requirements may limit their practical use. As a consequence, various deep network models that trade off speed against accuracy have been proposed. Although excellent performance and efficiency have been achieved, pedestrian detection still has a long way to go for the following reasons. Firstly, pedestrians in road scenes are often too small to contain enough feature points, and most pedestrian detection frameworks perform poorly on small targets. Secondly, the road traffic environment is complex, and occlusion of pedestrians caused by various kinds of interference is serious. Thirdly, pedestrians often carry additional attributes, such as hats, bags, and suitcases, which increase the complexity of pedestrian detection.
In this paper, we study both traditional methods and deep learning models for pedestrian detection and integrate them to provide a new solution for this task. The main contributions of this paper can be summarized as follows:
1. An end-to-end multi-scale model based on Faster R-CNN is proposed to address the small-target problem in the Faster R-CNN framework. The model does not depend on external proposal detection or feature extraction methods, and it is capable of learning and selecting feature information at different scales based on the size of the candidate boxes. The structure of the model is simple and easy to implement and extend.
2. The model is extended on the basis of Faster R-CNN. A neural cascade structure is proposed to address hard example mining and the excess of false positive samples. A strong classifier composed of multiple weak classifiers based on MLPs (multi-layer perceptrons) is constructed and applied to judge in advance which candidate boxes are usable. Meanwhile, we train the network with the selected hard examples, which relieves the computational burden and improves the accuracy as well.
3. A variety of network structures are explored and analyzed, and a series of experiments is conducted to evaluate their performance. The end-to-end multi-scale model presented in this paper is fast and performs well under the corresponding evaluation protocols of certain public pedestrian detection datasets. Some crucial tricks used in the experiments are summarized and analyzed as well.
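The abstract does not give implementation details, so the following is only a minimal sketch of the two ideas named in contributions 1 and 2, under stated assumptions. The function names (`select_scale`, `make_mlp`, `cascade_filter`), the pixel thresholds, the cumulative-score rejection rule, and the toy weights are all hypothetical and are not taken from the thesis.

```python
import math
from typing import Callable, List, Sequence, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2) in pixels
Scorer = Callable[[Sequence[float]], float]


def select_scale(box: Box, thresholds: Sequence[float] = (32.0, 96.0)) -> int:
    """Pick a feature-scale index from the candidate box size.

    Assumption: index 0 is the finest scale, and the thresholds are
    illustrative side lengths in pixels, not values from the thesis.
    """
    x1, y1, x2, y2 = box
    side = math.sqrt(max(x2 - x1, 0.0) * max(y2 - y1, 0.0))
    for index, limit in enumerate(thresholds):
        if side < limit:
            return index
    return len(thresholds)


def make_mlp(hidden_weights: Sequence[Sequence[float]],
             output_weights: Sequence[float]) -> Scorer:
    """Build a tiny one-hidden-layer perceptron (ReLU, no biases) as a weak classifier."""
    def score(feature: Sequence[float]) -> float:
        hidden = [max(0.0, sum(w * x for w, x in zip(row, feature)))
                  for row in hidden_weights]
        return sum(w * h for w, h in zip(output_weights, hidden))
    return score


def cascade_filter(features: Sequence[Sequence[float]],
                   stages: Sequence[Tuple[Scorer, float]]) -> List[int]:
    """Return indices of candidates that survive every cascade stage.

    Assumption: the strong classifier is the running sum of weak scores,
    and a candidate is rejected as soon as the sum drops below a stage threshold.
    """
    kept = []
    for index, feature in enumerate(features):
        total = 0.0
        for score_fn, threshold in stages:
            total += score_fn(feature)
            if total < threshold:
                break  # early rejection: later stages are never evaluated
        else:
            kept.append(index)
    return kept


# Toy usage with hand-picked weights (hypothetical values).
stages = [
    (make_mlp([[1.0, 0.0]], [1.0]), 0.5),
    (make_mlp([[0.0, 1.0]], [1.0]), 1.5),
]
candidates = [[1.0, 1.0], [0.2, 5.0], [1.0, 0.1]]
survivors = cascade_filter(candidates, stages)
```

In this sketch, early rejection is what saves computation: a candidate discarded at the first stage never reaches the later weak classifiers or the detection head. Candidates that survive the cascade despite being negatives would be the natural hard examples to train on.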
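|Tong Bei (童贝). Research on Pedestrian Detection Combining Deep Learning and Traditional Methods (基于深度学习和传统方法相结合的行人检测研究) [D]. Beijing: Graduate University of Chinese Academy of Sciences, 2017.|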
|Files in This Item:|
|template.pdf (22597KB)||Dissertation||Not currently open access||CC BY-NC-SA||Application Full Text|