Structural Models for Object Detection

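Title	Structural Models for Object Detection (基于结构模型的物体检测)
Alternative title	Structural models for object detection
Author	Yan Junjie (闫俊杰)
Date	2015-05-30
Degree type	Doctor of Engineering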
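Chinese abstract	Object detection is the computer vision problem of determining "what objects are where" in an image or video. It has long been regarded as a core problem of high-level semantic analysis in computer vision and as the foundation for many other applications, such as image search, face recognition, object tracking, and action recognition. Research on object detection has also greatly advanced mid-level and low-level computer vision techniques. Object detection is a highly challenging research topic because of the information lost when three-dimensional objects in the real world are projected onto images or video, the systematic and random errors introduced by sensors, and the appearance variations caused by object category, viewpoint, deformation, illumination, occlusion, and other factors. At the same time, advances in data acquisition and storage, computing resources, and machine learning algorithms have provided many methods and opportunities for object detection. The deformable part model is one of the representative algorithms in object detection. It represents an object by connecting a global template and deformable part templates in a tree structure, modeling appearance variation both globally and locally. The deformable part model has greatly advanced the field, and many methods derived from it have achieved large performance gains on object detection benchmark datasets. Building on the deformable part model, this thesis extends and improves it in four aspects: model representation, model learning, model inference, and post-processing of detection results. In addition, combining structural learning with deep learning, this thesis proposes an object detection algorithm based on superpixel labeling. The main contributions are as follows. (1) For object representation, the original parametric model is extended to a combination of non-parametric and parametric models in order to handle larger object deformations. Specifically, a deformable part model based on appearance regression is proposed, together with a cascaded enhancement of different appearance-regression deformable part models. (2) For learning, a multi-task deformable part model is proposed to handle samples from different distributions in detection. Specifically, a multi-task multi-resolution model is proposed for the first time; it addresses the multi-resolution problem in object detection by jointly learning resolution-dependent feature transformation matrices and a resolution-independent shared classifier. (3) For inference, the speed bottlenecks of different inference algorithms for the deformable part model are analyzed, and inference is greatly accelerated in three ways. Specifically, discriminative low-rank convolution kernel learning, a neighborhood-sharing cascade, and lookup-table-based fast computation of histogram of oriented gradients features are proposed. (4) For post-processing of detection results, the context of the whole image is modeled to obtain detections that are more consistent with the scene. Specifically, appearance and the spatial relations among different objects in crowded scenes are modeled so that occluded objects can be inferred effectively. (5) A new object detection algorithm based on superpixel labeling is proposed. On top of appearance features obtained by deep learning, the class and object instance to which each superpixel belongs are obtained by inference on an energy function that describes factors such as superpixel appearance and the spatial relations among superpixels, which in turn yields the detection results. Compared with traditional methods, this approach offers more flexible object candidate regions and naturally exploits global image information. In addition to detecting objects, the method can also output object segmentation results. Through these five aspects, this thesis advances generic object detection, pedestrian detection, face detection, facial landmark localization, and object part localization. For generic object detection, combined with deep convolutional networks, it surpasses the performance of the GoogLeNet detection system recently developed by Google on the ImageNet generic object detection task. For facial landmark localization and object part localization, it won the 300-W facial landmark localization challenge and achieved leading performance on LSP human pose estimation. For pedestrian detection, on the Caltech pedestrian dataset, compared with ...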
English abstract	Object detection is the computer vision task of finding "which objects are where" in an image or video. It has long been a central problem in high-level computer vision and serves as the basis for other problems such as image search, face recognition, tracking, and action recognition. Research on object detection also advances low-level and mid-level computer vision, such as feature representation. Object detection is very challenging because of the information lost when a 3D object in the real world is projected onto a 2D image or video, the systematic and random errors introduced by the sensor, and the appearance variations caused by category, pose, deformation, illumination, and occlusion. Meanwhile, improvements in data, computing infrastructure, and machine learning provide many opportunities for object detection. The deformable part model has been a very popular structural model in object detection. It uses a star model to connect a root template and deformable part templates to represent the object. The deformable part model has substantially advanced object detection, and large performance gains have been achieved by methods built on it. In this thesis, we improve the deformable part model in representation, learning, inference, and post-processing. Additionally, we propose a superpixel labeling based method for object detection, which connects deep learning and structural learning. The contributions of the thesis are as follows. 1. For representation, we extend the parametric model to a joint parametric and non-parametric model in order to capture the large deformations of real-world objects. Specifically, we propose a shape regression based deformable part model and its stacked version. 2. For parameter learning, we propose a multi-task deformable part model in order to handle samples from different distributions. Specifically, we jointly learn a distribution-aware feature transform and a shared detector in the distribution-invariant space to handle samples of different resolutions. 3. For inference, we identify three bottlenecks and significantly accelerate each of them. Specifically, we propose discriminative low-rank filter learning, a neighborhood-aware cascade, and lookup-table-based HOG feature computation. 4. For post-processing, we use the global image context in order to find the detection hypothesis most consistent with the scene. Specifically, we propose to model the spat...
Keywords	Object Detection; Pedestrian Detection; Face Detection and Landmark Localization; Deformable Part Model
Language	Chinese
Document type	Thesis
Item identifier	http://ir.ia.ac.cn/handle/173211/6735
Collection	Graduates_Doctoral Dissertations
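Recommended citation (GB/T 7714)	Yan Junjie (闫俊杰). Structural Models for Object Detection (基于结构模型的物体检测) [D]. Institute of Automation, Chinese Academy of Sciences. University of Chinese Academy of Sciences, 2015.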
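
The abstract describes the deformable part model as a star model that connects a root template to deformable part templates. The sketch below illustrates how such a model is commonly scored at one root location: the root response plus, for each part, the best part response near its anchor minus a deformation cost. It is a simplified illustration and not the thesis's implementation. The quadratic deformation cost, the brute-force search over a small window, the single-resolution response maps, and all names (`best_part_placement`, `star_model_score`, `radius`) are assumptions made for this example.

```python
from typing import List, Sequence, Tuple

Grid = List[List[float]]  # response map indexed as grid[y][x]


def best_part_placement(part_resp: Grid, anchor: Tuple[int, int],
                        deform_w: Tuple[float, float],
                        radius: int) -> Tuple[float, Tuple[int, int]]:
    """Best (filter response - deformation cost) within `radius` of an anchor."""
    ax, ay = anchor
    wx, wy = deform_w
    height, width = len(part_resp), len(part_resp[0])
    best_score, best_disp = float("-inf"), (0, 0)
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            x, y = ax + dx, ay + dy
            if not (0 <= x < width and 0 <= y < height):
                continue  # displaced part would fall outside the response map
            # Assumed quadratic penalty for moving the part away from its anchor.
            score = part_resp[y][x] - (wx * dx * dx + wy * dy * dy)
            if score > best_score:
                best_score, best_disp = score, (dx, dy)
    return best_score, best_disp


def star_model_score(root_resp: Grid, part_resps: Sequence[Grid],
                     offsets: Sequence[Tuple[int, int]],
                     deform_ws: Sequence[Tuple[float, float]],
                     root_xy: Tuple[int, int],
                     radius: int = 2) -> Tuple[float, List[Tuple[int, int]]]:
    """Score one root location: root response plus the best placement of each part."""
    rx, ry = root_xy
    total = root_resp[ry][rx]
    displacements = []
    for resp, (ox, oy), w in zip(part_resps, offsets, deform_ws):
        # Each part depends only on the root (star structure), so parts are
        # optimized independently of one another.
        score, disp = best_part_placement(resp, (rx + ox, ry + oy), w, radius)
        total += score
        displacements.append(disp)
    return total, displacements


# Toy example: one part whose strongest response lies one cell right of its anchor.
root = [[0.0] * 5 for _ in range(5)]
root[2][2] = 1.0
part = [[0.0] * 5 for _ in range(5)]
part[2][4] = 3.0
total, disps = star_model_score(root, [part], [(1, 0)], [(0.5, 0.5)], (2, 2))
# total is 3.5 (1.0 root + 3.0 part - 0.5 deformation); disps is [(1, 0)]
```

Because every part is attached only to the root, the maximization splits into one independent search per part, which is what keeps inference in a star model tractable. If a part's whole search window falls outside its response map, this sketch returns negative infinity for that root location.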