Object Tracking Based on Feature Learning and Model Ensemble

	Object Tracking Based on Feature Learning and Model Ensemble
	Zhu Guibo (朱贵波)
	2016-05-27
Degree type	Doctor of Engineering
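Abstract (translated from Chinese)	Object tracking is a classic problem in computer vision. It is an important bridge between object detection and high-level behavioral semantics in structuring video content, and it also provides an important basis for the environment perception and behavioral decision-making and control required by advanced artificial intelligence. Object tracking is therefore widely applied in real-world scenarios such as intelligent surveillance, video synopsis, human-computer interaction, and autonomous driving. An object tracking system comprises four modules: object initialization, appearance modeling, motion estimation, and object localization. Because the appearance of the object of interest is often disturbed by various factors (for example illumination variation, severe occlusion, deformation, fast motion, and background clutter), online object tracking is very challenging. Focusing on the main problems facing each module of the tracking system, this thesis studies the visual representation of object appearance, multi-part statistical modeling, collaborative learning of detector and tracker models inspired by prior knowledge, and decision support through online clustering analysis. It proposes a model and algorithm for each, and these effectively improve tracking performance. The main work and contributions are summarized as follows:
(1). Object tracking based on feature distillation. To address the conflict between the slow feature extraction of current deep convolutional networks and the high real-time requirements of object tracking, this thesis proposes a tracking method that combines model compression, feature learning, and scale prediction. In model compression, a teacher-student paradigm serves as the guiding principle: a large network supervises the training of a small network, so that at test time the small network can quickly approximate the intermediate-layer features of the large network and output them as the feature representation, which is then embedded in a correlation filter framework for tracking. The small network also introduces a shift-and-stitch structure to accelerate feature extraction. In addition, the added scale prediction module further improves tracking performance. Experiments on the public tracking datasets OTB50 and OTB100 show that, compared with the best current deep-network tracking algorithm, the method is slightly less accurate but more than 5 times faster.
(2). Object tracking based on part context learning. To exploit contextual structure during tracking effectively, this thesis selects representative parts with Exemplar-SVM, builds a part context structure learning framework over these parts, and optimizes it with a structured SVM. The context structure model combines multi-level appearance representations of spatio-temporal relations, prior knowledge, and motion consistency, overcoming the shortcomings of traditional methods in modeling visual object appearance. The model exploits the associations between levels (feature, part, and object), and it introduces a hierarchical context graph model to mine the latent relations between the object and its parts during tracking, including the role played by each part inside the object or in the context region. Experiments show that context structure relations substantially improve tracker performance, raising the success rate by four percentage points over the other best methods.
(3). Object tracking based on model collaboration. To address the drift of correlation filter trackers caused by long-term occlusion or by the target disappearing and reappearing, this thesis designs the MC-HOG feature, which fuses gradient and color information, and uses an online detection filter generated by random sampling to quickly re-detect over the whole image search region. The small number of relatively reliable candidate regions found in this way strengthens the tracker's robustness to drift. Experiments show a performance gain of five percentage points over other trackers.
(4). Object tracking based on online clustering. To resolve decision ambiguity during tracking, this thesis introduces online clustering and model fusion into the tracking system. Online clustering mines the latent group structure in the parameter space of the observation model and in the feature space of historical target appearances, and multiple weak hypotheses are fused into a strong classifier that predicts the state of the object of interest. Experiments on public datasets demonstrate that the algorithm is advanced and effective.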
Abstract (English)	Object tracking is a fundamental problem in computer vision. It is an important bridge between object detection and high-level behavioral semantics in structuring video content, and it also provides a significant basis for environment perception and decision-driven action control in advanced artificial intelligence. It therefore plays an important role in many applications, including intelligent monitoring, video synopsis, human-computer interaction, and autonomous driving. In general, a typical object tracking system consists of four components: object initialization, appearance modeling, motion estimation, and object localization. Handling complex changes in object appearance caused by factors such as illumination variation, heavy occlusion, deformation, fast motion, and background clutter remains challenging. Focusing on the main problems of each module in the object tracking system, the paper proposes corresponding models and tracking algorithms based on designing and learning visual representations of object appearance, statistical modeling of multiple parts, collaborative tracking with a detector and a tracker inspired by prior knowledge, and online clustering analysis to assist decisions. These solutions significantly improve tracking performance. The work and contributions are summarized as follows: (1). Feature distilled tracking. Extracting features with very deep models is too slow for real-time object tracking. To alleviate this problem, we propose a method that combines model compression, feature extraction, and scale estimation. In the process of model compression, we propose a novel method based on the teacher-student paradigm. 
Specifically, the paper proposes a small feature distilled network for visual tracking that imitates the intermediate representations of a much deeper network. The feature distilled network extracts rich visual features faster than the original deeper network. To speed it up further, the paper introduces a shift-and-stitch method that reduces the arithmetic operations while preserving the resolution of the distilled feature maps. Finally, a scale-adaptive discriminative correlation filter is learned on the distilled features to handle target appearance variation. Experiments on the public visual tracking benchmarks OTB-50 and OTB-100 show that, compared with the state-of-the-art deep tracking algorithm, the proposed method achieves comparable performance while running more than 5 times faster than the original network. (2). Part context learning for object tracking. Context information is widely used in computer vision for tracking arbitrary objects. The paper first uses Exemplar-SVM to select representative parts and proposes a unified part context learning framework that captures spatial-temporal relations, prior knowledge, and motion consistency, improving the tracker's performance by overcoming deficiencies in appearance modeling. First, the proposed part context tracker analyzes the interrelated information across the hierarchical layers of feature, part, and object. Second, by introducing a hierarchical context graph model, we explore the intrinsic relations between parts and object during tracking, including the role of each part in the object or the context region. Experiments indicate that context structure relations are important for tracking performance, with a gain of 4 percentage points over other trackers in success plots. (3). Collaborative correlation tracking. How to handle the model drift caused by long-term occlusion or out-of-view targets is still an open problem. 
The paper proposes a collaborative correlation tracker to deal with this problem. First, the paper designs the MC-HOG feature, which combines gradient and color information. Then a novel long-term detection filter is learned efficiently with random sampling; it alleviates model drift by detecting reliable object candidates for the collaborative tracker. In this way, the proposed approach estimates the object state accurately by handling model drift effectively. Experimental results show that the proposed method improves performance by about five percentage points over other trackers. (4). Clustering ensemble correlation tracking. A key problem in visual tracking is how to resolve the decision ambiguities of target appearance under online model update. The paper addresses this problem by incorporating sequential clustering and ensemble methods into the tracking system. Clustering is used to mine the latent group structure in the parameter space and in the feature space of historical target appearances. The paper then fuses multiple weak hypotheses into a strong ensemble learner for effective object tracking. Extensive experiments show that the proposed method alleviates model drift by exploring group structure with online clustering and improves tracking performance.
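Several of the contributions above build on a correlation filter framework, which the abstract names but does not formulate. As a rough illustration only, the sketch below shows a generic single-channel, 1-D correlation filter learned in closed form in the Fourier domain (a MOSSE-style ridge regression). It is not the thesis's tracker: the function names (`dft`, `idft`, `gaussian_label`, `train_filter`, `detect`), the toy signal, the Gaussian label width, and the regularization value `lam` are all assumptions made for this example, and a naive DFT stands in for a real FFT so that the code needs only the standard library.

```python
import cmath
import math


def dft(x):
    """Naive O(n^2) discrete Fourier transform of a sequence."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]


def idft(X):
    """Inverse of dft(); returns a list of complex values."""
    n = len(X)
    return [sum(X[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)) / n
            for t in range(n)]


def gaussian_label(n, center, sigma=1.5):
    """Desired filter response: a circular Gaussian peak at `center`."""
    out = []
    for t in range(n):
        d = min(abs(t - center), n - abs(t - center))  # circular distance
        out.append(math.exp(-0.5 * (d / sigma) ** 2))
    return out


def train_filter(x, y, lam=1e-2):
    """Closed-form ridge regression in the Fourier domain.

    Returns the conjugate filter H* = Y . conj(X) / (X . conj(X) + lam),
    computed element-wise; `lam` is an assumed regularization weight.
    """
    X, Y = dft(x), dft(y)
    return [Yk * Xk.conjugate() / (Xk * Xk.conjugate() + lam)
            for Xk, Yk in zip(X, Y)]


def detect(h_conj, z):
    """Apply the filter to a new signal and return the index of the peak response."""
    Z = dft(z)
    response = [v.real for v in idft([Hk * Zk for Hk, Zk in zip(h_conj, Z)])]
    return max(range(len(response)), key=response.__getitem__)


# Toy example: a short "target" pattern centred at index 11 on an empty background.
n, center = 32, 11
x = [0.0] * n
x[10:13] = [1.0, -0.6, 0.3]
h_conj = train_filter(x, gaussian_label(n, center))

shifted = x[-5:] + x[:-5]  # the same pattern moved 5 samples to the right (circularly)
print(detect(h_conj, x), detect(h_conj, shifted))  # prints "11 16"
```

The response peak follows the pattern when it moves, which is the property a correlation filter tracker relies on. A practical tracker would work on 2-D multi-channel features with an FFT, update the filter online, and add scale estimation, none of which this sketch attempts.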
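Keywords	object tracking; part context model; correlation filter; online clustering; collaborative tracking
Language	Other
Document type	Thesis
Item identifier	http://ir.ia.ac.cn/handle/173211/11754
Collection	Graduates_Doctoral Dissertations
Author affiliation	Institute of Automation, Chinese Academy of Sciences
First author affiliation	Institute of Automation, Chinese Academy of Sciences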
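Recommended citation (GB/T 7714)	Zhu Guibo. Object Tracking Based on Feature Learning and Model Ensemble [D]. Beijing: University of Chinese Academy of Sciences, 2016.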

Files in this item
File name/size	Document type	Version type	Access type	License
基于特征学习和模型集成的目标跟踪.pdf (4159KB)	Thesis		Restricted access	CC BY-NC-SA