Graph-based Semi-supervised Learning for Online Visual Tracking (基于图学习的半监督在线视觉跟踪研究)


Title	基于图学习的半监督在线视觉跟踪研究
Alternative title	Graph-based Semi-supervised Learning for Online Visual Tracking
Author	Gao Jin (高晋)
Date	2015-05-28
Degree type	Doctor of Engineering
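Abstract (translated from Chinese)	Visual tracking is a fundamental research topic in computer vision, with broad potential applications in human-computer interaction, visual navigation, autonomous surveillance, video analysis, augmented reality and other areas. Unlike model-based visual tracking built on detectors for specific targets (such as pedestrians or vehicles), model-free online visual tracking can track arbitrary targets given only an annotation of the target's initial position, by adapting to changes in the target's appearance. With the recent rise of glasses-free 3D technology, intelligent UAV technology and other human-computer interaction technologies, industry demand for model-free online visual tracking has become increasingly urgent. However, this type of tracking still faces major challenges, mainly because it must update the appearance model online to accommodate the various changes in target appearance (for example, illumination changes, occlusion and pose deformation); in particular, when the target's appearance changes drastically, the tracker can easily lose the target. This thesis draws on the manifold assumption and the cluster assumption from graph-based semi-supervised learning theory and, building on the progress that current mainstream online tracking algorithms have made on the tracking drift caused by appearance changes, proposes to apply graph-based classification or regression learning methods to online visual tracking by introducing ideas from semi-supervised learning and transfer learning. The approach achieves good results in mitigating tracking drift and effectively improves the robustness of the tracker. The main work and contributions are: (1) A semi-supervised graph embedding learning algorithm based on sparse representation is proposed and applied to online visual tracking. Specifically: first, we use covariance matrix features to improve the feature description of graph vertices, which improves tracking robustness while avoiding the curse of dimensionality in graph embedding learning; second, in view of the distribution of the labeled training samples collected in discriminative tracking, we design dedicated graph structures that preserve local structure and enhance the discriminability between samples of different classes, for use in discriminative graph embedding learning; third, we construct, based on sparse representation, an adjacency graph containing the vertices of all labeled and unlabeled samples to capture higher-order similarity relationships among all samples, and propose a new semi-supervised regularization term under the cluster assumption; finally, we give a nonlinear extension of the proposed method to handle sample distributions that are multimodal and not linearly separable. We validate these contributions for single-target tracking on several challenging video sequences. We also extend the method to online tracking of multiple interacting targets, where it achieves better results than several other online multi-target tracking algorithms. (2) A tensorized graph embedding discriminative online single-target tracking algorithm based on semi-supervised boosting is proposed. We introduce a tensor representation of sample vertices into discriminative graph embedding learning; it preserves as much as possible of the original multi-dimensional structure of image samples and, during discriminative learning, automatically extracts discriminative features that are later used to select the "best of the best" target samples among the unlabeled samples. Considering that in tensorized discriminative graph embedding learning the discriminative information residing in the multi-dimensional structure cannot be fully exploited, and that a semi-supervised extension cannot be obtained simply by adding a regularization term, we propose a classifier ensemble method based on semi-supervised boosting and introduce the idea of transfer learning into it. The ensemble classifier obtained after semi-supervised boosting not only compensates for the insufficient use of discriminative information but can also, within the transfer learning framework, use information from an auxiliary sample set to capture discriminative information about target appearance changes in earlier frames, thereby avoiding as far as possible the tracking drift caused by appearance changes. We validate the effectiveness of our algorithm on currently popular visual tracking benchmarks and experimentally analyze each part's ...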
Abstract (English)	Visual tracking is a fundamental problem in computer vision with a wide range of applications, such as human-computer interaction, visual odometry, automated surveillance, video analysis and augmented reality. Unlike model-based tracking approaches for specific objects (e.g., people, vehicles), model-free online visual tracking can track arbitrary objects without any prior knowledge about them, relying on just a single annotation of each target object and adapting to its appearance variations. With the rise of glasses-free 3D technology, intelligent UAV technology and other human-computer interaction technologies, industry demand for robust model-free tracking is becoming more urgent. However, this kind of tracking remains challenging because it must adaptively update the appearance model on the fly to account for variations in object appearance (e.g., illumination, occlusion, pose deformation), and it is prone to losing the object when its appearance changes drastically, resulting in drift. In this thesis, we survey recent progress in the model-free online visual tracking literature on alleviating drift, and propose to apply graph-based learning methods (classification/regression) to online visual tracking by introducing semi-supervised learning and transfer learning techniques, building on the manifold assumption and the cluster assumption from graph-based semi-supervised learning theory. The proposed methods alleviate drift, and their effectiveness is validated experimentally. The main contributions of this thesis include: (1) We propose a new sparse-representation-based semi-supervised graph embedding learning method and apply it to online visual tracking.
Specifically, we first exploit the covariance matrix descriptor to obtain a new representation of the graph vertices, so as to avoid the curse of dimensionality in graph embedding and improve tracking robustness. Second, we design two graphs that characterize both the intrinsic local geometrical structure and the separability of the labeled training samples for graph embedding, according to their distribution. Third, we construct an adjacency graph over the vertices of all the labeled and unlabeled samples based on sparse representation theory, so as to explore higher-order relationships among all the training sa...
Keywords	Online Visual Tracking; Graph Embedding Learning; Semi-supervised Learning; Transfer Learning; Tensorised Graph Embedding; Gaussian Processes Regression; Semiboost
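Language	Chinese
Document type	Thesis
Item identifier	http://ir.ia.ac.cn/handle/173211/6719
Collection	Graduates_Doctoral Dissertations
Recommended citation (GB/T 7714)	Gao Jin. Graph-based Semi-supervised Learning for Online Visual Tracking [D]. Institute of Automation, Chinese Academy of Sciences. University of Chinese Academy of Sciences, 2015.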

Files in this item
File name/size	Document type	Version type	Access type	License
CASIA_20121801462803 (9787 KB)			Not yet open	CC BY-NC-SA