English abstract | Visual tracking is a fundamental problem in computer vision, with applications including human-computer interaction, visual odometry, automated surveillance, video analysis and augmented reality. Unlike model-based tracking approaches designed for specific object classes (e.g., people, vehicles), model-free online visual tracking can follow arbitrary objects without any prior knowledge about them, relying on a single annotation of each target object and adapting to its appearance variations. With the rise of glasses-free 3D technology, intelligent UAV technology and other human-computer interaction technologies, industry demand for robust model-free tracking is becoming more urgent. However, model-free tracking remains challenging: the tracker must update its appearance model on the fly to account for object appearance variations (e.g., illumination changes, occlusion, pose deformation), and it tends to lose the target when its appearance changes drastically, which leads to drift. In this thesis, we survey recent progress in the model-free online visual tracking literature on alleviating drift, and we propose applying graph-based learning methods (classification and regression) to online visual tracking by introducing semi-supervised learning and transfer learning techniques, building on the manifold assumption and the cluster assumption of graph-based semi-supervised learning theory. The proposed methods alleviate drift, and their effectiveness is validated experimentally. The main contributions of this thesis include: (1) We propose a new sparse-representation-based semi-supervised graph embedding learning method and apply it to online visual tracking. 
Specifically, we first use the covariance matrix descriptor as a new representation of each graph vertex, which avoids the curse of dimensionality in graph embedding and improves tracking robustness. Second, we design two graphs for graph embedding that characterize both the intrinsic local geometric structure and the separability of the labeled training samples, according to their distribution. Third, we construct an adjacency graph over the vertices of all labeled and unlabeled samples based on sparse representation theory, so as to explore higher-order relationships among all the training sa... |
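As a rough illustration of the covariance matrix descriptor mentioned in the abstract, the following minimal Python sketch computes a region covariance matrix for a grayscale image patch. The abstract does not state which per-pixel features the thesis uses, so the feature set here (pixel coordinates, intensity, absolute gradients) and the function name `region_covariance` are assumptions chosen only for illustration.

```python
import numpy as np

def region_covariance(patch):
    """Return the d x d covariance of per-pixel features over a grayscale patch.

    Assumed features (hypothetical, not taken from the thesis):
    x, y, intensity, |dI/dx|, |dI/dy|, so d = 5.
    """
    patch = np.asarray(patch, dtype=np.float64)
    h, w = patch.shape
    # Row (y) and column (x) coordinates of every pixel.
    ys, xs = np.mgrid[0:h, 0:w]
    # np.gradient returns derivatives along axis 0 (y) first, then axis 1 (x).
    gy, gx = np.gradient(patch)
    # One 5-dimensional feature vector per pixel, flattened to shape (h*w, 5).
    feats = np.stack([xs, ys, patch, np.abs(gx), np.abs(gy)], axis=-1).reshape(-1, 5)
    # Columns are variables, rows are observations.
    return np.cov(feats, rowvar=False)

# Usage on a synthetic patch (random data, for demonstration only).
rng = np.random.default_rng(0)
patch = rng.random((32, 24))
C = region_covariance(patch)
```

Under these assumptions the descriptor is a 5 x 5 symmetric matrix whatever the patch size, which is the sense in which such a representation keeps the vertex dimensionality low.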