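Research on Object Tracking Methods Based on RGB-D Data (基于RGB-D数据的目标跟踪方法研究)
An Ning (安宁) 1,2
Degree type: Doctor of Engineering
Supervisor: Hou Zengguang (侯增广)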
2017-05
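Degree-granting institution: University of Chinese Academy of Sciences
Place of conferral: Beijing
Keywords: object tracking; RGB-D data; bio-inspired vision; data fusion; data-driven control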
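Abstract
As one of the research hotspots in computer vision, object tracking has wide applications in intelligent surveillance, autonomous driving, human-computer interaction, and augmented reality. After more than a decade of research, object tracking technology has made considerable progress. However, because of interfering factors such as illumination variation, severe occlusion, background clutter, and large deformation, robust object tracking in complex scenes remains a highly challenging task. In recent years, RGB-D sensor technology has developed rapidly, and the three-dimensional information contained in the data it captures can significantly reduce the ambiguity of traditional color image information. This dissertation uses RGB-D data to overcome the limitations of traditional tracking algorithms and studies object tracking algorithms based on RGB-D data.
An object tracking algorithm consists of four basic units: representation model, model update, target search, and error estimation. The representation model unit describes the tracked target; the model update unit updates the representation model; the target search unit searches for the target that best matches the representation model; and the error estimation unit evaluates the validity of the search result. During tracking, these four units are closely coupled and work together. The research on RGB-D object tracking in this dissertation addresses these four units in turn, and the effectiveness of the algorithms is verified and analyzed in depth on the Princeton Tracking Benchmark, the most influential large-scale public dataset for RGB-D object tracking. The main work and contributions are as follows: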
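(1) An RGB-D object tracking algorithm based on detection-learning-segmentation is proposed to solve the error estimation problem in object tracking. The algorithm decomposes tracking into three parts (detection, learning, and segmentation): the detector locates the target using a 2D appearance model based on kernelized correlation filters; the segmenter locates the target using a 3D distribution model based on an adaptive depth histogram; the learner estimates the localization errors of the detector and the segmenter by identifying potential failures and occlusion distractors, and updates the target model online. Experimental results demonstrate the effectiveness of the algorithm's innovations, and its performance is superior to that of other representative RGB-D object tracking algorithms.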
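(2) An RGB-D object tracking algorithm based on the memory model of cognitive psychology is proposed to solve the model update problem in object tracking. The algorithm transfers the memory mechanisms of human cognitive psychology to model update and decomposes the representation model into a sensory memory model, a short-term memory model, and a long-term memory model: the sensory memory model acquires scene information through stereo perception and an attention mechanism; the short-term memory model is highly plastic and is updated online through a memory rehearsal mechanism; the long-term memory model is highly stable and is updated online through memory encoding and retrieval mechanisms. During tracking, these models are updated cooperatively according to changes in the target's appearance, which resolves the stability-plasticity dilemma in model update. Experimental results demonstrate the effectiveness of the algorithm's innovations, and its performance is superior to that of other representative RGB-D object tracking algorithms.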
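(3) An RGB-D object tracking algorithm based on 3D multi-tracker fusion is proposed to solve the representation model construction problem in object tracking. The algorithm uses energy function optimization to fuse, in 3D space, multiple base trackers that each have a single representation model. It measures the 3D cuboid attraction and the 3D trajectory smoothness of the fusion result so that the strengths of different models can be exploited in different scenes. In addition, three base trackers with distinct representation models are designed to produce complementary tracking results within the fusion mechanism. Experimental results demonstrate the effectiveness of the algorithm's innovations, and its performance is superior to that of other representative RGB-D object tracking algorithms.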
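(4) An RGB-D object tracking algorithm based on 3D similar-object discrimination is proposed to solve the target search problem in object tracking. The algorithm converts search and localization based on target-background discrimination into finer-grained search and localization based on discrimination between the target and similar objects. First, a 3D similar-object sampling method quickly obtains candidate samples, and distractors with similar appearance are filtered out during sampling. Then a 3D similar-object discrimination method classifies the candidate samples as target or similar object, filtering out background interference while ensuring that scene information is fully used. The 3D similar-object metric and the 3D discriminative model are updated online so that the algorithm adapts to changes in the target. Experimental results demonstrate the effectiveness of the algorithm's innovations, and its performance is superior to that of other representative RGB-D object tracking algorithms.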
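(5) A robot object tracking system based on RGB-D data is proposed, and two key problems in it are studied: RGB-D object tracking and object tracking control. An online RGB-D object re-identification algorithm based on environmental context information is proposed. The algorithm adaptively adjusts the weights of the target sub-models online according to the real-time discriminability of the environmental context, to ensure its effectiveness in different scenes. An object tracking control algorithm based on end-to-end Gaussian process regression learning is proposed. The algorithm uses a non-parametric machine learning method to learn human tracking control behavior end to end, giving the robot a human-like, intelligent way of controlling object tracking. All proposed algorithms are implemented in a real robot object tracking system, and their robustness and effectiveness have been verified in different practical application scenarios.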
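Contribution (1) mentions locating the target with a depth-histogram-based 3D distribution model. The abstract does not give the details, so the following is only a minimal sketch of the general idea, not the dissertation's adaptive method: it histograms the depth readings inside a bounding box and keeps the pixels in the populated depth mode nearest to the target's previous depth, so that a nearer occluder falls outside the mask. The function name `segment_by_depth`, the millimetre units, the convention that 0 means "no reading", and the parameters `bin_mm` and `min_count` are all assumptions made for illustration.

```python
from collections import Counter


def segment_by_depth(depths_mm, prev_depth_mm, bin_mm=100, min_count=2):
    """Return (mask, mean_depth) for the depth mode nearest prev_depth_mm.

    depths_mm: flat list of integer depth readings inside the bounding box.
    A reading of 0 is treated as missing (an assumption, not from the source).
    """
    # Histogram the valid readings into fixed-width depth bins.
    hist = Counter(d // bin_mm for d in depths_mm if d > 0)
    candidates = [b for b, n in hist.items() if n >= min_count]
    if not candidates:
        # Nothing reliable to segment: keep the previous depth estimate.
        return [False] * len(depths_mm), prev_depth_mm

    prev_bin = prev_depth_mm // bin_mm
    # Prefer the populated bin closest to the previous target depth;
    # break ties in favour of the bin with more pixels.
    best = min(candidates, key=lambda b: (abs(b - prev_bin), -hist[b]))

    # Keep pixels in the chosen bin or its immediate neighbours.
    mask = [d > 0 and abs(d // bin_mm - best) <= 1 for d in depths_mm]
    selected = [d for d, keep in zip(depths_mm, mask) if keep]
    return mask, sum(selected) / len(selected)


# Target near 1.5 m, an occluder at 0.8 m, and one missing reading.
box = [1500] * 6 + [1520] * 2 + [800] * 3 + [0]
mask, depth = segment_by_depth(box, prev_depth_mm=1480)
```

In this toy input the eight readings around 1.5 m are selected and the occluder and the missing reading are rejected. A fixed bin width is the simplest choice here; an adaptive histogram, as the abstract describes, would presumably adjust the binning to the target.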
Other abstract
Visual object tracking is an important problem in computer vision with many applications, including intelligent surveillance, autonomous driving, human-computer interaction, and augmented reality. Despite significant progress in the last decades, robust visual object tracking in complex scenarios remains challenging because of illumination variation, severe occlusion, background clutter, and large deformation. RGB-D sensor technology has developed rapidly in recent years, and the three-dimensional information in RGB-D data can significantly reduce the ambiguity of traditional color images. In this dissertation we therefore take advantage of RGB-D data to overcome the limitations of traditional tracking algorithms.
A visual object tracking algorithm can be decomposed into four basic components: representation model, model update, target search, and error estimation. The representation model component describes the target; the model update component updates the representation model; the target search component finds the target by matching against the representation model; and the error estimation component evaluates the tracking results. During tracking, the four components work cooperatively. Our research on RGB-D tracking addresses each of the four components in turn. We evaluate the proposed algorithms on the Princeton Tracking Benchmark, the most influential large-scale RGB-D tracking dataset. The main work and contributions are as follows:
(1) A detection-learning-segmentation based RGB-D tracking algorithm is proposed to solve the error estimation problem. The algorithm decomposes the tracking task into detection, learning, and segmentation: detection locates the target with a 2D appearance model based on kernelized correlation filters; segmentation locates the target with a 3D distribution model based on an adaptive depth histogram; learning estimates the localization errors of detection and segmentation by judging potential failures and occlusions, and updates the target models from the most confident frames. Experimental results demonstrate the effectiveness of the proposed tracker and show its superior performance against other state-of-the-art RGB-D trackers.
(2) An RGB-D tracking algorithm based on the memory model of cognitive psychology is proposed to solve the model update problem. The algorithm transfers the memory mechanisms of cognitive psychology to model update and decomposes the representation model into a sensory memory register, a short-term memory model, and a long-term memory model: the sensory memory register collects scene information through 3D perception and memory attention; the highly plastic short-term memory model is updated by memory rehearsal; the highly stable long-term memory model is updated by memory encoding and retrieval. During tracking, the three memory models are updated according to changes in the target's appearance to resolve the stability–plasticity dilemma. Experimental results demonstrate the effectiveness of the proposed tracker and show its superior performance against other state-of-the-art RGB-D trackers.
(3) An RGB-D tracking algorithm based on 3D tracker-level fusion is proposed to solve the problem of building the representation model. The algorithm fuses base trackers that have simple representation models in 3D space by energy function optimization. Both 3D cube attraction and 3D trajectory smoothness are measured so that the strengths of different trackers can be exploited in different scenarios. In addition, three base RGB-D trackers with intrinsically different tracking components are proposed so that the fusion algorithm achieves complementary performance. Experimental results demonstrate the effectiveness of the proposed tracker and show its superior performance against other state-of-the-art RGB-D trackers.
(4) An RGB-D tracking algorithm based on 3D instance-specific proposal discrimination is proposed to solve the target search problem. Instead of searching by traditional target-background discrimination, the algorithm searches for the target by more precise target-proposal discrimination. The search candidates are selected by 3D instance-specific proposal sampling, in which distractors with similar appearance are filtered out. The candidates are then evaluated by target-proposal discrimination, in which background noise is filtered out. The 3D proposal metric and the 3D target model are updated online to adapt to appearance changes. Experimental results demonstrate the effectiveness of the proposed tracker and show its superior performance against other state-of-the-art RGB-D trackers.
(5) An RGB-D data based robot tracking system is proposed, and two key problems of the system, namely RGB-D target tracking and target tracking control, are addressed. A context-based RGB-D object re-identification algorithm is proposed, in which the weights of the target sub-models are adjusted according to the discriminability of the context in the environment. An end-to-end tracking control algorithm based on Gaussian process regression is proposed, which transfers highly intelligent human control experience to the robot by non-parametric machine learning. The two proposed algorithms are implemented in a real robot tracking system, and their robustness and effectiveness are demonstrated in various scenarios.
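Contribution (2) describes a highly plastic short-term model and a highly stable long-term model that are updated together to handle the stability-plasticity dilemma. As a rough illustration of that trade-off only, the sketch below keeps two feature templates and blends new observations into them at different rates. It substitutes plain exponential moving averages for the memory rehearsal, encoding, and retrieval mechanisms named in the abstract, which are not specified there. The names `blend` and `update_memories` and the parameters `fast_rate`, `slow_rate`, and `min_confidence` are hypothetical.

```python
def blend(old, new, rate):
    """Exponential moving average of two equal-length feature vectors."""
    return [(1.0 - rate) * o + rate * n for o, n in zip(old, new)]


def update_memories(short_term, long_term, observation, confidence,
                    fast_rate=0.5, slow_rate=0.05, min_confidence=0.6):
    """Update a plastic and a stable template from one observation.

    All rates and the confidence threshold are illustrative assumptions.
    """
    # The plastic template follows every new observation closely.
    short_term = blend(short_term, observation, fast_rate)
    # The stable template changes slowly, and only when the tracker is
    # confident, so occlusion or drift does not overwrite it.
    if confidence >= min_confidence:
        long_term = blend(long_term, observation, slow_rate)
    return short_term, long_term


# One confident update and one unconfident update from the same start.
short, long_ = update_memories([0.0, 1.0], [0.0, 1.0], [1.0, 0.0], confidence=0.9)
short_low, long_low = update_memories([0.0, 1.0], [0.0, 1.0], [1.0, 0.0], confidence=0.2)
```

With the confident observation the short-term template moves halfway toward the new appearance while the long-term template moves only slightly; with the unconfident one the long-term template is left untouched.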
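Document type: Dissertation
Item identifier: http://ir.ia.ac.cn/handle/173211/14851
Collection: Graduates_Doctoral Dissertations
Author affiliations: 1. Institute of Automation, Chinese Academy of Sciences
2. University of Chinese Academy of Sciences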
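Recommended citation
GB/T 7714
An Ning. Research on Object Tracking Methods Based on RGB-D Data (基于RGB-D数据的目标跟踪方法研究)[D]. Beijing: University of Chinese Academy of Sciences, 2017.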
Files in this item
File name/size: 博士论文-基于RGB-D数据的目标跟踪方 (24890KB)
Document type: Dissertation
Access type: Not currently open
License: CC BY-NC-SA