Monocular 3D head tracking is a core technique for designing intelligent human-computer interfaces. Despite a decade of progress, long-term tracking in complex environments remains a challenging problem. In this thesis, we investigate this problem by presenting two alternative frameworks and explore their potential applications in human-computer interaction. The first framework is a robust implementation of the differential tracking approach that uses a 3D ellipsoid for geometric reasoning. It recursively estimates the head pose over time from a motion prior and dynamically updates a template model built beforehand. This makes it robust to appearance changes and leads to smooth estimates. However, these design choices also bring two severe problems: the system can only handle slowly moving targets, and the continual model updating makes it prone to drift; together, these make tracking over long periods of time impossible. To overcome these limitations, the second part of this thesis turns to a novel tracking-by-detection approach. In contrast to the first framework, it requires an offline modeling and learning procedure, but performs tracking without a motion prior or dynamic updating. Tracking is performed by matching features detected in the input images against reference features via a novel multi-view learning scheme. The learning relies on face texture synthesis to produce training examples, stable class detection, and multi-view selection, all executed within a simple head-modeling system. Extensive experiments show that this prevents drift while successfully tracking natural head motions. To further improve performance, we also integrate optical-flow correspondences to enforce temporal consistency and incorporate a color prior to identify possible outlier features. Fusing all these components yields a system suitable for human-computer interaction. Lastly, we present two applications of the proposed 3D head tracking system.
The first estimates the user's gaze direction in the presence of natural head rotations. The second transfers facial expressions from a user to their online avatar. Integrating these two functions into a virtual collaborative system greatly improves communication between two remote partners.
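To illustrate the trade-off described above, the following is a minimal, hypothetical 1-D sketch of the recursive estimate-then-update loop behind differential tracking, not the thesis implementation (which operates on a 3D ellipsoid model). It shows both properties the abstract notes: the small search window means only slow motion can be followed, and blending each matched patch back into the template is the mechanism that can accumulate drift.

```python
import numpy as np

def differential_track(frames, template, alpha=0.1):
    """Toy 1-D differential tracker (illustrative only).

    Each frame, search a small window around the previous pose for the
    offset that best aligns the template, then blend the matched patch
    back into the template (the "dynamic update" step).
    """
    pose = 0            # hypothetical "head pose" = horizontal offset
    poses = []
    for frame in frames:
        best, best_err = pose, np.inf
        # small search window around the prior => only slow motion is handled
        for d in range(pose - 3, pose + 4):
            if d < 0 or d + len(template) > len(frame):
                continue
            err = np.sum((frame[d:d + len(template)] - template) ** 2)
            if err < best_err:
                best, best_err = d, err
        pose = best
        # dynamic template update: adapts to appearance changes,
        # but repeated blending is exactly what makes drift possible
        template = (1 - alpha) * template + alpha * frame[pose:pose + len(template)]
        poses.append(pose)
    return poses
```

A slowly translating target stays inside the search window and is followed frame by frame; a fast one falls outside it and is lost, which is why the second framework drops the motion prior entirely.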