Research on Depth Image-Based Hand Gesture Interaction Techniques (基于深度图像的手势交互技术研究)

	基于深度图像的手势交互技术研究
Alternative title	Research on Depth Image-Based Hand Gesture Interaction Techniques
	秦树鑫
	2014-05-25
Degree type	Doctor of Engineering
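Chinese abstract	As computers become increasingly intelligent, human-computer interaction technology grows ever more important, and it is gradually shifting from people actively adapting to computers toward computers adapting to human needs. Gesture interaction is an intuitive, hand-centered form of human-computer interaction. As one of the most flexible parts of the human body, the hand can convey a wide variety of information. In recent years, vision-based gesture interaction has been widely applied in many fields, including daily life, entertainment, education, healthcare, and industrial production. The development of depth cameras has further broadened its range of applications, especially in education and entertainment, where natural interaction is more readily accepted. However, existing gesture interaction techniques also have some obvious shortcomings: for example, in open environments skin-color-based segmentation is easily affected by illumination and complex backgrounds, collecting and processing large-scale training data consumes considerable manpower and resources, and traditional implementations cannot meet real-time requirements. These shortcomings greatly limit the range of applications of the technology. Realizing a vision-based gesture interaction technique that simultaneously satisfies the requirements of naturalness, ease of use, extensibility, accuracy, and real-time performance is therefore a challenging problem. Starting from the perspective of natural gesture interaction, on the basis of a thorough understanding of the state of research in China and abroad, and taking current hardware and software trends into account, this dissertation presents an in-depth study of gesture interaction based on depth images. According to the way gestures convey information and the content they convey, it explores techniques for gesture action recognition, static gesture recognition, and 3D hand pose tracking. The specific research work includes the following aspects: (1) For gesture action recognition, a method for recognizing continuous 3D gesture actions based on one-shot learning is proposed. First, a real-time human detection and segmentation method is proposed that combines adaptive head template tracking with region growing. Second, building on the segmented human region, a gesture action representation is proposed that combines three-view motion history images with their corresponding pyramid histogram of oriented gradients vectors; for continuous action sequences, a gesture action extraction method is proposed, comprising a method for segmenting continuous actions and a strategy for extracting informative frames. Finally, one-shot-learning gesture action recognition is performed using image correlation and vector correlation. The method is suited to 3D gesture action recognition, requires no training data, is easy to extend, and achieves a higher recognition rate than traditional one-shot learning methods. (2) For static gesture recognition, a real-time static gesture recognition method based on a geometric representation is proposed. First, an accurate hand region is extracted from the depth image through multiple segmentation passes. Second, the traditional convex shape decomposition method is improved: a radius-function-based method for computing the Reeb graph is proposed, which speeds up convex shape decomposition. Finally, on the basis of the convex shape decomposition, a representation based on a 2D skeleton is proposed to describe the features of static gestures, and template matching is used for recognition. The method takes a single depth image as input, has clear advantages in both recognition speed and accuracy, requires no training data, and is easy to use and easy to extend. (3) For 3D hand pose tracking and interaction and the associated real-time requirements, a GPU-based continuous particle swarm optimization algorithm is proposed, achieving real-time tracking of the 3D hand pose. First, a representation of the 3D hand model is defined. Second, the traditional particle swarm optimization algorithm is improved: a continuous optimization strategy is added to the optimization process to perform continuous tracking, and particle resampling is introduced into the swarm, which improves the accuracy of optimization-based tracking. Third, the particle swarm optimization algorithm is implemented in CUDA, which speeds up the computation of particle fitness values and the updating, sampling, and other operations on the swarm. Finally, an OpenGL-based multi-viewport parallel rendering method and geometry instancing are used to speed up rendering of the 3D hand model...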
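For readers unfamiliar with the particle swarm optimization mentioned in part (3), the following is a minimal sketch of the textbook global-best variant in plain Python. It is not the dissertation's GPU-based continuous algorithm: there is no CUDA, no resampling, and no frame-to-frame tracking. The function name `pso_minimize`, its parameters, and the toy objective `sphere_to_target` are all hypothetical; in a hand tracker the objective would presumably measure the discrepancy between a rendered hand model and the observed depth image.

```python
import random


def pso_minimize(objective, dim, bounds, n_particles=20, n_iters=100,
                 inertia=0.7, c1=1.5, c2=1.5, seed=0):
    """Textbook global-best PSO; returns (best_position, best_value)."""
    rng = random.Random(seed)
    lo, hi = bounds
    # Random initial positions inside the bounds, zero initial velocities.
    pos = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    # Each particle remembers its own best position and value.
    best_pos = [p[:] for p in pos]
    best_val = [objective(p) for p in pos]
    g = min(range(n_particles), key=lambda i: best_val[i])
    gbest_pos, gbest_val = best_pos[g][:], best_val[g]

    for _ in range(n_iters):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # Velocity blends inertia, pull toward the particle's own
                # best, and pull toward the swarm's best.
                vel[i][d] = (inertia * vel[i][d]
                             + c1 * r1 * (best_pos[i][d] - pos[i][d])
                             + c2 * r2 * (gbest_pos[d] - pos[i][d]))
                # Move the particle and clamp it to the search bounds.
                pos[i][d] = min(hi, max(lo, pos[i][d] + vel[i][d]))
            val = objective(pos[i])
            if val < best_val[i]:
                best_pos[i], best_val[i] = pos[i][:], val
                if val < gbest_val:
                    gbest_pos, gbest_val = pos[i][:], val
    return gbest_pos, gbest_val


# Toy stand-in for a pose objective: squared distance to a known 3-D target.
target = [1.0, -2.0, 0.5]


def sphere_to_target(x):
    return sum((xi - ti) ** 2 for xi, ti in zip(x, target))


best, value = pso_minimize(sphere_to_target, dim=3, bounds=(-5.0, 5.0))
```

Each particle's fitness evaluation is independent of the others within an iteration, which is the property that makes this kind of algorithm a natural fit for the GPU parallelization the abstract describes.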
English abstract	As computers become more intelligent, human-computer interaction technology has become increasingly important. Gesture interaction is an intuitive, hand-centered technology. The human hand, one of the most flexible parts of the body, can express a wide variety of visual messages. Recently, vision-based gesture interaction has been widely used in daily life, entertainment, education, healthcare, and industry. The development of depth cameras further broadens the applications of gesture interaction, especially in education and entertainment, because natural interaction is more readily accepted. However, some deficiencies remain. First, color-based segmentation is easily affected by illumination and complex backgrounds. Second, collecting and processing large-scale training data costs considerable manpower and resources. Third, traditional implementations cannot meet the needs of real-time applications. It is therefore a challenging task to develop vision-based gesture interaction technologies that offer usability, extensibility, accuracy, and real-time performance. Based on a thorough understanding of the state of research and of current hardware and software trends, this dissertation studies depth image-based hand gesture interaction from the perspective of natural interaction. According to the mode and content of the information that gestures convey, hand gesture recognition, hand posture recognition, and 3D hand tracking are studied in depth. The main work of this dissertation is summarized as follows: (1) For hand gesture recognition, a 3D gesture recognition method based on one-shot learning is proposed that takes into account the spatio-temporal features of gestures. First, a template-based self-adaptive head tracking method combined with a region growing approach is presented for human detection and segmentation. Second, each gesture is represented by two features: three-view motion history images and their pyramid histogram of oriented gradients vectors. Third, an action selection method comprising action segmentation and informative frame selection is proposed for successive gestures. Finally, the proposed action selection and representation methods are combined for one-shot-learning gesture recognition, using the correlations of images and vectors. This method has a higher recognition rate, needs no training data and...
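As a rough illustration of the motion history images named in part (1), the sketch below applies the standard single-view update rule to tiny depth frames stored as nested lists. It is an assumption-laden toy, not the dissertation's three-view representation: the function name `update_mhi`, the decay step of 1, and the parameters `tau` and `diff_threshold` are hypothetical choices made here for clarity.

```python
def update_mhi(mhi, prev_frame, curr_frame, tau=3, diff_threshold=10):
    """Return the motion history image after observing curr_frame.

    Pixels whose depth changed by more than diff_threshold are reset to tau;
    every other pixel decays by 1, floored at 0.
    """
    new_mhi = []
    for mhi_row, prev_row, curr_row in zip(mhi, prev_frame, curr_frame):
        new_row = []
        for history, prev_px, curr_px in zip(mhi_row, prev_row, curr_row):
            if abs(curr_px - prev_px) > diff_threshold:
                new_row.append(tau)                   # fresh motion
            else:
                new_row.append(max(0, history - 1))   # older motion fades
        new_mhi.append(new_row)
    return new_mhi


# A 1x4 "depth image" in which a blob moves one pixel to the right per frame.
frames = [
    [[100, 0, 0, 0]],
    [[0, 100, 0, 0]],
    [[0, 0, 100, 0]],
]

mhi = [[0, 0, 0, 0]]
for prev_frame, curr_frame in zip(frames, frames[1:]):
    mhi = update_mhi(mhi, prev_frame, curr_frame)

print(mhi)  # [[2, 3, 3, 0]]
```

More recent motion carries larger values, so the single image encodes both where and roughly when movement occurred. A three-view variant would presumably build one such image for each of three projections of the depth data.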
Keywords	Depth Image; Gesture Recognition; Motion History Images; Convex Shape Decomposition; 3D Tracking; Particle Swarm Optimization; Parallel Computing
Language	Chinese
Document type	Thesis
Item identifier	http://ir.ia.ac.cn/handle/173211/6598
Collection	Graduates_Doctoral Dissertations
Recommended citation (GB/T 7714)	秦树鑫. 基于深度图像的手势交互技术研究[D]. 中国科学院自动化研究所. 中国科学院大学,2014.

Files in this item
File name/size	Document type	Version type	Access type	License
CASIA_20111801462908 (12943 KB)			Not yet open	CC BY-NC-SA