CASIA OpenIR > Graduates > Master's Theses
Marker-less 3D Human Pose Estimation for Interaction (面向交互的无标记三维人体姿态估计研究)
吴泽烨
Subtype: Master
Thesis Advisor: 车武军
Date: 2019-05-23
Degree Grantor: University of Chinese Academy of Sciences
Place of Conferral: Beijing
Degree Discipline: Computer Application Technology
Keyword: 3D Human Pose; Joint Rotation Angle; Deep Residual Network; Character Motion
Abstract

Human pose estimation is a fundamental technique for natural human-computer interaction: it lets people use their own cognitive and perceptual abilities to control a computer and interact naturally in a virtual world. It is also the technical basis of intelligent human-computer interaction, since a computer can analyze and recognize human poses to extract their semantic content and respond to the user more intelligently. However, most existing human motion capture systems are marker-based or wearable, which feels intrusive to users and hinders adoption by the general public; for natural interaction, marker-less human pose capture is the direction of development. This thesis therefore studies marker-less 3D human pose estimation methods for interaction, including recovering human joint rotation angles and estimating 3D human pose from image sequences, and on this basis develops an interactive control system for virtual characters.
The main contributions of this thesis are as follows:
1. A method for recovering human joint rotation angles from 3D joint positions.
Joint positions are a sparse representation of human pose and carry limited information. Many application scenarios need not only the joint positions themselves but also richer pose information such as joint rotation angles.
To address this, the thesis proposes a method that recovers joint rotation angles from 3D joint coordinates. The method first reconstructs human body models from a motion-capture dataset, uses these models to synthesize a dataset of corresponding joint positions and joint rotations under different subjects and motions, and then trains a residual network on this dataset to learn the implicit prior relating joint positions to joint rotation angles. The resulting network recovers natural joint rotation angles from sparse joint-position input, providing a way both to enrich the output of existing joint-position estimators and to fuse human pose datasets with different annotation formats.
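The position-to-rotation mapping in contribution 1 could be sketched as a small residual network. Below is a minimal, untrained forward pass in plain NumPy; the 17-joint skeleton, per-bone axis-angle output, and layer sizes are illustrative assumptions, not the thesis's actual configuration.

```python
import numpy as np

rng = np.random.default_rng(0)

N_JOINTS = 17          # assumed skeleton size (e.g. a Human3.6M-style layout)
IN_DIM = N_JOINTS * 3  # flattened 3D joint coordinates
HID = 256              # hidden width of each residual block
OUT_DIM = 16 * 3       # assumed output: one axis-angle rotation per bone

def init_params(in_dim, out_dim):
    s = 1.0 / np.sqrt(in_dim)
    return rng.uniform(-s, s, (in_dim, out_dim)), np.zeros(out_dim)

def res_block(x, params):
    """One residual block: two ReLU linear layers plus an identity skip."""
    w1, b1, w2, b2 = params
    h = np.maximum(x @ w1 + b1, 0.0)
    h = np.maximum(h @ w2 + b2, 0.0)
    return x + h  # skip connection, as in a standard residual network

# Input/output projections and two residual blocks (untrained weights).
w_in, b_in = init_params(IN_DIM, HID)
blocks = []
for _ in range(2):
    w1, b1 = init_params(HID, HID)
    w2, b2 = init_params(HID, HID)
    blocks.append((w1, b1, w2, b2))
w_out, b_out = init_params(HID, OUT_DIM)

def joints_to_angles(joints):
    """Map flattened 3D joint positions to per-bone rotation angles."""
    x = np.maximum(joints @ w_in + b_in, 0.0)
    for blk in blocks:
        x = res_block(x, blk)
    return x @ w_out + b_out

pose = rng.standard_normal(IN_DIM)  # a dummy 3D pose
angles = joints_to_angles(pose)
print(angles.shape)                 # (48,)
```

In practice such a network would be trained on the synthesized position/rotation pairs described above; the sketch only shows the architecture's shape.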
2. A 3D human pose estimation method for video image sequences.
This method decomposes pose estimation over an image sequence into two sub-tasks: human pose estimation from a single frame and estimation of the pose change across frames. A model is built for each sub-task, and the two are combined into one framework by a conditional test, yielding a pose estimation method for video sequences. For the single-frame model, a training scheme based on a batch pose-difference loss is introduced to improve the accuracy of the output poses. For estimating the pose change between frames, the method introduces inter-frame difference images and pose correlation under simplified scene conditions to constrain the magnitude of the predicted pose change, so that the pose change between two related frames can be predicted fairly accurately.
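The conditional combination of the two sub-models could look like the sketch below, which gates on a mean-absolute inter-frame difference. The threshold value and the model interfaces (`single_frame_model`, `delta_model`) are hypothetical stand-ins, not the thesis's actual design.

```python
import numpy as np

DIFF_THRESHOLD = 4.0  # assumed mean-absolute-difference threshold

def frame_difference(prev_frame, cur_frame):
    """Mean absolute per-pixel difference between consecutive frames."""
    return float(np.mean(np.abs(cur_frame.astype(np.float32)
                                - prev_frame.astype(np.float32))))

def estimate_sequence(frames, single_frame_model, delta_model):
    """Gate between the two sub-models using the inter-frame difference.

    single_frame_model: image -> pose (run on the first frame and on
    frames whose content changed substantially)
    delta_model: (prev_pose, prev_frame, cur_frame) -> pose change
    """
    poses = [single_frame_model(frames[0])]
    for prev, cur in zip(frames, frames[1:]):
        if frame_difference(prev, cur) < DIFF_THRESHOLD:
            # Small change: predict only the pose delta and add it on.
            poses.append(poses[-1] + delta_model(poses[-1], prev, cur))
        else:
            # Large change: fall back to the single-frame estimator.
            poses.append(single_frame_model(cur))
    return poses

# Tiny demo with stand-in models (real models would be neural networks).
frames = [np.zeros((4, 4)), np.zeros((4, 4)), np.full((4, 4), 100.0)]
single = lambda frame: np.array([1.0])      # pretend single-frame pose
delta = lambda pose, a, b: np.array([0.5])  # pretend pose change
print(estimate_sequence(frames, single, delta))
```

The gating keeps the cheaper delta model on nearly static frames and re-anchors the estimate from the single-frame model when the scene changes, which is one plausible reading of the "conditional" combination described above.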
3. An application of the pose estimation methods to virtual-character interaction. Building on the work above, a virtual-character motion control system is built on multi-view Kinect skeleton data fusion; the captured human pose drives the virtual character, so users can control the character's motions interactively in real time.
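One simple way the multi-view Kinect skeleton fusion could work is confidence-weighted averaging of per-view joint positions, assuming the views have already been calibrated into a common world frame. This is a sketch of that assumption, not necessarily the fusion rule used in the system.

```python
import numpy as np

def fuse_skeletons(skeletons, confidences):
    """Confidence-weighted fusion of per-view Kinect joint positions.

    skeletons:   (V, J, 3) joint positions from V calibrated views,
                 already transformed into a common world frame
    confidences: (V, J) per-joint tracking confidence (e.g. 0 for
                 untracked, 1 for tracked joints)
    Returns (J, 3) fused joint positions.
    """
    w = confidences[..., None]               # (V, J, 1) broadcastable weights
    total = np.sum(w, axis=0)                # (J, 1) weight per joint
    total = np.where(total > 0, total, 1.0)  # avoid division by zero
    return np.sum(skeletons * w, axis=0) / total

# Example: two views, joint 1 occluded (confidence 0) in view 0.
views = np.array([[[0., 0., 0.], [1., 1., 1.]],
                  [[2., 2., 2.], [3., 3., 3.]]])
conf = np.array([[1., 0.], [1., 1.]])
print(fuse_skeletons(views, conf))  # joint 0 averaged, joint 1 from view 1 only
```

Weighting by tracking confidence lets a view in which a joint is occluded contribute nothing to that joint, which is the usual motivation for fusing multiple Kinects.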

Other Abstract

Human pose estimation is a fundamental technique for natural human-computer interaction: it allows people to use their own cognition and perception to control computers and to interact naturally with others in virtual worlds. It is also the technical foundation of intelligent human-computer interaction, because by recognizing and analysing human poses a computer can extract the semantic information in user motion and make a smart response. However, most current human motion capture systems are marker-based or require wearable sensors, which makes them invasive and hard to popularize. For natural interaction, marker-less human pose capture is the trend of development.
Therefore, this paper focuses on marker-less 3D human pose estimation for interactive applications, including recovering human joint rotation angles and estimating 3D human pose from image sequences. On this basis, the paper implements an interactive control system for virtual characters.
The main contributions of the paper are as follows:
1. A method for recovering human joint angles from human joint positions. Human joint positions are a sparse description of human pose and contain limited information. In many applications joint positions alone cannot fully represent a pose, and richer information such as joint rotation angles is needed. To solve this problem, this paper proposes a method for recovering joint angles from joint positions. The method synthesizes training datasets of corresponding joint positions and joint angles under different human poses, based on human body models reconstructed from motion capture datasets. It then trains a residual network to learn the hidden prior relation between joint positions and joint angles. The learned network can recover natural and valid poses from sparse joint positions, which provides an approach both to enriching the results of existing joint-position estimation methods and to fusing human pose datasets with different annotation formats.
2. A method for human pose estimation from video sequences. This paper decomposes the task into two parts: estimating human pose from monocular images and estimating the change of human pose between two frames. Two models are constructed for the two sub-problems and then combined into one running framework by conditional switching. A batch pose-difference loss is introduced for the monocular model, improving the accuracy of the output poses. For estimating pose changes between frames, frame-to-frame difference images and pose relevance are introduced into the model; they constrain the magnitude of the predicted pose change so that the model can predict the pose change between two consecutive frames accurately.
3. An application of human pose estimation to virtual-character interactive scenarios. A real-time interactive system for character motion control is built on multi-view Kinect skeleton data and the methods proposed above. Character motions are driven by the captured human poses, so users can control virtual characters with their own actions in real time.

Pages: 90
Language: Chinese
Document Type: Thesis
Identifier: http://ir.ia.ac.cn/handle/173211/23898
Collection: 毕业生_硕士学位论文 (Graduates, Master's Theses)
Recommended Citation
GB/T 7714
吴泽烨. 面向交互的无标记三维人体姿态估计研究[D]. 北京: 中国科学院大学, 2019.
Files in This Item:
File Name/Size | DocType | Access | License
Thesis.pdf (2779 KB) | Thesis (学位论文) | Restricted (暂不开放) | CC BY-NC-SA; apply for full text
 

Items in the repository are protected by copyright, with all rights reserved, unless otherwise indicated.