CASIA OpenIR > Graduates > Master's Theses
Research on Marker-less 3D Human Pose Estimation for Interaction
吴泽烨
2019-05-23
Pages: 90
Degree type: Master's
Chinese Abstract

Human pose estimation is an important foundation of natural human-computer interaction: it allows people to use their own cognitive and perceptual abilities to control computers and to interact naturally in virtual worlds. It is also the technical basis of intelligent human-computer interaction, since by analysing and recognizing human poses a computer can understand their semantic content and respond to the user more intelligently. Most existing human pose capture systems are marker-based or wearable, which feels invasive to users and makes such systems hard to popularize. To achieve natural interaction, marker-less human pose capture is the trend of development. This thesis therefore studies marker-less 3D human pose estimation methods for interaction, including recovering human joint rotation angles and estimating 3D human pose from image sequences, and on this basis develops a virtual-character interaction control system.
The main contributions of this thesis are as follows:
1. A method for recovering human joint rotation angles from joint point coordinates.
Joint positions are a sparse representation of human pose and carry only limited information. Many application scenarios need not only joint positions but also richer pose information such as joint rotation angles.
To address this, this thesis proposes a method for recovering joint rotation angles from 3D joint coordinates. The method first reconstructs human body models from motion capture datasets, uses these models to synthesize a dataset pairing human keypoints with joint rotations across different subjects and motions, and then trains a residual network on this dataset to learn the implicit prior relation between keypoints and joint rotation angles. The resulting network effectively recovers natural joint rotation angles from sparse joint position information, providing a way both to enrich the outputs of existing joint estimation work and to fuse human pose datasets with different annotation formats.
2. A 3D human pose estimation method for video image sequences.
The method decomposes pose estimation on image sequences into two sub-problems: estimating human pose from a single frame and estimating pose change across frames. A model is built for each sub-problem, and the two models are combined into a single framework by conditional switching, yielding a pose estimation method for video sequences. For the single-frame model, a training scheme based on a batch pose-difference loss is introduced to improve the accuracy of the output poses. For estimating pose change between frames, inter-frame difference maps and pose correlation are introduced under simplified scene conditions to constrain the magnitude of the output pose change, so that the pose change between two related frames can be predicted fairly accurately.
3. An application of human pose estimation to virtual-human interaction. Building on the work above, a virtual-human motion control system is built on multi-view Kinect human data fusion: captured human poses drive a virtual character, and users control the character's motions interactively in real time.

English Abstract

Human pose estimation is a fundamental technique for natural human-computer interaction: it allows people to control computers using their own cognitive and perceptual abilities and to interact naturally with others in a virtual world. It is also the technical foundation of intelligent human-computer interaction, because by recognizing and analysing human poses computers can extract the semantic information of user motion and make a smart response. However, most current human motion capture systems are marker-based or require wearable sensors, which makes them invasive and hard to popularize. For natural interaction, marker-less human pose capture is the trend of development.
Therefore, this thesis focuses on marker-less 3D human pose estimation for interactive applications, including recovering human joint rotation angles and estimating 3D human pose from image sequences. On this basis, the thesis implements an interactive control system for virtual characters.
The main contributions of the thesis are as follows:
1. A method for recovering human joint angles from human joint positions. Joint positions are a sparse description of human pose and contain limited information; in many applications they cannot fully represent a pose, and richer information such as joint angles is needed. To solve this problem, this thesis proposes a method that recovers human joint angles from human joint positions. The proposed method synthesizes training datasets of corresponding joint positions and joint angles under different human poses, based on human models reconstructed from motion capture datasets, and then trains a residual network to learn the hidden prior relation between joint positions and joint angles. The learned network can recover natural and valid poses from sparse joint positions, which provides an approach both to enriching the results of existing joint position estimators and to fusing human pose datasets with different annotation formats.
2. A method for human pose estimation from video sequences. This thesis decomposes the task into two parts: estimating human pose from monocular images and estimating the change of pose between two frames. Two models are constructed for the sub-problems and combined into one running framework by conditional switching. For the monocular model, a batch pose-difference loss is introduced, which improves the accuracy of the output poses. For estimating pose change between frames, the frame-to-frame difference and pose relevance are introduced into the model to constrain the magnitude of the pose change, so that the model can accurately predict the pose change between two consecutive frames.
3. An application of human pose estimation to interactive virtual-character scenarios. A real-time interactive system for character motion control is built on multi-view Kinect skeleton data and the methods proposed above. The motions of the characters are driven by captured human poses, and users can control virtual characters with their own actions in real time.
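The residual mapping described in contribution 1 can be sketched roughly as follows. This is a minimal plain-numpy forward pass, not the thesis's actual architecture: the joint counts (17 input joints, 16 output joints), hidden width, number of blocks, and random initialization are all illustrative assumptions.

```python
import numpy as np

# Hypothetical dimensions: 17 joints x 3 coordinates in,
# 16 joints x 3 rotation angles out.
N_IN, HIDDEN, N_OUT = 17 * 3, 128, 16 * 3

rng = np.random.default_rng(0)

def linear(x, w, b):
    return x @ w + b

def residual_block(x, params):
    """Two linear layers with ReLU and a skip connection (x + F(x))."""
    w1, b1, w2, b2 = params
    h = np.maximum(linear(x, w1, b1), 0.0)
    h = linear(h, w2, b2)
    return np.maximum(x + h, 0.0)  # skip connection

def init_block():
    s = 1.0 / np.sqrt(HIDDEN)
    return (rng.normal(0.0, s, (HIDDEN, HIDDEN)), np.zeros(HIDDEN),
            rng.normal(0.0, s, (HIDDEN, HIDDEN)), np.zeros(HIDDEN))

# Network: input projection -> two residual blocks -> output projection.
w_in = rng.normal(0.0, 1.0 / np.sqrt(N_IN), (N_IN, HIDDEN))
b_in = np.zeros(HIDDEN)
w_out = rng.normal(0.0, 1.0 / np.sqrt(HIDDEN), (HIDDEN, N_OUT))
b_out = np.zeros(N_OUT)
blocks = [init_block(), init_block()]

def predict_rotations(joint_positions):
    """Map flattened 3D joint coordinates to flattened joint rotation angles."""
    h = np.maximum(linear(joint_positions, w_in, b_in), 0.0)
    for p in blocks:
        h = residual_block(h, p)
    return linear(h, w_out, b_out)

pose = rng.normal(size=(1, N_IN))   # one synthetic flattened pose
angles = predict_rotations(pose)    # shape (1, 48)
```

In practice the weights would be learned from the synthesized position/rotation pairs; only the data flow of a residual regression network is shown here.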
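The conditional switching in contribution 2 could look like the following sketch: run the single-frame estimator when the inter-frame difference is large, and otherwise apply only a predicted pose change. The threshold value, the mean-absolute-difference measure, and the stub models are assumptions for illustration, not the thesis's actual models.

```python
import numpy as np

def frame_difference(prev, curr):
    """Mean absolute difference between two grayscale frames."""
    return float(np.mean(np.abs(curr.astype(np.float64) - prev.astype(np.float64))))

def estimate_sequence(frames, single_frame_model, delta_model, threshold=5.0):
    """Pose per frame: run the single-frame model on the first frame and
    whenever the inter-frame difference is large; otherwise update the
    previous pose with the pose-change model."""
    poses = [single_frame_model(frames[0])]
    for prev, curr in zip(frames, frames[1:]):
        if frame_difference(prev, curr) > threshold:
            poses.append(single_frame_model(curr))          # re-estimate
        else:
            poses.append(poses[-1] + delta_model(prev, curr))  # small update
    return poses

# Toy demo with stub models on synthetic 8x8 frames.
frames = [np.zeros((8, 8)), np.zeros((8, 8)), np.full((8, 8), 50.0)]
single = lambda frame: np.zeros(3)      # stub single-frame pose estimator
delta = lambda a, b: np.full(3, 0.1)    # stub inter-frame pose-change estimator
poses = estimate_sequence(frames, single, delta)
```

In the demo, the second frame is nearly identical to the first, so only the cheap pose-change model runs; the large jump to the third frame triggers a fresh single-frame estimate.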
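One common way to realize the multi-view Kinect fusion in contribution 3 is a confidence-weighted average of per-view skeletons in a shared world frame; the thesis does not state its exact fusion rule, so this sketch is only an assumed baseline, and extrinsic calibration into the common frame is taken as given.

```python
import numpy as np

def fuse_skeletons(skeletons, confidences):
    """Confidence-weighted average of per-view joint positions.

    skeletons  : (V, J, 3) joint positions per view, already transformed
                 into a common world frame (calibration assumed done).
    confidences: (V, J) per-joint tracking confidence reported per view.
    """
    w = confidences[..., None]                           # (V, J, 1)
    return (skeletons * w).sum(axis=0) / np.clip(w.sum(axis=0), 1e-6, None)

# Two hypothetical views of a 3-joint skeleton; view 0 lost joint 2.
views = np.array([[[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [1.0, 1.0, 0.0]],
                  [[0.0, 0.0, 0.2], [1.0, 0.0, 0.2], [1.0, 1.0, 0.2]]])
conf = np.array([[1.0, 1.0, 0.0],
                 [1.0, 1.0, 1.0]])
fused = fuse_skeletons(views, conf)   # (3, 3) fused skeleton
```

Joints tracked by both Kinects are averaged, while an occluded joint (confidence 0 in one view) falls back to the view that still sees it, which is the main benefit of a multi-view setup.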

Keywords: 3D human pose; joint rotation angle; deep residual network; character motion
Language: Chinese
Research sub-direction: Computer Graphics and Virtual Reality
Document type: Thesis
Identifier: http://ir.ia.ac.cn/handle/173211/23898
Collection: Graduates / Master's Theses
Recommended citation (GB/T 7714):
吴泽烨. 面向交互的无标记三维人体姿态估计研究[D]. 北京: 中国科学院大学, 2019.
Files in this item:
File name/size | Document type | Access | License
Thesis.pdf (2779KB) | Thesis | Restricted | CC BY-NC-SA

Unless otherwise stated, all content in this system is protected by copyright, with all rights reserved.