Research on Face Animation and Gesture Control in Virtual Agent (虚拟人中人脸动画和姿态控制的研究)


	虚拟人中人脸动画和姿态控制的研究
Alternative title	Research on Face Animation and Gesture Control in Virtual Agent
	穆凯辉
	2011-05-29
Degree type	Doctor of Engineering
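Chinese abstract	Virtual face animation is an important research direction in human-computer interaction. Centered on methods for generating face animation and on the realism of its expression, considerable results have been achieved in visual speech synthesis, visual prosody synthesis, and facial emotion expression. Within face animation, work that generates animation through modality mapping has become increasingly prominent and accounts for a growing share of applications, but how to improve the quality of the mapping model and use it to produce more realistic face animation remains a complex problem. Because users are extremely familiar with the topology of the human face, and because that topology is extremely complex, very few face animation systems are currently widely accepted by users. With the development of human-computer interaction, interactive virtual humans are attracting growing attention. They reflect humanity's continuing interest in virtualizing itself, and they also represent the level of development of many advanced technologies. Unlike face expression alone, virtual humans are mostly expressed with skeleton models, which requires building the corresponding face animation and body gesture control on the skeleton model. Doing all of these well is often difficult, which is one reason interactive virtual human systems receive sustained attention. This thesis attempts to build a mapping model for face animation from the perspective of multi-modal fusion and, on that basis, combines commercial 3D modeling software to build an interactive virtual human platform and applies it to concrete application scenarios. The main work of this thesis on face animation and interactive virtual humans is as follows. (1) A unit-selection-based method is proposed for speech-driven lip animation. Speech-driven face animation has long been an active research direction. Research has focused on how to build a mapping model between speech and lip movement, obtain realistic and smooth lip movement at synthesis time, and achieve a real-time system. This thesis builds a simple and effective mapping model aimed at realizing a real-time lip animation system. With this mapping model, smooth and realistic speech-driven lip-synchronized animation can be generated. The system built on this algorithm is easy to implement, can be used for both male and female voices, and meets real-time requirements well. (2) An algorithm combining two-layer clustering with decision trees is proposed for sentence-level visual prosody synthesis. For the weakly coupled mapping from text prosody to head movement, this thesis builds the mapping model on two assumptions: first, the basic types of head movement differ across emotional states; second, head movement patterns tend to be specific to the individual. On these two assumptions, a classification and regression tree (CART) model maps text prosody to the basic head movement types under different emotions; these basic types reflect the individual head movement patterns of a single performer. When new text is input, the text parameters extracted by the text analysis module are fed into the mapping model to obtain the head rotation angle parameters, which generates virtual head movement with prosodic features. Head movement generated this way can greatly enhance the realism of face animation. (3) A virtual human motion expression system is built that integrates text-driven lip movement, visual prosody synthesis, facial expression, and body movement. Interactive virtual human research requires that the virtual human express itself with the gestures and postures people show in real conversation, and that it display corresponding intelligence and personality. To build such human-like motion, this thesis uses a mapping approach: starting from text input at the control end, it builds a speech synthesis module, a visual speech synthesis module, and a visual prosody module, and uses rule-based methods to build a facial emotion expression module and a body expression module. The whole expression system conveys the virtual human's interactive information with natural, fluent motion. Overall, with regard to face animation, multi-modal mapping models, ...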
English abstract	Face animation in virtual agents is an important research direction in human-computer interaction. Significant research efforts have been made to generate realistic facial animation in visual speech synthesis, visual prosody, and emotional facial movements. Multi-modal methods for generating natural facial animation have received growing emphasis and have led to many applications. However, how to obtain a high-quality mapping model and use it to create realistic facial animation remains a complicated problem. Few real-time systems are accepted by users, because of the complexity of human facial anatomy and our inherent sensitivity to facial appearance. With the development of human-computer interaction, people have paid increasing attention to interactive agents. Interactive agents reflect our continuing interest in virtual representations of ourselves and the level of development of many techniques. Unlike face animation, virtual agents are usually expressed with a skeleton model, so face animation and body gesture control must be built on the skeleton. Doing all of these parts well is very difficult, which is another reason interactive virtual characters attract so much attention. This thesis tries to build the mapping model from multi-modal interaction, builds a platform for interactive virtual characters based on the mapping models using commercial 3D creation software, and applies them to particular application scenes. The main contributions of the thesis on facial animation and interactive agents are as follows: (1) A speech-driven facial animation method based on unit selection is proposed. Speech-driven facial animation has been an active field of face animation. Research has focused on how to create a mapping between speech and lip movement, synthesize realistic and natural lip synchronization, and achieve real-time performance. This thesis creates a simple and effective mapping model aimed at real-time lip synchronization. Using this model, we build a speech-driven face animation system that creates smooth and realistic lip movements. The system meets real-time requirements, is easy to implement, and works for both male and female speech input. (2) A two-stage clustering and CART method is proposed for sentence-level visual prosody. In terms of the mapping model from prosody of text to head movements which has the...
Keywords	Face Animation; Multi-modal Mapping; Speech-driven Face Animation; Visual Prosody; Virtual Character (人脸动画; 多模态映射; 语音驱动人脸动画; 可视韵律; 交互式虚拟人)
Language	Chinese
Document type	Thesis
Item identifier	http://ir.ia.ac.cn/handle/173211/6370
Collection	Graduates_Doctoral dissertations
Recommended citation (GB/T 7714)	穆凯辉. 虚拟人中人脸动画和姿态控制的研究[D]. 中国科学院自动化研究所. 中国科学院研究生院,2011.

Files in this item
File name/size	Document type	Version type	Access type	License
CASIA_20081801462805 (11535 KB)			Not currently open	CC BY-NC-SA