Research on Monocular Camera Face Reconstruction Based on Regression Methods

	Research on Monocular Camera Face Reconstruction Based on Regression Methods
	Wang Pengrui (王鹏睿)
	2020-08-19
Pages	126
Degree type	Doctoral
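Abstract (from the Chinese)	3D virtual face technology has broad applications in personalized customer service, social services, and the performing arts. Existing industrial pipelines for building a realistic digital face require large amounts of hardware, manual labor, data, and time, so generating faces in a simple, intelligent way is both a technical challenge and an active research topic. In recent years, regression-based face reconstruction from a monocular camera has been widely studied for its simple pipeline, speed, and robustness. The approach nevertheless suffers from scarce training data, limited model expressiveness, and difficulty recovering realistic detail. To address these problems, this thesis improves regression-based monocular reconstruction of face geometry and appearance in terms of reconstruction method, model structure, and training strategy. The main contributions are as follows: 1. To unify existing face reconstruction methods and derive new ones, this thesis proposes a cascaded framework for 3D face shape reconstruction. The framework integrates face alignment naturally, requires no landmark detection, and yields reconstruction methods with different properties by combining different algorithm modules. For example, the "parameter augmented regression" method derived in this thesis incorporates modules from the Local Binary Feature (LBF) face alignment algorithm and achieves both high speed and high reconstruction accuracy. 2. To address the lack of high-quality facial landmark data for weakly supervised regression, a weakly supervised face reconstruction method based on fusing multiple landmark databases is proposed. The method effectively fuses face databases that differ in landmark type and count, increasing the number of landmarks per face and the diversity of the data. A shape correction layer based on a mesh deformation field enriches the expressiveness of the parametric face model, which ultimately improves the accuracy and robustness of weakly supervised face reconstruction. 3. To address shortcomings of UV-space face reconstruction methods, a method that regresses shape, reflectance, and normals in UV space is proposed. The UV map is obtained with an as-rigid-as-possible 3D mesh surface unwrapping, which is geometrically and topologically more reasonable than the commonly used cylindrical unwrapping and helps the prediction of shape, reflectance, and normals. In addition, a template-deformation-based shape processing method that incorporates normal information is proposed to handle shape noise, making the resulting 3D face mesh more realistic. 4. To address the lack of high-frequency albedo and geometric detail in current weakly supervised face reconstruction methods, a method is proposed for training regression models that predict high-quality skin albedo and geometric detail under the guidance of low-frequency facial albedo information. A bootstrap network trained with self-supervision provides plausible illumination estimates and albedo distributions; an image-to-image Facial Albedo Network (FAN) and Detail Recovery Network (DRN) are then trained with weak supervision. FAN recovers the complete albedo and removes occlusions while preserving as much high-frequency detail as possible. DRN is trained with a pixel gradient loss and predicts detail along the normal direction, yielding robust detail geometry. Experiments show that FAN and DRN together reconstruct realistic face models.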
English abstract	Three-dimensional (3D) virtual face technology has a wide range of applications in personalized customer service, social services, and the performing arts. Existing technology requires a great deal of hardware, labor, data, and time to customize a realistic digital face, so generating avatars in a simple and intelligent way remains a technical problem. In recent years, regression-based face reconstruction from a monocular camera has been widely studied for its simple pipeline, speed, and robustness. However, this approach also suffers from a lack of training data, insufficient model expressiveness, and difficulty recovering realistic details. To address these issues, this thesis studies regression-based reconstruction of face geometry and material from a monocular camera in terms of reconstruction method, model structure, and training strategy. The main results are as follows: 1. To summarize existing face reconstruction methods and derive new ones, this thesis proposes a 3D face reconstruction framework with a cascaded structure. The framework naturally integrates the face alignment process without landmark detection and can yield different reconstruction methods by combining different algorithm modules. The "parameter augmented regression" method derived in this thesis combines modules of the local binary feature (LBF) face alignment algorithm to obtain a face reconstruction method with high computing speed and high reconstruction accuracy. 2. To address the lack of high-quality face landmarks in weakly supervised regression, a weakly supervised face reconstruction method based on the fusion of multiple landmark databases is proposed. This method effectively integrates face databases with different landmark types and counts, which increases the number of labeled landmarks per face and the diversity of the training data. In addition, a shape correction layer based on a mesh deformation field is added to enrich the expressiveness of the parametric face model. Together these improve the accuracy and robustness of weakly supervised face reconstruction. 3. To address the defects of UV-space face reconstruction methods, a face reconstruction method that regresses shape, reflectance, and normals in UV space is proposed. The UV map is obtained with an as-rigid-as-possible 3D mesh surface unwrapping, which is more reasonable in geometry and topology than the commonly used cylindrical unwrapping and helps improve the prediction of shape, reflectance, and normals. The thesis also proposes a template-deformation-based shape processing method that integrates normal information to solve the problem of shape noise, making the 3D face mesh more realistic. 4. To address the lack of detail in current weakly supervised face reconstruction methods, a method is proposed for training regression models of high-quality skin albedo and geometric detail under the guidance of low-frequency face albedo information. It obtains reasonable illumination estimates and albedo distributions from a start-up network trained with self-supervision, and then trains an image-to-image facial albedo network (FAN) and detail recovery network (DRN) with weak supervision. FAN restores the complete albedo and removes occlusions while retaining high-frequency details as much as possible. DRN is trained with a pixel gradient loss and predicts details along the normal direction, which helps obtain robust detail geometry. Experiments show that the combination of FAN and DRN can reconstruct realistic 3D face meshes.
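Keywords	3D face reconstruction; weakly supervised learning; shape from shading; mesh deformation; monocular camera
Language	Chinese
Seven major directions: sub-direction classification	Computer graphics and virtual reality
Document type	Thesis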
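Item identifier	http://ir.ia.ac.cn/handle/173211/40393
Collection	Laboratory of Cognition and Decision-Making for Complex Systems_Auditory Models and Cognitive Computing
Recommended citation (GB/T 7714)	Wang Pengrui. Research on Monocular Camera Face Reconstruction Based on Regression Methods[D]. Institute of Automation, Chinese Academy of Sciences. University of Chinese Academy of Sciences, 2020.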

Files in this item
File name/size	Document type	Version type	Access type	License
基于回归方法的单目相机人脸重建研究-王鹏 (7276KB)	Thesis		Open access	CC BY-NC-SA
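
Item 4 of the abstract says that DRN is trained with a pixel gradient loss and predicts detail along the normal direction. The NumPy sketch below illustrates those two ideas in their simplest form. It is not the thesis's implementation: the function names, the forward-difference gradients, the L1 penalty, and the per-vertex scalar offsets are all assumptions made for illustration.

```python
import numpy as np


def displace_along_normals(vertices, normals, offsets):
    """Move each vertex by a scalar offset along its unit normal.

    vertices: (N, 3) positions, normals: (N, 3) non-zero normals,
    offsets: (N,) signed displacements (hypothetical detail values).
    """
    unit = normals / np.linalg.norm(normals, axis=1, keepdims=True)
    return vertices + offsets[:, None] * unit


def gradient_loss(pred, target):
    """Mean absolute difference between forward-difference image gradients.

    pred, target: (H, W) arrays. A constant brightness shift leaves the
    loss at zero, so only local intensity changes are penalized.
    """
    dx = np.diff(pred, axis=1) - np.diff(target, axis=1)
    dy = np.diff(pred, axis=0) - np.diff(target, axis=0)
    return np.abs(dx).mean() + np.abs(dy).mean()
```

Comparing gradients rather than raw pixel values makes such a loss insensitive to a uniform brightness offset, which is one plausible reason to use it when supervising fine detail.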