Dense Registration Algorithms for 3D Faces and Their Applications | |
范振峰 | |
2020-09 | |
Pages | 128 |
Degree type | Doctoral |
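Abstract (Chinese) | Dense registration of 3D faces aims at fine-grained, dense matching of the spatial data that represent different 3D faces. Good dense registration is a prerequisite for 3D face analysis, with downstream applications including 3D face reconstruction, 3D face recognition, and 3D face modeling and simulation. It also offers new solutions for many tasks on 2D facial images. Compared with 2D facial images, 3D faces carry additional geometric information that is robust to imaging at different pose angles and to ambient illumination, and that also helps with several problems caused by expression changes. Overall, dense registration of 3D faces serves two main purposes: (1) it allows different 3D faces to be expressed in a unified vector form, which facilitates further data analysis; (2) compared with sparse landmark correspondence, dense registration captures not only the overall structure of the face but also its fine structure. Dense registration of 3D faces falls within non-rigid point cloud registration and is an important and challenging problem. The difficulties are: (1) mathematically, unlike rigid registration, the problem has no explicit mathematical expression to optimize; (2) physically, unlike the few facial landmarks, most dense points have no precise anatomical definition. This also adds uncertainty to the evaluation of registration results. Dense registration of 3D faces has many downstream applications. On the one hand, it is a prerequisite for 3D face data analysis: 3D faces placed in dense correspondence can be analyzed statistically and can sparsely model and represent new 3D face data well, which applies directly to 3D face recognition. On the other hand, densely registered 3D faces help provide sound guidance for 2D facial image tasks from the underlying imaging principles and inspire new approaches to these problems. Starting from the fundamental problem of dense registration of 3D faces, this dissertation analyzes the process of building 3D face models and studies related applications to 3D faces and 2D facial images. Its main content and contributions are as follows: (1) (2) (3) (4) (5) |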
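The first purpose listed in the abstract, expressing registered faces in a unified vector form for statistical analysis, can be illustrated with a minimal NumPy sketch. It assumes the faces are already in dense correspondence (same number of vertices, same vertex order) and uses random toy data; the variable names and the choice of PCA via SVD are illustrative assumptions, not details taken from the dissertation.

```python
import numpy as np

# Toy stand-in for densely registered scans: M faces, each with the same N
# vertices in the same order (hypothetical random data, not real face scans).
rng = np.random.default_rng(0)
M, N = 20, 100
faces = rng.normal(size=(M, N, 3))

# Because vertex i has the same meaning on every face, each face can be
# flattened into a single vector of length 3N.
X = faces.reshape(M, N * 3)

mean_face = X.mean(axis=0)
# PCA via SVD of the centred data matrix; rows of Vt are shape directions.
U, S, Vt = np.linalg.svd(X - mean_face, full_matrices=False)

k = 5
basis = Vt[:k]                           # (k, 3N) leading shape directions
coeffs = (X[0] - mean_face) @ basis.T    # low-dimensional code of face 0
approx = mean_face + coeffs @ basis      # reconstruction from k coefficients
approx_face = approx.reshape(N, 3)       # back to an (N, 3) vertex array
```

Without dense correspondence the reshape step would mix unrelated points across faces, which is why registration has to come before this kind of analysis.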
Abstract (English) | Dense registration of 3D faces seeks accurate matching and a canonical representation of 3D facial data. It is fundamental to a number of downstream applications in 3D facial analysis, such as 3D face reconstruction, 3D face recognition, and 3D face animation. Dense registration of 3D faces also provides cues for many vision tasks on 2D facial images. Compared with its 2D counterpart, a 3D face contains extra geometric information that is stable under different poses and illumination conditions and can also be used to handle expression variations. The benefits of dense correspondence are generally two-fold: 1) the one-to-one correspondence of points between different faces allows them to be organized in the same vector space, which facilitates further data analysis; 2) compared with sparse representations such as landmarks alone, dense representations capture both local and global structures of faces, providing more detailed information. Dense registration of 3D faces remains a challenging problem that belongs to the class of non-rigid point cloud registration. From a mathematical viewpoint, unlike the rigid case, the non-rigid registration problem has no explicit formulation. From a physical viewpoint, while locating landmarks on 3D faces can be guided by common knowledge of anatomical structures, the correspondence of points on smooth regions has no solid definition. This also makes the correspondence results difficult to assess. Dense registration of 3D faces contributes to many applications in both the 3D and 2D cases. On the one hand, statistical analysis of 3D faces depends heavily on the dense registration results, and sparse representation of 3D faces can be directly applied to 3D face recognition. On the other hand, the study of 3D faces provides new physical insights for solving problems on 2D facial images. 
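As context for the contrast between rigid and non-rigid registration: when point correspondences are known, the rigid case admits a closed-form least-squares solution. The sketch below uses the well-known SVD-based (Kabsch) procedure in NumPy. It is not taken from the dissertation, and the function name `rigid_align` and the toy data are hypothetical.

```python
import numpy as np

def rigid_align(source, target):
    """Return rotation R and translation t minimising ||R @ s_i + t - t_i||^2.

    Assumes source and target are (N, 3) arrays whose rows correspond one-to-one.
    """
    src_mean = source.mean(axis=0)
    tgt_mean = target.mean(axis=0)
    # Cross-covariance of the centred point sets.
    H = (source - src_mean).T @ (target - tgt_mean)
    U, _, Vt = np.linalg.svd(H)
    # Flip the last axis if needed so that R is a proper rotation (det = +1).
    d = np.sign(np.linalg.det(Vt.T @ U.T))
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    t = tgt_mean - R @ src_mean
    return R, t

# Toy check: recover a known 30-degree rotation about z plus a translation.
rng = np.random.default_rng(0)
source = rng.normal(size=(50, 3))
theta = np.deg2rad(30.0)
R_true = np.array([[np.cos(theta), -np.sin(theta), 0.0],
                   [np.sin(theta),  np.cos(theta), 0.0],
                   [0.0,            0.0,           1.0]])
t_true = np.array([0.5, -0.2, 1.0])
target = source @ R_true.T + t_true
R_est, t_est = rigid_align(source, target)
```

No comparable one-shot formula exists once every point may move independently, which is the non-rigid setting the abstract describes.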
This dissertation starts from the problem of establishing accurate correspondence of 3D faces, based on which the author elaborates on the process of building 3D face models and studies several applications to 3D faces and 2D facial images. The main contributions of this dissertation are as follows: (1) The author proposes an automatic method for dense registration of 3D faces without landmarks. Generally, landmarks require manual annotation and are hard to define consistently across different faces with partial data. The author proposes a general framework that revisits the dense registration problem from two perspectives. One is semantic correspondence, which guarantees that corresponding points share the same semantic meaning. The other is topological correspondence, which guarantees that corresponding points lie in the same local context. High-entropy points, which are detected automatically, replace the landmarks to enable automatic correspondence. (2) The author proposes to boost local shape matching for dense registration of 3D faces. The proposed method alleviates the negative effect of incoherent local deformations caused by landmark guidance. The dense registration problem for 3D faces is treated as a collection of locally rigid motions, each with an explicit formulation. More specifically, the weights for each rigid motion are adjusted according to their distances to the key points. The key points are initialized by a few landmarks and are augmented adaptively in regions with large registration errors. The registration converges as the number of key points increases. (3) The author studies several practical issues in 3D face reconstruction and recognition based on 3D face models. First, a 3D face model that is better adapted to Asian groups is established and applied to 3D face reconstruction. Then, a well-established 3D face model is shown to benefit robust registration of 3D faces, which can handle data with noise, occlusions, and large expressions. 
Finally, the correspondence results can be directly applied to 3D face recognition, demonstrating the effectiveness of using both the holistic and regional structures of 3D faces. (4) The author proposes effective ways to incorporate 3D depth information into the super-resolution task for 2D facial images. Convolutional neural networks are employed for 2D facial image super-resolution. Convolution is a translation-invariant operation that considers a 2D local receptive field. One likely limitation is that 3D neighbors are ignored. The author proposes a network architecture with a U-Net structure to learn the depth map from a facial image. The learned depth map is then used to modulate the features of the main super-resolution task. The proposed network leads to both quantitative and qualitative improvements in face super-resolution, especially sharper details along facial edges. (5) The author proposes an effective data augmentation approach based on perturbations in the low-dimensional space of facial images. First, the author studies 2D facial images and applies reasonable perturbations to shape and appearance. This is applied to the super-resolution task for facial images, and the results show notable improvements without altering the basic structure of the convolutional neural networks. Then, the author extends the study of 2D facial images to a 3D perspective. 3D facial pose and shape are perturbed to generate novel appearances from a single 2D facial image. The 3D method yields some improvement over the 2D method, again without altering the network structure for the face super-resolution task. |
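Contribution (2) describes registration as locally rigid motions whose weights depend on distance to key points. A minimal sketch of that general idea follows, assuming a Gaussian distance kernel and a fixed bandwidth `sigma`. Both are illustrative choices, and the function name `blend_local_rigid` is hypothetical, so this should not be read as the dissertation's actual formulation.

```python
import numpy as np

def blend_local_rigid(vertices, key_points, rotations, translations, sigma=1.0):
    """Deform vertices by a distance-weighted blend of per-key-point rigid motions.

    vertices: (N, 3); key_points: (K, 3); rotations: (K, 3, 3); translations: (K, 3).
    The Gaussian weighting and sigma are assumptions made for illustration.
    """
    # Squared distance from every vertex to every key point, shape (N, K).
    d2 = ((vertices[:, None, :] - key_points[None, :, :]) ** 2).sum(axis=-1)
    # Shift by the row minimum for numerical stability; cancels after normalising.
    d2 -= d2.min(axis=1, keepdims=True)
    w = np.exp(-d2 / (2.0 * sigma ** 2))
    w /= w.sum(axis=1, keepdims=True)  # weights for each vertex sum to 1
    # Apply every rigid motion to every vertex, shape (K, N, 3).
    moved = np.einsum("kij,nj->kni", rotations, vertices) + translations[:, None, :]
    # Weighted combination over key points, shape (N, 3).
    return np.einsum("nk,kni->ni", w, moved)

# Toy check: identical motions at all key points reduce to one global motion.
rng = np.random.default_rng(0)
vertices = rng.normal(size=(200, 3))
key_points = vertices[:4]                     # hypothetical key points
rotations = np.tile(np.eye(3), (4, 1, 1))     # identity rotation everywhere
shift = np.array([0.1, 0.0, 0.0])
translations = np.tile(shift, (4, 1))         # same translation everywhere
deformed = blend_local_rigid(vertices, key_points, rotations, translations)
```

In an adaptive scheme like the one described, more key points would be added where the residual error stays large, so the blend gains local freedom only where it is needed.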
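Keywords | 3D face model; non-rigid registration; dense correspondence; facial image super-resolution; convolutional neural network; low-dimensional space perturbation |
Language | Chinese |
Seven major directions: sub-direction classification | Image and video processing and analysis |
Document type | Dissertation |
Item identifier | http://ir.ia.ac.cn/handle/173211/40564 |
Collection | Graduates_Doctoral dissertations |
Corresponding author | 范振峰 |
Recommended citation (GB/T 7714) | 范振峰. Dense Registration Algorithms for 3D Faces and Their Applications [D]. Institute of Automation, Chinese Academy of Sciences. University of Chinese Academy of Sciences, 2020. |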
Files in this item | ||||||
File name/size | Document type | Version type | Access type | License | ||
范振峰毕业论文提交版.pdf (52320 KB) | Dissertation | Restricted access | CC BY-NC-SA |