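|Place of Conferral||Institute of Automation, Chinese Academy of Sciences|
|Keyword||3D face model; non-rigid registration; dense correspondence; face image super-resolution reconstruction; convolutional neural network; low-dimensional space perturbation|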
Dense registration of 3D faces seeks accurate matching and a canonical representation of 3D facial data. It is fundamental to a number of downstream applications in 3D facial analysis, such as 3D face reconstruction, 3D face recognition, and 3D face animation, and it also provides clues for many vision tasks on 2D facial images. Compared with its 2D counterpart, a 3D face contains extra geometric information that is stable under different poses and illumination conditions and can also be used to handle expression variations. The benefits of dense correspondence are twofold: 1) the one-to-one correspondence of points between different faces allows them to be organized in the same vector space, which simplifies further data analysis; 2) compared with sparse representations such as landmarks alone, dense representations capture both local and global structures of faces, providing more detailed information.
Dense registration of 3D faces remains a challenging problem that belongs to the class of non-rigid point cloud registration. Mathematically, unlike the rigid case, the non-rigid registration problem has no explicit formulation. Physically, while locating landmarks on 3D faces can be guided by common knowledge of anatomical structures, the correspondence of points in smooth regions has no solid definition. This also makes the correspondence results difficult to assess.
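To illustrate what an explicit formulation means in the rigid case, the sketch below uses the well-known closed-form least-squares solution (the Kabsch method) for two point sets with known one-to-one correspondence. It is only an illustration of the rigid baseline, not the dissertation's method; the function name `rigid_align` is hypothetical.

```python
import numpy as np

def rigid_align(source, target):
    """Closed-form least-squares rigid alignment of corresponded 3D points.

    source, target: (N, 3) arrays; row i of source corresponds to row i of target.
    Returns a rotation R (3, 3) and translation t (3,) such that
    source @ R.T + t approximates target.
    """
    src_mean = source.mean(axis=0)
    tgt_mean = target.mean(axis=0)
    # 3x3 cross-covariance of the centered point sets
    H = (source - src_mean).T @ (target - tgt_mean)
    U, _, Vt = np.linalg.svd(H)
    # Flip the last axis if needed so that R is a proper rotation (det = +1)
    d = np.sign(np.linalg.det(Vt.T @ U.T))
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    t = tgt_mean - R @ src_mean
    return R, t
```

No comparable closed form exists once every point may move differently, which is why the non-rigid problem needs additional constraints or priors.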
Dense registration of 3D faces contributes to many applications in both the 3D and 2D cases. On the one hand, statistical analysis of 3D faces is highly dependent on the dense registration results, and sparse representation of 3D faces can be directly applied to 3D face recognition. On the other hand, the study of 3D faces provides new physical insights for solving problems of 2D facial images.
This dissertation starts from establishing accurate correspondence of 3D faces. On this basis, the author elaborates the process of building 3D face models and studies several applications for 3D faces and 2D facial images. The main contributions of this dissertation are as follows:
(1) The author proposes an automatic method for dense registration of 3D faces without landmarks. Generally, landmarks require manual annotation and are hard to define consistently across different faces with partial data. The author proposes a general framework that revisits the dense registration problem from two perspectives. One is semantic correspondence, which ensures that corresponding points share the same semantic meaning. The other is topological correspondence, which ensures that corresponding points lie in the same local context. High-entropy points, which are detected automatically, replace the landmarks for automatic correspondence.
(2) The author proposes to boost local shape matching for dense registration of 3D faces. The proposed method alleviates the negative effect of incoherent local deformations caused by landmark guidance. The dense registration problem of 3D faces is treated as a collection of locally rigid motions, each with an explicit formulation. More specifically, the weight of each rigid motion is adjusted according to the distance to the key points. The key points are initialized from a few landmarks and are augmented adaptively in regions with large registration errors. The registration converges as the number of key points increases.
(3) The author studies some practical issues in 3D face reconstruction and recognition based on 3D face models. First, a 3D face model that is better suited to Asian populations is established and applied to 3D face reconstruction. Then, a well-established 3D face model is shown to benefit robust registration of 3D faces, handling data with noise, occlusions, and large expressions. Finally, the correspondence results can be directly applied to 3D face recognition, demonstrating the effectiveness of using both the holistic and regional structures of 3D faces.
(4) The author proposes effective ways to incorporate 3D depth information into the super-resolution task for 2D facial images. Convolutional neural networks are employed for 2D facial image super-resolution. Convolution is a translation-invariant operation that considers a 2D local receptive field. One likely limitation is that 3D neighborhood relations are ignored. The author proposes a network architecture with a U-Net structure to learn the depth map from a facial image. The learned depth map is then used to modulate the features of the main super-resolution task. The proposed network leads to both quantitative and qualitative improvements in face super-resolution, especially in producing sharper facial edges.
(5) The author proposes an effective way of data augmentation based on perturbations in the low-dimensional space of facial images. First, the author studies 2D facial images and applies reasonable perturbations to shape and appearance. This is applied to the super-resolution task for facial images, and the results show notable improvements without altering the basic structure of the convolutional neural networks. Then, the author extends the study of 2D facial images to a 3D perspective. 3D facial pose and shape are perturbed to generate novel appearances of a single 2D facial image. The 3D method gains some improvement over the 2D method, again without altering the network structure for the face super-resolution task.
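Contribution (2) treats the deformation as many locally rigid motions whose weights depend on distances to key points. The sketch below shows one way such a blend could look. The Gaussian weighting, the `sigma` parameter, and the function name `blend_local_rigid` are assumptions made for illustration; the abstract does not specify the weighting function, and the adaptive addition of key points is not covered here.

```python
import numpy as np

def blend_local_rigid(points, key_points, rotations, translations, sigma=1.0):
    """Deform points by a distance-weighted blend of per-key-point rigid motions.

    points:       (N, 3) points to deform
    key_points:   (M, 3) key point positions
    rotations:    (M, 3, 3) one rotation per key point
    translations: (M, 3) one translation per key point
    sigma:        width of the assumed Gaussian distance weighting
    """
    # Squared distance from every point to every key point: (N, M)
    d2 = ((points[:, None, :] - key_points[None, :, :]) ** 2).sum(axis=-1)
    # Subtract the row minimum before exponentiating for numerical stability
    w = np.exp(-(d2 - d2.min(axis=1, keepdims=True)) / (2.0 * sigma ** 2))
    w /= w.sum(axis=1, keepdims=True)
    # Apply every rigid motion to every point: (M, N, 3)
    moved = np.einsum('mij,nj->mni', rotations, points) + translations[:, None, :]
    # Blend the M candidate positions of each point with its weights: (N, 3)
    return np.einsum('nm,mni->ni', w, moved)
```

With this weighting, a point close to one key point follows that key point's motion almost exactly, while points between key points receive a smooth mixture.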
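Contribution (4) feeds a learned depth map into the modulation of super-resolution features. The abstract does not state the exact form of this modulation, so the sketch below assumes a common per-pixel scale-and-shift scheme in which both terms are linear functions of depth. The function name `modulate_features` and all parameter names are hypothetical, and plain NumPy arrays stand in for network tensors.

```python
import numpy as np

def modulate_features(features, depth, w_gamma, b_gamma, w_beta, b_beta):
    """Scale and shift feature maps per pixel, conditioned on a depth map.

    features: (C, H, W) feature maps from the super-resolution branch
    depth:    (H, W) depth map predicted from the same face image
    w_gamma, b_gamma, w_beta, b_beta: (C,) per-channel parameters
        (learned in a real network; plain arrays in this sketch)
    """
    d = depth[None, :, :]                                   # (1, H, W)
    gamma = w_gamma[:, None, None] * d + b_gamma[:, None, None]  # per-pixel scale
    beta = w_beta[:, None, None] * d + b_beta[:, None, None]     # per-pixel shift
    # With all parameters at zero the features pass through unchanged
    return features * (1.0 + gamma) + beta
```

Because the scale and shift vary with depth at each pixel, pixels that are adjacent in the image but far apart in depth can be treated differently, which is the kind of 3D information a plain 2D convolution does not see.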
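Contribution (5) augments data by perturbing facial images in a low-dimensional space. As a minimal sketch, the code below assumes that this space is a PCA subspace fitted to flattened, corresponded samples (shape or appearance vectors) and that the perturbation is Gaussian noise scaled by each component's standard deviation. Both choices, and the names `fit_pca` and `perturb_in_subspace`, are assumptions for illustration rather than the dissertation's stated procedure.

```python
import numpy as np

def fit_pca(samples, n_components):
    """Fit a PCA subspace to samples of shape (num_samples, dim)."""
    mean = samples.mean(axis=0)
    _, S, Vt = np.linalg.svd(samples - mean, full_matrices=False)
    basis = Vt[:n_components]                       # (k, dim), orthonormal rows
    # Standard deviation of the data along each principal direction
    std = S[:n_components] / np.sqrt(max(len(samples) - 1, 1))
    return mean, basis, std

def perturb_in_subspace(sample, basis, std, scale, rng):
    """Return a new sample that differs from `sample` only inside the subspace.

    scale controls the perturbation strength relative to each component's std.
    """
    noise = rng.normal(size=std.shape) * std * scale  # noise on the k coefficients
    return sample + basis.T @ noise                   # map the noise back to data space
```

A usage sketch: fit the subspace once on the training set with `fit_pca`, then call `perturb_in_subspace` with a small `scale` and a `numpy.random.default_rng()` generator to produce extra training samples. Keeping the noise inside the subspace changes a sample only along directions of variation seen in the data.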
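|Fan Zhenfeng (范振峰). Research on Dense Registration Algorithms for 3D Faces and Their Applications (三维人脸稠密配准算法及其应用研究) [D]. Institute of Automation, Chinese Academy of Sciences; University of Chinese Academy of Sciences, 2020.|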
|Files in This Item:|
|范振峰毕业论文提交版.pdf (52320KB)||Dissertation||Restricted access||CC BY-NC-SA||Application Full Text|