Meta-Learning-Based Cross-Domain Face Recognition and 3D Face Fitting


	Meta-Learning-Based Cross-Domain Face Recognition and 3D Face Fitting
	郭建珠 (Guo Jianzhu)
	2021-05
Pages	138
Degree type	Doctoral
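Chinese abstract (English translation)	Face recognition and 3D face fitting are core research directions in computer vision and pattern recognition, and they are increasingly widely applied in public security, information security, social interaction, and related areas. On the one hand, after years of development, face recognition, aided by deep learning and large-scale training data, achieves very high accuracy in scenarios whose distribution resembles that of the training data. In real deployments, however, face recognition systems often encounter scenarios whose distribution differs from the training data. These scenarios may be unknown in advance, or only a few samples may be available from them, which makes cross-domain face recognition that does not depend on large sample sets challenging. On the other hand, 3D face fitting is gradually moving from academic research to industrial application, and applications place high demands on the speed, accuracy, and stability of the algorithm, so real-time, accurate, and stable 3D face fitting is a challenging problem in urgent need of study. This thesis studies and analyzes in depth the problems of cross-domain face recognition and of real-time, accurate, and stable 3D face fitting, and proposes meta-learning as an effective approach to both. The core of meta-learning is learning how to learn: by constructing samples at the granularity of "tasks" to guide training, it can improve a model's ability to generalize or extrapolate without hand-designing many regularization terms. This thesis applies the idea of meta-learning to both problems, specifically: (1) constructing cross-domain tasks to help the model learn cross-domain knowledge, improving its generalization to unseen scenarios or its ability to learn quickly in a target scenario; (2) constructing sub-tasks to ease the difficulty of fitting the parametric model, improving the accuracy of 3D face reconstruction while keeping the computational cost unchanged. The thesis is therefore titled meta-learning-based cross-domain face recognition and 3D face fitting.
The main work and contributions are: (1) To address the generalization challenge of face recognition when the target scenario is unknown, this thesis proposes a meta-learning-based face recognition framework. During training, the framework samples meta-train and meta-test domains to simulate distribution shift between scenarios, and uses meta-optimization so that the model improves on both the meta-train and the meta-test domains, which in turn improves generalization to real unseen scenarios. This thesis designs two evaluation protocols for generalized face recognition and validates the framework through comparison with other methods. (2) To address unsupervised domain adaptation with limited samples, this thesis proposes a face recognition framework based on meta-learning and model parameter decoupling, which avoids the over-fitting that existing methods are prone to and achieves fast adaptation. The framework trains the network parameters so that domain-invariant information is stored in the weight parameters as far as possible, while domain-specific knowledge is represented by the statistics of the batch normalization layers. Given a small number of unlabeled samples from the target scenario, only the batch normalization statistics need to be computed to adapt quickly to that scenario and improve recognition performance on it. On the two designed evaluation protocols, the framework outperforms existing baselines and other related work, which validates its effectiveness. (3) To achieve real-time, accurate, and stable 3D face fitting, this thesis proposes a 3D Morphable Model parameter regression framework based on meta-joint optimization that balances speed, accuracy, and video stability. To meet the speed and accuracy requirements together, this thesis proposes, on top of a lightweight network, a meta-joint optimization strategy that regresses the model parameters dynamically. To further improve stability across video frames, this thesis proposes a 3D-aided virtual short-video generation strategy that expands one still image into a virtual video clip and trains on it in combination with meta-joint optimization. While achieving high accuracy and stability, the framework runs at over 50 frames per second on a single CPU core and at over 200 frames per second on a single CPU core after acceleration, and its accuracy exceeds that of state-of-the-art methods that use large networks.
In summary, this thesis studies in depth the problems of cross-domain face recognition and of real-time, accurate, and stable 3D face fitting, and draws on meta-learning to propose several effective methods that improve the cross-domain generalization and adaptability of face recognition, make 3D face fitting more practical, and advance research on meta-learning-based cross-domain face recognition and 3D face fitting.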
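The training scheme in contribution (1), sampling meta-train and meta-test domains and optimizing for both, can be illustrated with a toy sketch. The code below is not the thesis's algorithm: it is a generic MLDG-style update on a single scalar weight, and the quadratic per-domain loss, the names `domain_grad` and `meta_step`, the round-robin choice of the meta-test domain, and the learning rates are all assumptions made for illustration.

```python
def domain_grad(w, center):
    """Gradient of the assumed toy per-domain loss 0.5 * (w - center) ** 2."""
    return w - center


def meta_step(w, centers, test_idx, inner_lr=0.1, outer_lr=0.02):
    """One meta-optimization step with one held-out (meta-test) domain."""
    # All domains except the held-out one act as meta-train domains.
    train = [c for i, c in enumerate(centers) if i != test_idx]

    # Meta-train gradient: average gradient over the meta-train domains.
    g_train = sum(domain_grad(w, c) for c in train) / len(train)

    # Virtual inner update using only the meta-train gradient.
    w_virtual = w - inner_lr * g_train

    # Meta-test gradient evaluated at the virtual weight and propagated back
    # through the inner update. For this toy loss the curvature is 1, so
    # d(w_virtual)/dw = 1 - inner_lr.
    g_test = domain_grad(w_virtual, centers[test_idx]) * (1.0 - inner_lr)

    # The outer update must reduce both the meta-train and meta-test losses.
    return w - outer_lr * (g_train + g_test)


# Three hypothetical domains, each preferring a different weight value.
centers = [-1.0, 0.5, 2.0]
w = 5.0
for step in range(1000):
    # Rotate the held-out domain so each one serves as meta-test in turn.
    w = meta_step(w, centers, test_idx=step % len(centers))

print(w)
```

In this toy setting the weight settles near the mean of the domain centers, that is, at a value that does not favor any single domain. A real face recognition model would replace the scalar and the closed-form gradients with network parameters and automatic differentiation.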
English abstract	Face recognition and 3D face fitting are important research directions in computer vision and pattern recognition and have been widely used in public security, information security, social interaction, and other areas. On the one hand, over the past decade, deep learning and large-scale training data have brought face recognition to high performance in scenarios whose distribution is similar to that of the training data. In real-world applications, however, face recognition systems are often deployed to a target domain whose distribution differs from the training data. Such domains may be unknown in advance, or only a small number of samples may be available from them, which poses a great challenge for cross-domain face recognition that does not rely on large-scale training samples. On the other hand, 3D face fitting is moving from academic research toward industrial application. Real-time, accurate, and stable 3D face fitting is therefore an urgent problem to study. This thesis studies and analyzes cross-domain face recognition and practical 3D face model fitting and argues that meta-learning is a compelling way to solve them. The core of meta-learning is learning to learn, and it can improve the generalization ability of a model by constructing and meta-optimizing sub-tasks. In this thesis, we adopt meta-learning to help (i) the face recognition model learn cross-domain knowledge, improving its generalization to unseen domains or its ability to adapt quickly to target domains, and (ii) the network learn to regress model parameters better, improving the accuracy of 3D face fitting while keeping the computational complexity unchanged. The thesis is therefore titled "Meta-Learning-Based Cross-Domain Face Recognition and 3D Face Fitting".
The main contributions are: (i) To address the challenge of face recognition on unseen domains, this thesis proposes a meta-learning-based face recognition framework named meta face recognition. The framework synthesizes source/target domain shift during training, and its meta-optimization objective requires the model to learn effective representations on both the synthesized source domains and the synthesized target domains. As a result, the model generalizes better to unseen domains. In addition, this thesis proposes two benchmarks for evaluating generalized face recognition. Experiments on these benchmarks show that the proposed framework generalizes better than several baselines and other state-of-the-art methods. (ii) To address domain adaptation for face recognition when only limited target-domain samples are available, this thesis proposes a framework based on meta-learning and model parameter decoupling that avoids over-fitting and adapts quickly. The framework trains the network so that domain-invariant information tends to be stored in the weight parameters, while domain-specific knowledge tends to be represented by the batch normalization (BN) statistics. With the learned weights, adaptation is very fast because only the BN statistics need to be updated on the limited target data. Extensive experiments on two designed benchmarks validate the efficacy of the proposed framework. (iii) To achieve fast, accurate, and stable 3D face fitting, this thesis proposes a regression framework based on meta-joint optimization that balances speed, accuracy, and stability. First, on top of a lightweight backbone, this thesis proposes a meta-joint optimization strategy that dynamically regresses a small set of 3D Morphable Model (3DMM) parameters, which improves speed and accuracy at the same time.
Second, to further improve stability on videos, this thesis presents a 3D-aided virtual synthesis method that transforms one still image into a short synthetic video, and trains the network on the generated videos. While maintaining high accuracy and stability, the model runs at over 50 fps on a single CPU core, and at over 200 fps after inference acceleration, and it outperforms state-of-the-art methods that use heavy models.
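The adaptation step in contribution (ii), where only the BN statistics are recomputed from a few unlabeled target samples while the learned weights stay fixed, can be sketched as follows. This is a minimal single-layer illustration in plain Python, not the thesis's implementation: the class `BatchNorm1D`, its method `adapt_statistics`, and the sample values are hypothetical, and a real network would recompute the statistics of every BN layer on its own inputs.

```python
import math


class BatchNorm1D:
    """Per-feature normalization with fixed affine weights and replaceable statistics."""

    def __init__(self, gamma, beta, mean, var, eps=1e-5):
        self.gamma = list(gamma)  # learned scale, kept fixed during adaptation
        self.beta = list(beta)    # learned shift, kept fixed during adaptation
        self.mean = list(mean)    # domain-specific statistic
        self.var = list(var)      # domain-specific statistic
        self.eps = eps

    def forward(self, x):
        """Normalize one feature vector with the currently stored statistics."""
        return [
            g * (v - m) / math.sqrt(s + self.eps) + b
            for v, g, b, m, s in zip(x, self.gamma, self.beta, self.mean, self.var)
        ]

    def adapt_statistics(self, samples):
        """Replace mean and variance with those of unlabeled target samples."""
        n = len(samples)
        dims = len(self.mean)
        self.mean = [sum(s[d] for s in samples) / n for d in range(dims)]
        self.var = [
            sum((s[d] - self.mean[d]) ** 2 for s in samples) / n
            for d in range(dims)
        ]


def column_means(rows):
    """Mean of each feature over a list of feature vectors."""
    n = len(rows)
    return [sum(r[d] for r in rows) / n for d in range(len(rows[0]))]


# Statistics assumed to come from the source domain (zero mean, unit variance).
bn = BatchNorm1D(gamma=[1.0, 1.0], beta=[0.0, 0.0], mean=[0.0, 0.0], var=[1.0, 1.0])

# A handful of unlabeled target-domain features with a clear distribution shift.
target_samples = [[4.0, -2.0], [6.0, -4.0], [5.0, -3.0], [5.0, -3.0]]

before = column_means([bn.forward(x) for x in target_samples])
bn.adapt_statistics(target_samples)  # no labels and no gradient steps needed
after = column_means([bn.forward(x) for x in target_samples])

print(before, after)
```

With the source statistics, the normalized target features remain far from zero mean. After the statistics are replaced, they are centered again while `gamma` and `beta` are untouched, which is why this kind of adaptation needs neither labels nor gradient updates.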
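Keywords	Face recognition; 3D face model; meta-learning; domain adaptation; domain generalization
Language	Chinese
Seven major directions: sub-direction classification	Biometric recognition
Document type	Thesis
Item identifier	http://ir.ia.ac.cn/handle/173211/44768
Collection	Graduates_Doctoral theses
Recommended citation (GB/T 7714)	郭建珠 (Guo Jianzhu). Meta-Learning-Based Cross-Domain Face Recognition and 3D Face Fitting [D]. Institute of Automation, Chinese Academy of Sciences. University of Chinese Academy of Sciences, 2021.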

Files in this item
File name/size	Document type	Version type	Access type	License
基于元学习的跨域人脸识别与三维人脸拟合_ (5361 KB)	Thesis		Restricted access	CC BY-NC-SA