Research on Face Image Enhancement Methods Based on Generative Adversarial Networks


	Research on Face Image Enhancement Methods Based on Generative Adversarial Networks
	窦昊 (Dou Hao)
	2021-05-28
Pages	120
Degree type	Doctoral
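Abstract (translated from Chinese)	In recent years, intelligent public security systems built on big data have developed and matured rapidly. Face images captured by various imaging devices play an important role in criminal investigation and security scenarios, where they can serve as key clues and evidence that provide auxiliary guidance for police casework. This thesis studies face image enhancement methods. It aims to apply purposeful enhancement transformations to the heterogeneous and low-quality face images that arise in criminal investigation and security scenarios, improving their visual quality to the human eye and their recognition and matching performance, and thereby increasing their value in investigative work. Face image enhancement can be regarded as a typical conditional image generation task. At present, research on face image enhancement based on generative models still faces limitations and challenges, such as model stability in small-sample and unsupervised settings, the trade-off between visual quality and noise distortion in face image restoration, and the smooth transformation of face attributes. Taking the generative adversarial network (GAN), a deep generative model, as its main foundation, this thesis investigates the typical enhancement problems and challenges facing face images in criminal investigation and security scenarios, incorporates prior knowledge of the face task itself, improves and extends adversarial generation theory and methods, and achieves good results on face image enhancement tasks. The main contents and contributions are summarized as follows:
1. For near-infrared face image enhancement, with the goal of improving the image translation ability of generative models in the unsupervised, small-sample setting, this thesis studies the asymmetry that arises when GANs are trained on unpaired data and proposes the Asymmetric CycleGAN model to mitigate the over-fitting and instability caused by asymmetric training. The method defines a metric for the asymmetry of an unpaired image translation task and uses it to improve the classic unpaired image translation model CycleGAN: the metric guides the design of the network complexity of each generator, yielding a reasonable asymmetric CycleGAN structure and improving unpaired training on asymmetric translation tasks. Building on Asymmetric CycleGAN, the thesis further introduces a fine-grained edge prior constraint for faces, which strengthens detail preservation in the generated face images and improves both the visual quality and the face recognition performance of the enhanced images. The model effectively converts near-infrared face images to visible-light color face images and has been tested and validated on relevant datasets and on real-scene data.
2. For low-quality face image enhancement, with the goal of balancing perceptual quality against distortion when a generative model restores images, this thesis takes super-resolution as its entry point and designs for GANs a learning method based on incremental projection discrimination in the principal component space (PCAGAN), which can reconstruct the complex distribution of face information simply and effectively. Targeting face data, the method introduces the projection space of principal component analysis (PCA): projecting into the PCA space unfolds the face information hierarchically, and adversarial learning then discriminates the projected components incrementally, guiding the generator to produce face information from coarse to fine. This yields a smooth, progressive GAN training procedure and achieves face reconstruction that combines low distortion with high perceptual quality. Comparisons of quantitative metrics and visual results on relevant datasets demonstrate the effectiveness of the method for face super-resolution, and enhancement experiments on low-quality face samples from real scenes verify its practicality for low-quality face image enhancement.
3. For cross-age face image transformation, with the goal of improving the fine-grained generation quality of generative models under attribute transformation, this thesis proposes the FA-FlowGAN model. By means of invertible normalizing flows, it improves the representation and mapping ability of GANs in the latent variable space and achieves interpretable and controllable generation of faces at a specified age. The method maps the age attribute space with a simple prior probability distribution and embeds the attribute variables into a generator with an auto-encoder structure. Combining the flow model's ability to model the attribute distribution with exact probabilities, the method uses an adversarial loss and an age classification loss to generate face images at a specified age. While preserving face identity, it ensures that sampled face images change smoothly and gradually with the age latent variable, achieving high-quality cross-age face image translation. Experimental results on relevant datasets demonstrate the effectiveness of the method.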
English abstract	In recent years, intelligent public security systems have been developed and improved rapidly. Face images collected by various imaging devices play an important role in criminal investigation and security scenarios, and can be used as key clues and evidence to provide auxiliary guidance for police casework. This thesis studies face image enhancement. It aims to purposefully enhance and transform the heterogeneous and low-quality face images that arise in criminal investigation and security scenarios, improving their visual quality and recognition performance and increasing their practical value in real scenes. Face image enhancement can be regarded as a typical conditional image generation task. At present, research on face image enhancement based on generative models still faces limitations and challenges, such as model stability under small-sample and unsupervised conditions, the trade-off between visual quality and noise distortion in face restoration, and the smooth transformation of face attributes. Building on the Generative Adversarial Network (GAN), a type of deep generative model, this thesis studies the typical problems and challenges of face image enhancement in criminal investigation scenarios. Drawing on prior knowledge of the face task, we improve and extend the theory and methods of adversarial generation and achieve better performance on face image enhancement tasks. The main contents and contributions of this thesis are as follows:
1. To improve the image translation ability of generative models with small, unsupervised samples, and targeting near-infrared face image enhancement, this thesis studies the asymmetry of GAN models in the unpaired image translation task and proposes an Asymmetric CycleGAN model to overcome the over-fitting and instability caused by asymmetric training. The method provides a quantitative index that measures the asymmetry of an unpaired image translation task. This index is used to improve the unpaired image translation model CycleGAN: it guides the network design of the generators and leads to a reasonable Asymmetric CycleGAN structure, which improves the unpaired training performance of the model on asymmetric translation tasks. Furthermore, an edge prior constraint is introduced to strengthen detail preservation in the generated face images. It improves visual quality and recognition performance, and the model effectively translates near-infrared face images to visible-light color face images. The model has been tested and verified on relevant datasets and on real-scene data.
2. To obtain a better trade-off between perception and distortion when a generative model restores images, and targeting low-quality face image enhancement, this thesis takes super-resolution as its entry point and proposes a novel cumulative discrimination and reconstruction approach, referred to as PCAGAN, which performs incremental orthogonal projection discrimination in the PCA orthogonal subspace. A hierarchical expansion of face information is obtained by projecting the face into the PCA subspace, and the incremental projection components are then judged by the discriminator, guiding the generator to produce facial information from coarse to fine. This method achieves a smooth training procedure for the GAN and face image reconstruction that combines high perceptual quality with low distortion. Comparisons of quantitative metrics and visual results on relevant datasets demonstrate the effectiveness of the model for face super-resolution, and enhancement experiments on low-quality face samples from real scenes verify its practicality for low-quality face image enhancement.
3. To improve the performance of generative models on attribute transformation tasks, this thesis proposes the FA-FlowGAN model for cross-age face image transformation. Using invertible normalizing flows, the model improves the representation and mapping ability of the GAN in the latent variable space and achieves interpretable and controllable generation of faces at a specified age. In this method, the age attribute space is mapped by a simple, easy-to-sample prior probability distribution, and the attribute variables are embedded into a generator with an auto-encoder structure. Drawing on the flow model's ability to model the attribute distribution with exact probabilities, the method generates face images at a specified age using an adversarial loss and an age classification loss. While maintaining face identity, it ensures smooth sampling of face images across specified ages and achieves high-quality cross-age face image translation. Experiments on relevant datasets show the effectiveness of the method.
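Keywords	face image enhancement; generative adversarial network; near-infrared face image enhancement; face super-resolution; cross-age face transformation
Language	Chinese
Seven major directions: sub-direction classification	Machine learning
Document type	Thesis
Item identifier	http://ir.ia.ac.cn/handle/173211/44442
Collection	Research Center for Intelligent Manufacturing Technology and Systems_Multidimensional Data Analysis (Peng Silong) Technical Team
Recommended citation (GB/T 7714)	窦昊 (Dou Hao). Research on Face Image Enhancement Methods Based on Generative Adversarial Networks[D]. University of Chinese Academy of Sciences. University of Chinese Academy of Sciences, 2021.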

Files in this item
File name/size	Document type	Version type	Access type	License
Thesis.pdf (13819 KB)	Thesis		Open access	CC BY-NC-SA
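
The abstract describes PCAGAN as projecting faces into a PCA subspace so that information unfolds from coarse to fine. The sketch below illustrates only that projection step, under stated assumptions: it uses random toy vectors rather than face images, it contains no generator or discriminator, and the function names `pca_basis` and `cumulative_projections` are hypothetical. It is not the thesis implementation.

```python
import numpy as np


def pca_basis(X):
    """Return the data mean and the principal directions (rows of vt)."""
    mean = X.mean(axis=0)
    # SVD of the centered data: rows of vt are orthonormal principal
    # directions, ordered from largest to smallest explained variance.
    _, _, vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, vt


def cumulative_projections(x, mean, vt, ks):
    """Reconstruct x from its first k principal components, for each k in ks."""
    coeffs = vt @ (x - mean)  # coordinates of x in the PCA basis
    return [mean + vt[:k].T @ coeffs[:k] for k in ks]


# Toy data (assumption): 200 samples of dimension 64 with decaying variance.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 64)) * np.linspace(3.0, 0.1, 64)

mean, vt = pca_basis(X)
ks = [1, 4, 16, 64]  # coarse to fine: number of components kept
recons = cumulative_projections(X[0], mean, vt, ks)
errors = [float(np.linalg.norm(X[0] - r)) for r in recons]
print(errors)
```

Because the principal directions are orthonormal, the reconstruction error cannot increase as more components are kept, and it vanishes (up to floating-point error) once all 64 are used. An incremental discriminator in the spirit of the abstract would judge these partial reconstructions stage by stage; that part is not sketched here.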