Research on Face Image Enhancement Methods Based on Generative Adversarial Networks | |
窦昊 | |
2021-05-28 | |
Pages | 120 |
Degree type | Doctoral |
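Chinese abstract (translated) | In recent years, big-data-driven smart public security systems have developed and matured rapidly. Face images captured by various imaging devices play an important role in criminal investigation and security scenarios, where they can serve as key clues and evidence that provide auxiliary guidance for police casework. This thesis studies face image enhancement methods. It aims to apply purposeful enhancement transformations to the heterogeneous and low-quality face images that arise in criminal investigation and security scenarios, improving their visual quality to the human eye and their recognition and matching performance, and thereby raising their practical value in investigations. Face image enhancement can be viewed as a typical conditional image generation task. At present, research on face image enhancement based on generative models still faces limitations and challenges, such as model stability in small-sample and unsupervised settings, the trade-off between visual quality and noise distortion during face image restoration, and the smooth transformation of face attributes. Taking the generative adversarial network (GAN), a deep generative model, as its main foundation, this thesis investigates in depth the typical enhancement problems and challenges facing face images in criminal investigation and security scenarios, incorporates prior knowledge of the face task itself, improves and extends adversarial generation theory and methods, and achieves good enhancement results on face image enhancement tasks. The main contents and contributions of this thesis are summarized as follows: 1. For near-infrared face image enhancement, with the goal of improving the image translation ability of generative models in unsupervised small-sample settings, this thesis studies the asymmetry phenomenon of GANs in unpaired training tasks and proposes the Asymmetric CycleGAN model to mitigate the overfitting and instability caused by asymmetric training. The method defines a metric for the asymmetry of an unpaired image translation task and uses it to improve the classic unpaired image translation model CycleGAN: the metric guides the network complexity design of the generators so that a reasonable asymmetric CycleGAN structure is built, improving unpaired training on asymmetric translation tasks. Furthermore, on top of Asymmetric CycleGAN, the thesis introduces a fine-grained edge prior constraint for faces, which strengthens detail preservation in the generated face images, improves the visual quality and face recognition performance of the enhanced images, and effectively converts near-infrared faces into visible-light color face images; the approach was tested and validated on relevant datasets and real-scene data. 2. For low-quality face image enhancement, with the goal of balancing perceptual quality and distortion when generative models restore images, this thesis takes super-resolution enhancement as its entry point and designs for GANs a learning method based on incremental projection discrimination in the principal component space (PCAGAN), which can reconstruct the complex distribution of face information simply and effectively. Targeting face data, the method introduces the projection space of principal component analysis (PCA): projecting into the PCA space yields a hierarchical expansion of face information, and the projected components are then discriminated incrementally through adversarial learning, guiding the generator to produce face information from coarse to fine. The method realizes a smooth, progressive GAN training scheme and achieves face reconstruction that balances low distortion with high perceptual quality. Comparisons of quantitative metrics and visual results on relevant datasets demonstrate its effectiveness for face super-resolution, and enhancement experiments on low-quality face samples from real scenes validate its practicality for low-quality face image enhancement.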
|
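The coarse-to-fine expansion that the abstract attributes to projection into the PCA space can be illustrated with a minimal sketch. It is not the thesis's PCAGAN implementation: the discriminator and generator are omitted, and the names `cumulative_projections`, `BASIS`, `MEAN`, and `FACE` are hypothetical. The basis is a hand-picked orthonormal set standing in for principal components that would normally be learned from face data.

```python
from typing import List, Sequence


def dot(a: Sequence[float], b: Sequence[float]) -> float:
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def cumulative_projections(
    x: Sequence[float],
    mean: Sequence[float],
    basis: Sequence[Sequence[float]],
) -> List[List[float]]:
    """Reconstruct x from the first 1, 2, ..., K orthonormal components.

    Stage k adds the k-th projection component to the previous stage,
    so later stages carry progressively finer detail.
    """
    centered = [xi - mi for xi, mi in zip(x, mean)]
    recon = list(mean)
    stages = []
    for component in basis:
        coeff = dot(centered, component)  # projection coefficient
        recon = [r + coeff * c for r, c in zip(recon, component)]
        stages.append(list(recon))
    return stages


def squared_error(a: Sequence[float], b: Sequence[float]) -> float:
    """Sum of squared differences between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


# Assumption: a toy 4-pixel "face" and a hand-picked orthonormal basis
# used in place of real PCA components.
BASIS = [
    [0.5, 0.5, 0.5, 0.5],
    [0.5, 0.5, -0.5, -0.5],
    [0.5, -0.5, 0.5, -0.5],
    [0.5, -0.5, -0.5, 0.5],
]
MEAN = [0.0, 0.0, 0.0, 0.0]
FACE = [4.0, 2.0, 1.0, 1.0]

STAGES = cumulative_projections(FACE, MEAN, BASIS)
ERRORS = [squared_error(FACE, stage) for stage in STAGES]
```

In this toy setting the reconstruction error shrinks at every stage and reaches zero once all components are used. In an adversarial setup along the lines the abstract describes, each cumulative stage would presumably be what the discriminator inspects, but that part is outside this sketch.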
English abstract | In recent years, intelligent public security systems have been developed and improved rapidly. Face images collected by various imaging devices play an important role in criminal investigation and security scenarios, where they can serve as key clues and evidence that provide auxiliary guidance for police casework. This thesis studies face image enhancement, aiming to purposefully enhance and transform the heterogeneous and low-quality face images found in criminal investigation and security scenarios, to improve their visual quality and recognition performance, and to raise their application value in real scenes. Face image enhancement can be regarded as a typical conditional image generation task. At present, research on face image enhancement based on generative models still faces limitations and challenges, such as model stability under small-sample and unsupervised conditions, the trade-off between visual quality and noise distortion in face restoration, and the smooth transformation of face attributes. Building on the generative adversarial network (GAN), a type of deep generative model, this thesis studies the typical problems and challenges of face image enhancement in criminal investigation scenarios. Drawing on prior knowledge of the face task, we improve and develop the theories and methods of adversarial generation and achieve better performance on face image enhancement tasks. The main contents and contributions of this thesis are as follows: 1. To improve the image translation ability of generative models with unsupervised small samples, and targeting near-infrared face image enhancement, this thesis studies the asymmetry of GAN models in the unpaired image translation task and proposes an Asymmetric CycleGAN model to overcome the overfitting and instability caused by asymmetric training.
The method provides a quantitative index that measures the asymmetry of an unpaired image translation task. This index is used to improve the unpaired image translation model CycleGAN: it guides the network design of the generators and the construction of a reasonable Asymmetric CycleGAN structure, which improves the unpaired training performance of the model. 2. To obtain a better trade-off between perception and distortion for generative models in image restoration, and targeting low-quality face image enhancement, this thesis takes super-resolution enhancement as its entry point and proposes a novel cumulative discrimination and reconstruction approach, referred to as PCAGAN, which focuses on incremental orthogonal projection discrimination in the PCA orthogonal subspace. A hierarchical expansion of face information is obtained by projecting the face into the PCA subspace, and the incremental projection components are then judged by the discriminator, guiding the generator to produce facial information from coarse to fine. This method achieves a smooth training procedure for the generative adversarial network and a better trade-off between low distortion and high perceptual quality in face reconstruction. 3. To improve the performance of generative models on the attribute transformation task, this thesis proposes the FA-FlowGAN model for cross-age face image transformation. With the help of the reversible normalizing flow method, the model improves the representation and mapping ability of the generative adversarial network in latent variable space and achieves interpretable and controllable generation for age-specified imaging. In this method, the age attribute space is mapped onto a simple, easy-to-sample prior probability distribution, and the attribute variables are embedded into a generator with an auto-encoder structure.
Combined with the flow model's ability to model the attribute distribution with exact probabilities, this method generates face images of a specified age using an adversarial loss and an age classification loss. While preserving face identity, it ensures smooth sampling of face images at specified ages and achieves high-quality cross-age face image translation. Experiments on relevant datasets show the effectiveness of the method. |
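The two flow properties the abstract relies on, reversibility and exact probability modeling under a simple prior, can be seen in the smallest possible normalizing flow. The sketch below is not FA-FlowGAN: it is a one-dimensional affine flow with a standard normal prior, and the class name `AffineFlow1D` and the values 40.0 and 15.0 are assumptions chosen only for illustration.

```python
import math


class AffineFlow1D:
    """Invertible map x -> z = (x - shift) / scale with a standard normal prior on z."""

    def __init__(self, shift: float, scale: float) -> None:
        if scale <= 0:
            raise ValueError("scale must be positive so the map is invertible")
        self.shift = shift
        self.scale = scale

    def forward(self, x: float) -> float:
        """Map an attribute value to the latent (prior) space."""
        return (x - self.shift) / self.scale

    def inverse(self, z: float) -> float:
        """Map a latent value back to the attribute space."""
        return z * self.scale + self.shift

    def log_prob(self, x: float) -> float:
        """Exact log-density of x by the change-of-variables formula."""
        z = self.forward(x)
        log_prior = -0.5 * (z * z + math.log(2.0 * math.pi))
        log_det = -math.log(self.scale)  # log |dz/dx|
        return log_prior + log_det


# Assumption: hypothetical shift and scale for an age-like attribute.
flow = AffineFlow1D(shift=40.0, scale=15.0)
latent = flow.forward(55.0)
recovered = flow.inverse(latent)
```

Sampling from such a model amounts to drawing `z` from the prior and applying `inverse`, which is what makes a simple, easy-to-sample prior convenient. Practical flows stack many learned invertible layers instead of one fixed affine map.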
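Keywords | face image enhancement; generative adversarial network; near-infrared face image enhancement; face super-resolution; cross-age face transformation |
Language | Chinese |
Seven major research directions: sub-direction classification | Machine learning |
Document type | Thesis |
Item identifier | http://ir.ia.ac.cn/handle/173211/44442 |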
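Collection | Research Center for Intelligent Manufacturing Technology and Systems_Multidimensional Data Analysis (彭思龙) - Technical Team |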
Recommended citation (GB/T 7714) | 窦昊. 基于生成对抗网络的人脸图像增强方法研究[D]. 中国科学院大学. 中国科学院大学,2021. |
Files in this item | ||||||
File name/size | Document type | Version type | Access type | License | ||
Thesis.pdf (13819KB) | Thesis | | Open access | CC BY-NC-SA |