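Research on Automatic Facial Age Estimation Based on Deep Learning (基于深度学习的自动人脸年龄估计研究)
Author: 李凯 (Li Kai)
Degree type: Doctor of Engineering
Supervisors: 胡卫明 (Hu Weiming); 兴军亮 (Xing Junliang)
Date: 2018-05-25
Degree-granting institution: University of Chinese Academy of Sciences
Place of conferral: Beijing
Keywords: facial age estimation; deep learning; multi-task learning; imbalanced learning; cost-sensitive learning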
Abstract

As an important facial attribute, age has broad application prospects in human-computer interaction, intelligent business, security surveillance, and entertainment. Automatic human age estimation, an important biometric recognition technology, has become an active research topic in pattern recognition and computer vision. The task is to automatically estimate a person's true age from an input face image using computer vision and related techniques. Although many researchers have worked hard on this problem, it is far from solved and still faces several serious challenges.

First, human aging is a continuous and slow process, so faces of similar ages differ only slightly in appearance, which makes it difficult to hand-design discriminative aging features that capture these subtle differences. Second, collecting large numbers of age-labeled face images is expensive and time-consuming, so most public age estimation datasets suffer from small sample sizes and imbalanced age distributions, which greatly increases the difficulty of training age estimation algorithms. Third, different populations, that is, people of different genders and races, have different aging patterns, which further complicates age estimation.

Focusing on the difficulties above, this thesis conducts a series of original studies based on deep learning and proposes several effective deep age estimation algorithms. The main contributions are summarized as follows:

(1) We propose a hybrid multi-task deep age estimation model. Traditional age estimation algorithms generally consist of two steps: extracting hand-designed aging features, and then training an age estimation model on the extracted features. Because the two steps are independent of each other, the performance of the model depends heavily on the quality of the extracted features. In recent years, deep learning has achieved breakthroughs in the mainstream computer vision tasks thanks to its ability to learn features and classifiers end to end. To overcome the difficulty of hand-designing aging features, we present the first systematic analysis of how best to apply deep learning to the age estimation problem. Specifically, starting from a simple baseline network architecture, we analyze step by step three formulations of the age estimation problem, five loss functions, and three multi-task network architectures. The experimental results show that the proposed hybrid multi-task deep age estimation model performs best among these variants and achieved the best results at the time on two large public age estimation datasets.

(2) We propose a deep age estimation model based on cumulative and comparative supervision signals. The model is designed to alleviate the imbalanced age distributions and small sample sizes of age estimation datasets. First, we design a cumulative hidden layer and a cumulative supervision signal. Even when an age has few samples, the cumulative signal lets the network learn implicitly from samples of neighboring ages, which greatly alleviates the imbalance problem. Next, we design a comparative ranking layer and a comparative supervision signal that help the network learn more discriminative aging features, further improving estimation accuracy. The comparative supervision signal is defined on sample pairs. Because the same sample can appear in many different pairs, the network can make fuller use of the training data, which alleviates the small-sample problem to some extent. We validate the effectiveness of the model on two large public age estimation datasets.
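The abstract does not define the two supervision signals, so the following is only a minimal sketch of one plausible reading, not the thesis's actual formulation. It assumes the cumulative signal is a binary vector whose k-th entry indicates whether the age exceeds k, and that the comparative signal labels each sample pair by which face is older. The function names `cumulative_target`, `decode_age`, and `comparative_pairs` are hypothetical.

```python
from itertools import combinations


def cumulative_target(age, max_age):
    """Assumed cumulative encoding: entry k (k = 0..max_age-1) is 1 if age > k."""
    return [1 if age > k else 0 for k in range(max_age)]


def decode_age(cumulative_scores, threshold=0.5):
    """Recover an age estimate by counting the entries above the threshold."""
    return sum(1 for s in cumulative_scores if s > threshold)


def comparative_pairs(ages):
    """Yield (i, j, label) for every unordered sample pair.

    Assumed labeling: +1 if sample i is older, -1 if younger, 0 if same age.
    """
    for i, j in combinations(range(len(ages)), 2):
        diff = ages[i] - ages[j]
        yield i, j, (diff > 0) - (diff < 0)
```

Under this encoding, the targets for two adjacent ages differ in a single position, so a sample of one age also supervises most of the outputs relevant to its neighbors. Likewise, n samples give n(n-1)/2 pairs, which is the sense in which pairwise supervision reuses each sample many times.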

(3) We propose a deep cross-population age estimation model based on cost-sensitive and order-preserving feature learning. To remove the influence of gender and race on age estimation, a common practice is to train a separate age estimation model for each population, but collecting sufficient training data for every population is very difficult. In practice, some populations are likely to have ample samples while others have few. Rather than labeling more data, it is preferable to exploit the large existing training set of one (source) population to improve age estimation performance on another (target) population for which only a small training set is available. We design a deep cross-population age estimation model for this purpose. The model is trained in two stages. First, a cost-sensitive multi-task loss is used to learn transferable low-level aging features on the source-population training set. Second, order-preserving feature alignment maps the high-level aging features of the source and target populations into a unified aging feature space. After these two stages, the network transfers the knowledge learned from the source population to the target population, yielding a deep age estimation model that performs well on both. We validate the effectiveness of the deep cross-population age estimation model on two large public age estimation datasets.
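The abstract does not give the form of the cost-sensitive loss. As a hedged illustration of the general idea only, the sketch below uses a standard cost-sensitive construction: the expected absolute age error under the predicted age distribution, so that mistakes far from the true age cost more than near misses. The cost function |k - true_age| and the names `softmax` and `cost_sensitive_loss` are assumptions, and the multi-task and order-preserving alignment parts of the model are not shown.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of per-age logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cost_sensitive_loss(logits, true_age):
    """Expected cost of a prediction, with assumed cost |k - true_age| for age k."""
    probs = softmax(logits)
    return sum(p * abs(k - true_age) for k, p in enumerate(probs))
```

With this cost, a prediction concentrated one year away from the true age is penalized less than one concentrated two years away, whereas a plain cross-entropy loss would treat both errors alike.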

In summary, this thesis addresses several major difficulties in human age estimation from different perspectives. The proposed algorithms substantially improve age estimation performance and achieved the best results at the time on multiple public human age estimation datasets. In addition, the proposed algorithms have been applied in practice at Huawei Technologies Co., Ltd., where they have yielded some economic benefit.

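Document type: Thesis
Identifier: http://ir.ia.ac.cn/handle/173211/20952
Collection: Graduates_Doctoral dissertations
Author affiliation: Institute of Automation, Chinese Academy of Sciences
Recommended citation (GB/T 7714):
李凯. 基于深度学习的自动人脸年龄估计研究[D]. 北京. 中国科学院大学,2018.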
Files in this item:
File name/size: Thesis签名版.pdf (9812KB)
Document type: Thesis
Access: Not yet open
License: CC BY-NC-SA