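Research on Key Problems in Cross-Scene Large-Scale Face Recognition (跨场景大规模人脸识别关键问题研究)
Author: 刘浩 (Liu Hao)
Date: 2021-05
Pages: 116
Degree type: Doctoral
Chinese abstract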

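As an important biometric technology for identity authentication, face recognition is widely used in access control, attendance tracking, customs clearance, finance, social security, and many other fields. From a research perspective, face recognition is a classic pattern recognition problem with a long research history and holds an important place in representation learning. From an application perspective, it is widely used in daily life and is a typical example of how advances in artificial intelligence bring convenience to everyday life. Research on face recognition therefore has significant theoretical and practical value.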

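With the advent of deep learning, face recognition has advanced considerably thanks to better network architectures, larger training sets, and improved optimization methods. However, the face recognition datasets publicly available to academia cover a narrow range of scenes, consisting almost entirely of face images of celebrities collected from the web, and they are relatively small, so a large gap remains between them and the multi-scene, large-scale data found in real applications. Improving face recognition performance on cross-scene large-scale face data is a key problem that urgently needs to be solved. This thesis focuses on cross-scene large-scale face recognition and investigates in depth how to handle large-scale data, unbalanced data, and forgetting in cross-scene and multi-scene settings. It addresses efficient training of existing algorithms on ultra-large-scale data, effective training on large-scale unbalanced data, and prevention of catastrophic forgetting on cross-scene and multi-scene data, thereby extending and improving face recognition algorithms for cross-scene large-scale data. The main contributions of the thesis are as follows: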

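  • An effective training framework for the large-scale ID vs. spot scenario. For the large-scale bisample data that arise in real ID vs. spot scenarios, this thesis proposes a three-stage training framework that transfers the model to the ID vs. spot scenario step by step through classification, verification, and classification stages and substantially improves recognition performance in that scenario. It achieved the best performance reported at the time of publication on a real ID vs. spot test set, a public ID vs. spot test set, and a simulated large-scale bisample test set.
  • A loss function for training with a large number of classes, DP-Softmax. Earlier classification schemes cannot handle classification over an ultra-large number of classes because of limited computational resources. This thesis completes the classification training of each iteration by selecting the classes most similar to the current samples, which makes classification training over a large number of classes feasible with limited computational resources, and it achieves a clear performance improvement on large-scale ID vs. spot data.
  • A training method for large-scale unbalanced face data, AdaptiveFace. Existing face recognition methods cannot handle unbalanced data effectively. This thesis proposes an adaptive margin loss function on the loss side and adaptive data sampling and class sampling methods on the sampling side to reduce the effect of large-scale unbalanced face data on training. It achieved the best performance reported at the time of publication on each academic test set and also improved training efficiency.
  • A cross-scene face recognition method that prevents forgetting. To address both performance improvement and catastrophic forgetting in real cross-scene face recognition tasks, this thesis proposes a method that achieves high performance across consecutive scene transfers while retaining performance on earlier scenes. With only a small number of source-domain exemplars retained, the model's source-domain performance hardly degrades after transfer, and over a sequence of consecutive scene transfers it achieved the best performance reported at the time of publication on both the target and source domains.
  • A cross-scene anti-forgetting recognition method that retains no exemplars. For cross-scene and multi-scene transfer, this thesis proposes a method that needs no retained source-domain exemplars and lets the model improve target-domain performance quickly while preserving source-domain performance as far as possible. It maintains source-domain performance using the model's own information and sample feature information. Over a sequence of consecutive scene transfers with no source-domain exemplars retained, it clearly outperforms previous methods on both the source and target domains.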
English abstract

Face recognition, an important biometric technology for identity authentication, has been widely used in many fields such as access control, attendance, customs clearance, finance, and social security. From the research perspective, face recognition has a long history as a classical pattern recognition problem and holds an important position in the field of representation learning. From the application perspective, face recognition is widely used in daily life and is a typical example of artificial intelligence bringing convenience to everyday life. Research on face recognition is therefore of great theoretical significance and practical value.

With the arrival of deep learning, face recognition has made great progress through the development of network architectures, the growth of training data, and the improvement of optimization methods. However, the face recognition datasets publicly available in academia cover a narrow range of scenes, consisting mostly of face images of celebrities, and their scale is small, which leaves a large gap compared with the large-scale, multi-scene data found in practical applications. How to improve face recognition performance on large-scale cross-scene data is a key problem that urgently needs to be solved. This thesis focuses on cross-scene large-scale face recognition and conducts an in-depth investigation into coping with large-scale data, unbalanced data, and forgetting in cross-scene and multi-scene settings. It addresses the efficient training of existing algorithms on very large-scale data, effective training on large-scale unbalanced data, and the prevention of catastrophic forgetting on cross-scene and multi-scene data, thereby extending and improving face recognition algorithms for cross-scene large-scale data. The main contributions of this thesis are as follows:

  • Proposing an effective training framework for the large-scale ID vs. spot scenario. For the large-scale bisample data in the ID vs. spot scenario, this thesis proposes a three-stage training framework that gradually transfers the model to the ID vs. spot scenario through classification, verification, and classification stages and significantly improves recognition performance. It achieved state-of-the-art performance at the time of publication on the real ID vs. spot test set, the public ID vs. spot test set, and the simulated large-scale bisample test set.
  • Presenting DP-Softmax, a loss function for classification over a large number of classes. Previous classification methods cannot cope with classification over a very large number of classes because of limited computational resources. This thesis proposes a new loss function that completes the classification training of each iteration by selecting the classes most similar to the current samples. It enables classification training over a large number of classes with limited computational resources and achieves a significant performance improvement on large-scale ID vs. spot data.
  • Introducing AdaptiveFace, a training method for large-scale unbalanced data. Existing face recognition methods cannot cope effectively with unbalanced data. This thesis proposes an adaptive margin softmax loss function on the loss side and adaptive data sampling and class sampling methods on the sampling side to handle training on large-scale unbalanced face data. It achieved state-of-the-art performance at the time of publication on the common academic test sets and also improved training efficiency.
  • Proposing a cross-scene face recognition method that prevents forgetting. To address both performance improvement and the prevention of catastrophic forgetting in real cross-scene face recognition tasks, this thesis proposes a method that handles consecutive cross-scene transfers while preserving performance on the previous scenes. With only a small number of source-domain samples retained, the model's performance in the source domain hardly degrades after transfer, and the method achieved state-of-the-art performance at the time of publication in both the target and source domains.
  • Introducing a cross-scene anti-forgetting method that preserves no samples. To address cross-scene and multi-scene transfer, this thesis proposes a method that retains no samples from the previous domain and lets the model improve target-domain performance rapidly while preserving source-domain performance as much as possible. It maintains source-domain performance by using the model's own information and sample feature information. Across multiple consecutive scene transfers without retained source-domain samples, it achieves a significant performance improvement over previous methods on both the source and target domains.
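
The class-selection idea in the DP-Softmax contribution (training each iteration only on the classes most similar to the current sample) can be sketched as follows. This is a minimal illustration under stated assumptions, not the thesis's loss: the function name `selected_softmax_loss`, the parameter `k`, and the use of plain dot-product scores are all hypothetical, and the abstract does not describe how DP-Softmax finds or maintains the similar classes.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def selected_softmax_loss(feature, class_weights, label, k):
    """Cross-entropy over the true class plus the k most similar other classes."""
    # For clarity this sketch scores every class; a practical version would
    # need a cheaper way to find similar classes (not described in the abstract).
    scores = [dot(feature, w) for w in class_weights]
    # Rank the non-target classes by similarity to the sample, highest first.
    others = sorted((c for c in range(len(scores)) if c != label),
                    key=lambda c: scores[c], reverse=True)
    active = [label] + others[:k]
    # Numerically stable log-sum-exp over the active classes only.
    top = max(scores[c] for c in active)
    log_z = top + math.log(sum(math.exp(scores[c] - top) for c in active))
    return log_z - scores[label], active

# Toy example: four classes in a 2-D feature space (illustrative numbers only).
weights = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [-1.0, 0.0]]
loss, active = selected_softmax_loss([1.0, 0.0], weights, label=0, k=1)
print(loss, active)
```

Because the normalizer sums over a subset of classes, the selected loss never exceeds the full softmax loss for the same scores. Selecting the most similar classes keeps the terms that dominate that sum, so the approximation stays close while the per-iteration cost scales with `k` rather than with the total number of classes.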
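Keywords: face recognition; large-scale classification; unbalanced data; cross-scene recognition; catastrophic forgetting
Language: Chinese
Seven major directions, sub-direction classification: Biometric recognition
Document type: Thesis
Item identifier: http://ir.ia.ac.cn/handle/173211/44848
Collection: State Key Laboratory of Multimodal Artificial Intelligence Systems_Biometrics and Security Technology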
Recommended citation
GB/T 7714
刘浩. 跨场景大规模人脸识别关键问题研究[D]. 中国科学院自动化研究所. 中国科学院自动化研究所,2021.
Files in this item
File name/size; Document type; Version type; Access type; License
跨场景大规模人脸识别关键问题研究.pdf (45218 KB); Thesis; (not listed); Open access; CC BY-NC-SA