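|Place of Conferral||Beijing|
|Keyword||Deep learning; face detection; facial expression recognition; network compression; deep convolutional neural network; facial landmark localization; multi-task learning|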
Face detection and facial expression recognition are essential parts of human-computer interaction and have broad application prospects in many fields. In recent years, with the continuous development of deep learning methods in face-related fields, face detection and facial expression recognition have attracted extensive attention from researchers and have become a hot research topic in computer vision and pattern recognition. As the demand for practical applications grows, both tasks still face many challenges in complex scenarios, such as changes in face pose, lighting, scale, occlusion, and identity. These complex conditions in unconstrained environments can make face detection and facial expression recognition unstable, reducing the practical value of face technology. Achieving accurate and efficient face detection and facial expression recognition has therefore become an important research problem. This thesis focuses on the key issues of face detection and facial expression recognition based on deep learning. The main work and contributions are as follows:
(1) A face detection method based on the region-based convolutional neural network Faster R-CNN is proposed. When a deep convolutional neural network extracts features from small-scale faces, the features carry strong semantic information, but their resolution is too low, which leads to face detection errors. To address the small-face and multi-scale problems in face detection, this thesis proposes a step-by-step face detection method with two stages. In the first stage, an efficient multi-task RPN based on a cascaded Boosting face detector is proposed to improve the extraction efficiency and recall of face proposals. In the second stage, a parallel Fast R-CNN network based on proposal scale is proposed: the face proposals are grouped by scale, and three corresponding Fast R-CNN networks perform detection on their respective groups. The method thus performs face detection according to the scale of the face and effectively improves detection accuracy.
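The second-stage grouping can be illustrated with a minimal sketch. The abstract does not give the scale boundaries or the scale measure, so the thresholds `SMALL_MAX` and `MEDIUM_MAX`, the use of the square root of the box area, and all function names below are assumptions made purely for illustration.

```python
from typing import Dict, List, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2) in pixels

# Hypothetical scale boundaries; the thesis abstract does not state them.
SMALL_MAX = 32.0
MEDIUM_MAX = 96.0


def proposal_scale(box: Box) -> float:
    """Scale of a proposal, taken here as the square root of its area."""
    x1, y1, x2, y2 = box
    width = max(0.0, x2 - x1)
    height = max(0.0, y2 - y1)
    return (width * height) ** 0.5


def group_by_scale(proposals: List[Box]) -> Dict[str, List[Box]]:
    """Split proposals into three groups, one per scale-specific detector."""
    groups: Dict[str, List[Box]] = {"small": [], "medium": [], "large": []}
    for box in proposals:
        scale = proposal_scale(box)
        if scale < SMALL_MAX:
            groups["small"].append(box)
        elif scale < MEDIUM_MAX:
            groups["medium"].append(box)
        else:
            groups["large"].append(box)
    return groups
```

In such a setup, each group would be passed to its own Fast R-CNN branch, so that every branch only sees faces in a narrow scale range.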
(2) A network compression and acceleration method that combines an efficient convolutional neural network, filter pruning, and binarized network parameters is proposed. The large number of parameters and the heavy computation of deep convolutional neural networks limit the range of applications of the face detection method. To address detection speed, three network compression and acceleration methods and a strategy for fusing them are proposed. First, a high-efficiency convolutional neural network based on group-point convolution reduces the number of parameters through the network structure itself. Second, a filter pruning method based on an approximate Hessian matrix is proposed: the Hessian matrix is used to identify low-sensitivity filters, which are then pruned, effectively reducing memory usage and speeding up forward propagation. Third, a simplification method based on binarized network parameters compresses the original network by reducing the number of bits needed to represent each weight. Within the Faster R-CNN step-by-step detection framework, the fused accelerated network is applied to face detection. Ablation experiments verify that the network compression and acceleration strategies can be combined effectively, allowing the network to achieve a better balance between speed and accuracy.
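Two of the compression ideas above can be sketched in a few lines. The abstract does not specify the Hessian approximation or the binarization scheme, so this sketch assumes a diagonal Hessian with the saliency 0.5 * h_ii * w_i^2 summed per filter, and a sign-based binarization with one mean-absolute-value scaling factor per filter. Both are common choices substituted here for illustration, not necessarily the thesis's own formulation, and all names are hypothetical.

```python
from typing import List, Tuple


def filter_saliency(weights: List[float], hessian_diag: List[float]) -> float:
    """Sensitivity of one filter under a diagonal Hessian approximation."""
    return 0.5 * sum(h * w * w for w, h in zip(weights, hessian_diag))


def select_filters_to_prune(
    filters: List[List[float]],
    hessian_diags: List[List[float]],
    prune_ratio: float,
) -> List[int]:
    """Return the indices of the least sensitive filters."""
    scores = [filter_saliency(w, h) for w, h in zip(filters, hessian_diags)]
    n_prune = int(len(filters) * prune_ratio)
    order = sorted(range(len(filters)), key=lambda i: scores[i])
    return sorted(order[:n_prune])


def binarize(weights: List[float]) -> Tuple[float, List[int]]:
    """Approximate weights as alpha * sign(w), storing one bit per weight."""
    if not weights:
        raise ValueError("weights must not be empty")
    alpha = sum(abs(w) for w in weights) / len(weights)
    signs = [1 if w >= 0 else -1 for w in weights]
    return alpha, signs
```

Pruning removes whole filters, which shrinks both memory and computation, while binarization keeps the remaining filters but stores each weight as a single sign bit plus a shared scale.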
(3) A multi-pose expression recognition method based on special landmark detection is proposed. Because the face can be regarded as a convex, roughly spherical structure, changes in face pose cause self-occlusion, which alters the facial expression features and thus reduces the accuracy of facial expression recognition. To address the pose problem, a special landmark detection method based on a convolutional neural network is proposed, and the geometric relationships between the special landmarks are used to estimate the face pose. A pose-based ROI projection method and feature map concatenation method are proposed, in which different face poses correspond to different feature map concatenation weights, so that the expression recognition network adapts to pose. A loss function based on intra-class and inter-class distances is proposed to increase the distance between classes while reducing the distance between each sample feature and its class center. As a result, the features of different expressions become easier to distinguish.
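One possible form of a loss built from intra-class and inter-class distances is sketched below. The abstract does not give the formula, so this sketch assumes a mean squared distance to the class center for the intra-class term and a hinge penalty on class-center pairs that are closer than a margin for the inter-class term. The `margin` and `lam` parameters and all function names are hypothetical.

```python
from typing import Dict, List


def sq_dist(a: List[float], b: List[float]) -> float:
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def intra_inter_loss(
    features: List[List[float]],
    labels: List[int],
    centers: Dict[int, List[float]],
    margin: float = 1.0,
    lam: float = 0.5,
) -> float:
    """Pull samples toward their class center and push centers apart."""
    # Intra-class term: mean squared distance of each sample to its center.
    intra = sum(
        sq_dist(f, centers[y]) for f, y in zip(features, labels)
    ) / len(features)

    # Inter-class term: penalize center pairs closer than the margin.
    classes = sorted(centers)
    pairs = [(a, b) for i, a in enumerate(classes) for b in classes[i + 1:]]
    if pairs:
        inter = sum(
            max(0.0, margin - sq_dist(centers[a], centers[b]))
            for a, b in pairs
        ) / len(pairs)
    else:
        inter = 0.0

    return intra + lam * inter
```

Minimizing the first term tightens each expression cluster, while the second term is zero once all class centers are at least `margin` apart in squared distance.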
(4) A facial expression recognition method based on identity information enhancement is proposed. Changes in identity confuse expression recognition: they increase the differences between instances of the same expression and also introduce similarities between different expressions. To address the drop in recognition rate caused by identity changes, a method is proposed that uses identity information to enhance discriminability during the supervised learning of facial expressions, so that the recognition network adapts to different identities. Identity information and facial expression features are combined by spatial fusion, and constrained multi-task learning is then used to strengthen the identity information contained in the facial expression features. Fusing identity information into the expression recognition task in this way effectively improves the accuracy of facial expression recognition.
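The multi-task idea can be sketched as a weighted sum of an expression classification loss and an identity classification loss. The abstract does not describe the constraints or the loss weighting, so the use of cross-entropy for both tasks, the `id_weight` parameter, and the function names are assumptions for illustration only.

```python
import math
from typing import List


def cross_entropy(logits: List[float], target: int) -> float:
    """Softmax cross-entropy for one sample, computed stably."""
    m = max(logits)
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum_exp - logits[target]


def multi_task_loss(
    expr_logits: List[float],
    expr_label: int,
    id_logits: List[float],
    id_label: int,
    id_weight: float = 0.5,
) -> float:
    """Expression loss plus a weighted auxiliary identity loss."""
    expr_loss = cross_entropy(expr_logits, expr_label)
    id_loss = cross_entropy(id_logits, id_label)
    return expr_loss + id_weight * id_loss
```

With a shared feature extractor, the auxiliary identity term would encourage the shared features to retain identity cues that the expression branch can then exploit.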
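|First Author Affiliation||Institute of Automation, Chinese Academy of Sciences|
|Wu Wenqi (武文琦). Research on Face Detection and Facial Expression Recognition Methods Based on Deep Learning (基于深度学习的人脸检测及表情识别方法研究) [D]. Beijing: Graduate University of Chinese Academy of Sciences, 2018.|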
|Files in This Item:|
|Research on Face Detection and Facial Expression Recognition Methods Based on Deep Learning (6529KB)||Dissertation||Not yet open access||CC BY-NC-SA||Application Full Text|