Research on Cooperative Eye Detection and Gaze Estimation Methods in Complex Driving Scenes


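Title	Research on Cooperative Eye Detection and Gaze Estimation Methods in Complex Driving Scenes
Author	Cao Lin (曹琳)
Year	2019
Pages	94
Degree type	Master's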
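Chinese abstract (translated)	To reduce traffic accidents, research on driver monitoring technology has attracted wide attention in academia. Eye detection and gaze estimation are important components of driver monitoring, so studying them in complex driving scenes is of great significance. Many methods and results already exist for eye detection and gaze estimation in specific or controllable environments. In complex driving scenes, however, three problems limit practical application and adoption: uneven illumination, varying distance, cluttered backgrounds, head pose deflection, and changing facial expressions make eyes and gaze harder to detect; existing models do not consider the relationship between eye key points and gaze; and existing work focuses heavily on accuracy while neglecting the number of model parameters. This thesis takes the driver's eyes in complex driving scenes as its research object, studies and analyzes eye detection and gaze estimation methods in some depth, and proposes solutions to the key problems and difficulties. The main research contents are as follows. First, the thesis studies eye region localization based on facial key points in complex driving scenes. Because locating the eye region directly in an image has low accuracy, the method builds on a face detection algorithm: it first detects the face, then locates facial key points within the face region, and finally extracts the eye region from the relative positions of the key points. The method is simple, effective, and easy to implement, and experiments show that it can still obtain the eye region under illumination changes and partial occlusion. Next, the thesis proposes cooperative eye detection and gaze estimation based on a cascaded joint regression model. To model the relationship between eye key points and gaze within the eye region, it proposes a cascaded joint regression model that uses a multi-feature fusion strategy to model the cooperative relationship between the two and performs eye detection and gaze estimation at the same time. Results on several test datasets show that, when the detection error is within the pupil radius, the model improves eye detection accuracy by about 3% and reduces gaze estimation error by about 1 degree. The experiments verify the effectiveness of the proposed cascaded joint regression model and indicate its potential for cooperative eye detection and gaze estimation in complex driving scenes. Finally, the thesis studies cooperative eye detection and gaze estimation based on a deep stacked hourglass network model. To reduce the number of model parameters while maintaining accuracy, it proposes a new regression model, the deep stacked hourglass network, which stacks sub-modules of different scales into an hourglass network for eye detection according to the differences between the tasks of its sub-networks. The gaze direction is then computed from the structure of the eyeball, so that both eye detection and gaze estimation are achieved. A large number of images generated by an adversarial network are used to train the model. Experiments on several test datasets show that the method reaches an eye detection accuracy of 99.3% and lowers the gaze estimation error to 9.5 degrees, reducing the number of model parameters while maintaining accuracy. In summary, focusing on the key problems of eye detection and gaze estimation in complex driving scenes, the thesis studies eye region localization and designs a linear cascaded joint regression model and a nonlinear deep stacked hourglass network model for cooperative eye detection and gaze estimation. The work can, to some extent, improve eye detection accuracy and reduce gaze estimation error, and is of significance for research on monitoring technology in real driving scenes.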
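The abstract describes extracting the eye region from the relative positions of facial key points but gives no formula. The following is a minimal sketch of one common way to do this, not the thesis's actual procedure: it assumes the landmarks for one eye are already available as (x, y) pixel coordinates from some landmark detector, and the function name `eye_region` and the `margin` parameter are hypothetical.

```python
import numpy as np

def eye_region(eye_points, image_shape, margin=0.5):
    """Return (x0, y0, x1, y1) of a square box around one eye.

    eye_points : (N, 2) array-like of (x, y) landmark coordinates for one eye.
    image_shape: (height, width) of the image, used for clipping.
    margin     : hypothetical padding, as a fraction of the eye width.
    """
    pts = np.asarray(eye_points, dtype=float)
    cx, cy = pts.mean(axis=0)                   # eye centre from the landmarks
    width = pts[:, 0].max() - pts[:, 0].min()   # horizontal extent of the eye
    half = 0.5 * width * (1.0 + margin)         # half side of the padded square
    h, w = image_shape
    # Round outwards, then clip the box to the image bounds.
    x0 = int(max(0, np.floor(cx - half)))
    y0 = int(max(0, np.floor(cy - half)))
    x1 = int(min(w, np.ceil(cx + half)))
    y1 = int(min(h, np.ceil(cy + half)))
    return x0, y0, x1, y1
```

A square box sized from the eye width is used here because the eye's vertical extent shrinks when the eye closes; sizing from the width keeps the crop stable across blinks. The crop itself would then be `image[y0:y1, x0:x1]`.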
English abstract	To reduce traffic accidents, research on driving monitoring technology has attracted wide attention. Eye detection and gaze estimation are important parts of driving monitoring research, so it is of great significance to study eye detection and gaze estimation methods in complex driving scenes. At present, there are many methods and results for eye detection and gaze estimation in specific or controllable environments. However, in complex driving scenes, factors such as illumination, distance, background, angle, and expression increase the difficulty of eye detection and gaze estimation. Moreover, there are still some problems in current research. For example, existing models do not consider the relationship between eye-related key points and gaze, and they pay too much attention to model accuracy while neglecting the number of parameters, which restricts the practical application and adoption of eye detection and gaze estimation. In this dissertation, the driver's eyes in complex driving scenes are taken as the research object. Methods of eye detection and gaze estimation are studied and analyzed, and solutions are put forward to address the aforementioned problems and challenges. The main research contents are as follows: 1. Eye region detection based on face landmarks in complex driving scenes is studied. Because directly detecting the eye region in an image has low accuracy, this dissertation builds on an existing face detection algorithm: the face is detected first, face landmarks are then located within the face region, and the eye region is extracted according to the relative positions of the face landmarks. This method is simple, effective, and easy to implement. It can detect the eye region in scenes with complex lighting and partial occlusion. 2. Cooperative eye detection and gaze estimation based on a cascaded joint regression model is proposed. To model the relationship between eye-related key points and gaze, a cascaded joint regression algorithm is proposed. It uses a multi-feature fusion strategy to model the cooperative relationship and performs eye detection and gaze estimation simultaneously. The results on test datasets show that, when the detection error is within the pupil radius, the accuracy of eye detection can be improved by about 3% and the gaze estimation error can be reduced by about 1 degree. The experimental results validate the effectiveness of the proposed cascaded joint regression algorithm and show its potential for eye detection and gaze estimation in complex driving scenes. 3. Cooperative eye detection and gaze estimation based on a deep stacked hourglass network model is studied. To maintain accuracy while reducing the number of parameters, a deep stacked hourglass network is proposed. According to the differences between the tasks of the sub-networks, sub-modules of different scales are stacked. In addition, an eye model is established to estimate gaze, so that both eye detection and gaze estimation are achieved. A large number of images generated by an improved SimGAN are used to train the deep stacked hourglass network. The experimental results on test datasets show that the eye detection accuracy reaches 99.3% and the gaze estimation error reaches 9.5 degrees, while the number of parameters is reduced. In summary, this dissertation focuses on eye region detection, linear cascaded joint regression, and a non-linear deep stacked hourglass network model to achieve cooperative eye detection and gaze estimation. The proposed methods can improve the accuracy of eye detection and gaze estimation, which is of significance for research on monitoring technology in actual driving scenes.
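The abstract reports gaze error in degrees and eye detection accuracy under a pupil-radius criterion, but does not define either metric. The sketch below shows the conventional definitions as an assumption: angular error as the angle between predicted and ground-truth 3D gaze vectors, and detection accuracy as the fraction of predicted eye centres that fall within one pupil radius of the ground truth. The function names are hypothetical and the thesis's exact protocol may differ.

```python
import numpy as np

def angular_error_deg(g_pred, g_true):
    """Angle in degrees between two 3D gaze direction vectors."""
    a = np.asarray(g_pred, dtype=float)
    b = np.asarray(g_true, dtype=float)
    cos = np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))
    # Clip guards against rounding pushing the cosine outside [-1, 1].
    return float(np.degrees(np.arccos(np.clip(cos, -1.0, 1.0))))

def detection_accuracy(pred_centers, true_centers, pupil_radius):
    """Fraction of predicted eye centres within pupil_radius of the truth."""
    pred = np.asarray(pred_centers, dtype=float)
    true = np.asarray(true_centers, dtype=float)
    dist = np.linalg.norm(pred - true, axis=1)   # per-sample pixel distance
    return float(np.mean(dist <= pupil_radius))
```

For example, `angular_error_deg([1, 0, 0], [1, 1, 0])` gives 45 degrees, and a prediction 5 pixels from the true centre counts as correct when `pupil_radius` is 5.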
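Keywords	complex driving scenes; cascaded joint regression; stacked hourglass network; cooperative eye detection and gaze estimation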
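Language	Chinese
Seven major directions: sub-direction classification	Artificial intelligence + transportation
Document type	Thesis
Item identifier	http://ir.ia.ac.cn/handle/173211/23905
Collection	Graduates_Master's theses
Recommended citation (GB/T 7714)	Cao Lin. Research on Cooperative Eye Detection and Gaze Estimation Methods in Complex Driving Scenes[D]. Institute of Automation, Chinese Academy of Sciences. University of Chinese Academy of Sciences, 2019.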

Files in this item
File name/size	Document type	Version type	Access type	License
Thesis.pdf (3791 KB)	Thesis		Restricted access	CC BY-NC-SA