Research on Key Technologies of Gaze Estimation Based on Image Analysis (基于图像分析的视线估计关键技术研究)

	Research on Key Technologies of Gaze Estimation Based on Image Analysis (基于图像分析的视线估计关键技术研究)
Alternative title	Research on Technology of Gaze Estimation based on Image Analysis
	熊春水 (Xiong Chunshui)
	2015-06-01
Degree type	Doctor of Engineering
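Abstract (translated from the Chinese)	Gaze estimation has broad application prospects in many fields, including psychology, market and advertising analysis, medical research, and human-computer interaction. It has attracted wide attention from researchers and has become an active research topic in computer vision, pattern recognition, and human-computer interaction. At present, gaze estimation achieves fairly satisfactory results in specific or controlled environments and can largely meet some practical needs. However, it still suffers from complex calibration procedures, restricted operating environments, and restricted head movement, so it cannot meet the requirements of most real-world applications, which limits its adoption in practice. This thesis studies and analyzes in depth the problem of image-analysis-based gaze estimation under several different device configurations, and proposes solutions to the key problems and difficulties involved. The main work and contributions are as follows:
1. Gaze estimation with a single camera and a single light source. To address the complex calibration procedure of existing gaze estimation methods under the single-camera, single-light-source condition, this thesis proposes a new one-point calibration gaze estimation method. The method builds statistical gaze estimation models for multiple points on the screen in advance, and then estimates the user's point of regard on the screen by interpolation. The main innovations are: 1) a coarse-to-fine pupil localization algorithm that improves the robustness of pupil localization; 2) a statistics-based one-point calibration gaze estimation model that reduces the complexity of calibration; 3) incremental learning to further update the model, improving its adaptability to different users and to head movement. With simple equipment and head movement allowed, the method achieves relatively high accuracy with only one-point calibration.
2. Gaze estimation with a single camera and two light sources. To address the problem of gaze feature representation under the single-camera, dual-light-source condition, this thesis proposes a hybrid gaze descriptor that describes the state of the eye during gaze changes from two aspects: eye appearance and eyeball movement. First, a dual-glint-normalized appearance descriptor is proposed to represent appearance features of the eye such as gray level and contour. Second, a polynomial descriptor based on the pupil center-cornea reflections (PCCR) vector, that is, the vector between the pupil center and the glint center, is implemented to represent eyeball movement information. Finally, the two descriptors are fused into a hybrid descriptor using feature selection and fusion based on partial least squares. The hybrid descriptor effectively improves gaze estimation performance and adapts better to head movement.
3. Gaze estimation with two cameras. To address the strong dependence of feature-based gaze estimation methods on glints, this thesis proposes a gaze estimation method based on 3D face structure and the pupil center. First, stereo vision is used to reconstruct a 3D active shape model of the face that represents its 3D structure. Second, a feature based on the 3D pupil center and the eye contour is proposed to represent gaze information. Finally, head pose estimation is introduced to correct the 3D pupil center and eye contour, which improves the method's adaptability to head movement. Experiments on a purpose-built dual-infrared-camera gaze estimation system and a dual-natural-light-camera gaze estimation system verify the effectiveness of the method.
4. An automatic infant vision testing system based on gaze estimation. Because infants cannot actively cooperate with the calibration procedure of gaze estimation, this thesis designs a calibration-free gaze estimation method. The method uses the PCCR-vector-based polynomial descriptor as the feature and a Gaussian mixture model to classify gaze behavior. The designed system is non-invasive, requires no calibration, and has a high accuracy rate, so it can replace the observer's manual judgment of infant gaze behavior in infant vision testing. This is also the first time an image-analysis-based gaze estimation method has been applied to automatic infant vision testing, and it has been used successfully in clinical trials. The research content of this thesis described above, for gaze...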
English abstract	Gaze estimation has received a great deal of attention because of its wide range of applications, including psychology studies, market and advertising analysis, medical research, and human-computer interaction. It has become a hot topic in pattern recognition and human-computer interaction. With a number of published papers and commercial systems, gaze estimation in well-controlled environments is relatively mature: it provides high accuracy and meets the needs of some practical applications. However, existing gaze estimation techniques still have many limitations, such as complicated calibration procedures, restricted environments, and restricted head movement. This thesis studies gaze estimation based on image analysis with several kinds of non-invasive devices. The work focuses on the main difficulties in gaze estimation research and attempts to provide practical solutions. The main contributions of this thesis are as follows:
1. Gaze estimation with a single camera and a single light source. The calibration procedure in most current gaze estimation methods is tedious when only a single camera and a single light source are used. To solve this problem, we propose a novel gaze estimation method with one-point calibration. In our approach, statistical models of multiple points on the screen are built in advance, and an interpolation-based method is used to estimate the user's point of regard (PoR) on the screen. The main contributions of this work are: 1) a coarse-to-fine pupil location algorithm, which improves the adaptability to eyeglasses, eyelashes, occlusion, and image blurring; 2) a novel one-point calibration gaze estimation model based on a statistical method, which reduces the complexity of the calibration procedure; 3) an incremental learning method for updating the model, which improves the adaptability to different users and head movements. The proposed method is effective for different users with different head movements.
2. Gaze estimation with a single camera and two light sources. Existing appearance-based and feature-based methods have both made impressive progress in the past several years, but their improvement is still limited by feature representation. Therefore, we propose a novel descriptor combining eye appearance and pupil center-cornea reflections (PCCR). The hybrid gaze descriptor represents eye structure from both feature level and topolog...
Keywords	Gaze Estimation; One-point Calibration; Pupil Location; Incremental Learning; Polynomial Descriptor of PCCR; Hybrid Gaze Descriptor; 3D Face Structures; Visual Acuity Measurement in Human Infants
Language	Chinese
Document type	Dissertation
Item identifier	http://ir.ia.ac.cn/handle/173211/6741
Collection	Graduates_Doctoral Dissertations
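Recommended citation (GB/T 7714)	Xiong Chunshui. Research on Key Technologies of Gaze Estimation Based on Image Analysis [D]. Institute of Automation, Chinese Academy of Sciences. University of Chinese Academy of Sciences, 2015.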
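
The abstract mentions a polynomial descriptor built on the PCCR vector but does not give its form. The following is a minimal Python sketch of one common way such a descriptor might be computed, not the thesis's own formulation. The function names `pccr_vector` and `polynomial_descriptor`, the second-order polynomial terms, and the division by the inter-glint distance in the two-light-source case are all assumptions made for illustration.

```python
import math


def pccr_vector(pupil, glints):
    """Return the vector (vx, vy) from the mean glint center to the pupil center.

    pupil  : (x, y) pupil center in image pixels.
    glints : list of one or two (x, y) corneal reflection centers.
    With two glints the vector is divided by the inter-glint distance.
    This normalization is an assumption, not taken from the thesis.
    """
    if len(glints) not in (1, 2):
        raise ValueError("expected one or two glints")
    gx = sum(g[0] for g in glints) / len(glints)
    gy = sum(g[1] for g in glints) / len(glints)
    vx, vy = pupil[0] - gx, pupil[1] - gy
    if len(glints) == 2:
        d = math.hypot(glints[1][0] - glints[0][0], glints[1][1] - glints[0][1])
        if d == 0:
            raise ValueError("the two glints coincide")
        vx, vy = vx / d, vy / d
    return vx, vy


def polynomial_descriptor(vx, vy):
    """Expand a PCCR vector into second-order polynomial terms (assumed form)."""
    return [1.0, vx, vy, vx * vy, vx * vx, vy * vy]


# Hypothetical pixel coordinates, chosen only to exercise the functions.
descriptor = polynomial_descriptor(*pccr_vector((106, 54), [(100, 50), (108, 50)]))
```

In a regression-based setup, a descriptor like this would typically be fed to a linear model fitted on calibration samples to predict screen coordinates. For the calibration-free system described in the abstract, it would instead serve as the input feature to a classifier.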

Files in this item
File name/size	Document type	Version type	Access type	License
CASIA_20121801462807 (8990KB)			Not currently open	CC BY-NC-SA