Research on Several Problems in Traffic Object Detection and Analysis
Author: 苟超
Degree: Doctor of Engineering
Advisors: 王飞跃; Ji Qiang
Date: 2017-06
Degree-granting institution: University of Chinese Academy of Sciences
Place: Beijing
Keywords: coarse-to-fine, license plate recognition, facial landmark detection, head pose estimation, eye detection
Abstract
Video-based traffic object detection and analysis is an important part of intelligent transportation systems research. With the rapid development of video surveillance hardware and video/image processing software, intelligent traffic video surveillance and analysis has drawn wide attention and spurred a large body of research and practical applications. In complex traffic scenes, however, several challenging problems in traffic object detection and analysis remain open.
 
     In this thesis, novel methods for traffic object detection and analysis are proposed within a coarse-to-fine framework. The proposed methods address cluttered backgrounds, varying object poses, and diverse illumination. Two typical traffic objects, the license plate (LP) and the driver, are chosen as research targets. The main contributions are as follows:

       1. A novel LP detection and recognition method based on Extremal Regions (ERs) and a Hybrid Discriminative Restricted Boltzmann Machine (HDRBM) is proposed. Within the coarse-to-fine framework, candidate LP regions are first extracted according to the texture and color characteristics of license plates, yielding coarse LP detection. ERs are then extracted in each channel of the color space and combined, giving coarse localization of the LP characters. Next, an AdaBoost classifier computes a conditional character probability for each candidate region, and candidates are pruned by this probability, achieving fine LP detection and character segmentation. Finally, an HDRBM is applied, for the first time, to LP character recognition. The proposed coarse-to-fine method is fast and robust: because fine localization relies on per-character conditional probabilities, it copes with varying LP poses in traffic scenes, and because ERs are extracted in multiple color channels rather than a single gray image, it is robust to diverse illumination. Qualitative and quantitative experiments on a large set of images captured in real, all-weather traffic surveillance scenes, together with comparisons against several existing LP recognition methods, show that the proposed method achieves favorable detection and recognition rates.
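The candidate-extraction-then-pruning pipeline above can be sketched in a few lines. This is a minimal, illustrative stand-in, not the thesis implementation: thresholding one channel at a few intensity levels replaces true ER trees, and `char_prob` is a stub for the AdaBoost character classifier.

```python
import numpy as np
from collections import deque

def connected_regions(mask, min_area=4):
    """Bounding boxes (ymin, xmin, ymax, xmax) of 4-connected components."""
    h, w = mask.shape
    seen = np.zeros((h, w), dtype=bool)
    boxes = []
    for sy in range(h):
        for sx in range(w):
            if mask[sy, sx] and not seen[sy, sx]:
                queue, pts = deque([(sy, sx)]), []
                seen[sy, sx] = True
                while queue:
                    y, x = queue.popleft()
                    pts.append((y, x))
                    for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                        if 0 <= ny < h and 0 <= nx < w and mask[ny, nx] and not seen[ny, nx]:
                            seen[ny, nx] = True
                            queue.append((ny, nx))
                if len(pts) >= min_area:
                    ys, xs = zip(*pts)
                    boxes.append((min(ys), min(xs), max(ys), max(xs)))
    return boxes

def extremal_regions(channel, levels=(64, 128, 192)):
    """Pool components over several intensity thresholds (crude ER stand-in)."""
    boxes = []
    for t in levels:
        boxes.extend(connected_regions(channel >= t))
    return boxes

def select_characters(boxes, char_prob, p_min=0.5):
    """Prune candidates by conditional character probability (classifier stub)."""
    return [b for b in boxes if char_prob(b) >= p_min]

# Synthetic single-channel image with two tall, character-like blobs.
channel = np.zeros((10, 10), dtype=np.uint8)
channel[2:6, 2:4] = 200      # bright blob, survives all three thresholds
channel[2:6, 6:8] = 100      # dimmer blob, survives only the lowest threshold

candidates = extremal_regions(channel)           # 2 + 1 + 1 = 4 candidate boxes
tall = lambda b: 1.0 if (b[2] - b[0]) > (b[3] - b[1]) else 0.0  # aspect-ratio stub
chars = select_characters(candidates, tall)
print(len(candidates), len(chars))               # 4 4
```

In the thesis the same region may be proposed in several channels and thresholds, which is why the pruning step, not the proposal step, carries the discriminative burden.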

       2. A Coupled Cascade Regression (CCR) method for simultaneous facial landmark detection and head pose estimation is proposed. Unlike conventional methods that detect landmarks first and then fit a 3D deformable face model to estimate head pose, CCR performs both tasks simultaneously. Following the coarse-to-fine idea, face detection with mean-face initialization first provides coarse locations of the facial landmarks (eye corners, nose, mouth corners, etc.). CCR then iteratively updates the landmark locations and the pose parameters of the corresponding 3D face model until convergence, achieving fine landmark detection. At each cascade level, CCR combines learned regression with 3D face model projection and jointly optimizes the two tasks: the learning power comes from cascade regression, while fitting the 3D deformable model captures the latent mapping between head pose and landmark locations and keeps them consistent. CCR is simple and efficient; extensive experiments show that it outperforms existing cascade-regression baselines on both tasks and meets the accuracy and real-time requirements of practical driver monitoring systems.
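The alternation CCR performs at each cascade level can be illustrated with a toy NumPy example. Everything here is an assumption for illustration: the learned regressor is replaced by a simple pull toward the observed landmarks, the 3D model is rigid (no deformable shape basis), and pose fitting uses finite-difference gradient descent rather than the thesis's learned updates.

```python
import numpy as np

rng = np.random.default_rng(0)
model_3d = rng.normal(size=(6, 3))        # rigid toy "3D face" points

def project(pose):
    """Weak-perspective projection under pose = (yaw, tx, ty)."""
    yaw, tx, ty = pose
    R = np.array([[np.cos(yaw), 0.0, np.sin(yaw)],
                  [0.0,         1.0, 0.0]])
    return model_3d @ R.T + np.array([tx, ty])

def fit_pose(landmarks, pose, iters=300, lr=0.02, eps=1e-5):
    """Fit pose to 2D landmarks by finite-difference gradient descent."""
    pose = np.asarray(pose, dtype=float).copy()
    for _ in range(iters):
        base = np.sum((project(pose) - landmarks) ** 2)
        grad = np.zeros(3)
        for i in range(3):
            p = pose.copy()
            p[i] += eps
            grad[i] = (np.sum((project(p) - landmarks) ** 2) - base) / eps
        pose -= lr * grad
    return pose

def coupled_cascade(observed, levels=5):
    """Alternate a landmark update (regressor stand-in) with a 3D pose fit,
    re-projecting the model so landmarks and pose stay consistent."""
    pose = np.zeros(3)
    landmarks = project(pose)                                  # mean-shape init
    for _ in range(levels):
        landmarks = landmarks + 0.5 * (observed - landmarks)   # "regression" step
        pose = fit_pose(landmarks, pose)                       # model-fitting step
        landmarks = project(pose)                              # consistency step
    return landmarks, pose

true_pose = np.array([0.3, 1.0, -0.5])
observed = project(true_pose) + 0.01 * rng.normal(size=(6, 2))
init_err = np.mean((project(np.zeros(3)) - observed) ** 2)
landmarks, pose = coupled_cascade(observed)
final_err = np.mean((landmarks - observed) ** 2)
print(final_err < init_err)   # the alternation reduces reprojection error
```

The consistency step is the point of the coupling: landmarks are always a valid projection of some head pose, so the two estimates cannot drift apart.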

      3. A novel Joint Cascade Regression (JCR) method for simultaneous driver eye center detection and eye state estimation is proposed. After facial landmark detection, and again following the coarse-to-fine idea, rough eye regions are extracted from the landmarks, giving coarse eye localization. Unlike conventional methods that detect the eye center first and then classify the binary eye state, JCR performs both tasks jointly. The binary state (open/closed) is softened into an eye openness probability ranging from 0 to 1. At each cascade level, JCR updates the eye center location and the openness probability from local appearance features, and the openness probability then modulates, by element-wise multiplication, the appearance features used at the next level; this yields a mathematical formulation in which the texture of a closed eye does not mislead eye center detection. Because appearance-based eye center detection requires large labeled training sets, a learning-by-synthesis strategy using synthetic eye images is proposed to improve the results. Compared with several existing methods, the proposed approach achieves the best results on multiple public datasets, and it processes 15 frames per second, meeting real-time requirements.
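The openness-gating formulation can be written down in a few lines. This is a hypothetical sketch: `W_center` and `W_open` stand in for the cascade regressors learned in the thesis, and the feature vector stands in for local appearance descriptors.

```python
import numpy as np

def jcr_step(features, center, openness, W_center, W_open):
    """One JCR cascade level: gate the local appearance features by the current
    eye-openness probability, then regress updates for both outputs."""
    gated = openness * features            # closed eye (openness -> 0) suppresses texture
    center = center + W_center @ gated     # eye-center update uses gated features only
    openness = float(np.clip(openness + float(W_open @ features), 0.0, 1.0))
    return center, openness

rng = np.random.default_rng(1)
features = rng.normal(size=8)              # stand-in for local appearance features
W_center = rng.normal(size=(2, 8)) * 0.1   # stub regressors (learned in the thesis)
W_open = rng.normal(size=8) * 0.1

# With openness = 0 (eye judged closed), the texture cannot move the center:
c0, _ = jcr_step(features, np.zeros(2), 0.0, W_center, W_open)
print(np.allclose(c0, 0.0))               # True: center update is suppressed

# With openness = 1, the same features drive a normal center update:
c1, o1 = jcr_step(features, np.zeros(2), 1.0, W_center, W_open)
```

Gating the features rather than hard-switching on a binary state is what lets both outputs be optimized jointly in one differentiable cascade.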
Document type: Doctoral dissertation
Identifier: http://ir.ia.ac.cn/handle/173211/14686
Collection: Doctoral dissertations (graduates)
Recommended citation (GB/T 7714): 苟超. 交通对象检测与分析的若干问题研究[D]. 北京: 中国科学院大学, 2017.
File: gouc-phd-thesis-full (5748 KB), dissertation, access currently restricted, license CC BY-NC-SA, full text available on request.
 
