Deep Learning-Based Dangerous Object Detection for Intelligent Driving

	Deep Learning-Based Dangerous Object Detection for Intelligent Driving
	陈亚冉1,2
	2018-05-29
Degree type	Doctor of Engineering
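Abstract	Driver-assistance technology in intelligent vehicles can warn the driver in dangerous situations and, under specific conditions, operate the vehicle in place of the driver. This eases the driver's burden and thereby reduces accidents caused by driver fatigue. Intelligent vehicles first face the problem of environmental perception: accurately perceiving surrounding vehicles, pedestrians, and other objects in complex traffic scenes, and identifying dangerous situations, remains a major challenge. This thesis studies the recognition of dangerous objects during intelligent driving. Current approaches to dangerous object recognition commonly rely on multi-sensor fusion. Typically, a camera recognizes obstacles and a radar measures their distance, and the two are combined to judge whether an obstacle endangers the ego vehicle. This approach is limited by cost, and the fusion algorithms are not yet mature. This thesis therefore takes the intelligent vehicle as its research subject and investigates in depth the two main technologies required for dangerous object detection: object detection and object distance estimation. It proposes object detection and distance estimation algorithms based on low-cost vision sensors. A multi-task learning-based multi-object detection algorithm and a Cartesian product-based multi-object detection algorithm detect obstacles in front of the vehicle. For obstacle distance estimation, two vision-based methods are proposed. The detection results are fused to identify dangerous objects, providing technical support for assisted and autonomous driving. The main chapters of this thesis contain the following work and contributions:
1. For vehicle type classification, a deep reinforcement learning method that incorporates attention is proposed. The convolutional neural network (CNN) architecture is well suited to image data and offers a practical solution for recognizing vehicles ahead in an image. For vehicles, however, different categories differ only in subtle local details, and when the whole image is used as input, the variation caused by environmental noise is usually more pronounced than these details. To eliminate the influence of environmental noise and make the subtle local features more salient, we introduce the attention mechanism of human observation into the traditional CNN. Rules are designed to find discriminative key regions, weakening the influence of background information on the network and strengthening the influence of object-related regions on the task. This improves the CNN's feature extraction ability and yields better classification. In addition, deep reinforcement learning is used to handle the decision of moving from one key region to another. A deep neural network is built that uses the information entropy of the recognition result in the current state as the reward or penalty, and it is trained to find the key regions that help classification automatically, further reducing redundant information and improving the robustness of classification.
2. For object detection in complex traffic scenes, a multi-task object detection method that combines object distance categories with object categories is proposed. It detects obstacles and coarsely classifies their distances, giving a simple judgment of how dangerous each object is. Single-task learning ignores the relationships that may exist between tasks, whereas multi-task learning emphasizes them: through joint learning, different tasks share the same parameters and uncover hidden features common to the tasks. On structured roads, the distance of a vehicle ahead is strongly related to the vehicle's own attributes. For example, the longitudinal distance affects the vehicle's sharpness and apparent size, and the lateral distance affects its apparent pose. Considering the correlation between object distance and object detection, we use multi-task learning to build a deep neural network that performs object localization, object classification, and object distance estimation at the same time. Adding the distance estimation task improves detection accuracy, and the predicted distance is itself a very important piece of information for environmental perception in intelligent driving. The estimated distance is used to assess the danger level of an object and warn the driver.
3. Considering the correlation between object category and object distance, a new multi-task learning method for object detection and distance estimation based on Cartesian product combination is proposed. It is proved theoretically that the loss function of the traditional linear-combination multi-task method is a special case of the loss function of the Cartesian product-based method: the two losses are equal when the tasks are mutually independent. The Cartesian product-based loss therefore accounts for the correlation between tasks more fully. Object recognition and distance estimation are combined through the Cartesian product and share the abstract features of both tasks, so one network completes both tasks, which reduces computation and improves the model's generalization ability and detection accuracy.
4. In intelligent driving, object distance is a key piece of information for perceiving obstacles ahead and is necessary for judging how dangerous an object is. This thesis estimates the distance of obstacles ahead from monocular images using two methods. The first is geometric: given the camera's intrinsic and extrinsic parameters, the distance is obtained from the geometric relationship between the ego vehicle and the line where the obstacle meets the ground, together with the transformations among image, camera, and world coordinates. This method is simple and fast, and it suits vehicle distance estimation on structured roads. The second uses a neural network to estimate the distance of obstacles ahead, which reduces the error caused by inaccurate detection of distant objects.
5. For the detection and distance estimation of moving objects in traffic scenes, a tracking algorithm is proposed that takes the motion characteristics of objects into account, namely position, distance, and object size. The algorithm can further correct false and missed detections and improve the accuracy of distance estimation. A neural network is built on the detected bounding box, object distance, and object size to track multiple objects while obtaining the distance of each object at every time step.

Recently, autonomous driving has been extensively studied and has shown considerable promise.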
Advanced Driver Assistance Systems (ADAS) can warn the driver in dangerous situations to reduce traffic accidents, and in some scenarios they can also take over control of the car from the driver. Environmental perception is a principal problem in autonomous driving: it is difficult to accurately detect surrounding vehicles, pedestrians, and other targets and to identify dangerous situations in real-world traffic. This thesis studies dangerous object detection in autonomous driving, which aims to identify objects that are potentially dangerous to the driver. Common dangerous object detection methods use a variety of sensors, such as radars, lasers, and sonars, to detect surrounding obstacles. However, laser and radar sensors are too expensive for large-scale deployment, and they have limited capacity to recognize object categories. Therefore, visual information is essential for practical autonomous driving systems. We focus on the two main vision-based technologies in dangerous object detection: object detection and distance prediction. We propose a Cartesian product-based multi-task learning method to simultaneously detect objects and estimate their distances. We also introduce two methods for estimating object distances: a geometry-based method and a neural network method. The danger level of an object is assigned according to its predicted distance. This thesis contains the following work and contributions:
1. We propose a visual attention-based deep reinforcement learning method for vehicle classification. Convolutional neural networks (CNNs) can extract features from images, providing a practical solution for vehicle classification. However, different vehicle categories differ only slightly, and most regions of an image are environmental noise, which is useless or even disruptive. To eliminate the effects of environmental noise and highlight the useful local regions, we introduce the human attention mechanism into the traditional CNN. By designing rules, it finds discriminative key areas, weakens the influence of the image background, and enhances the impact of the key areas. At the same time, we propose a deep reinforcement learning (DRL) method that handles the decision of moving from one key area to another. The DRL method finds the key areas automatically, further reducing redundant information and improving the robustness of the classification.
2. A multi-task learning method is proposed to jointly model object detection and distance prediction for detecting dangerous objects. Single-task learning ignores the relationship between object detection and distance estimation, whereas multi-task learning shares parameters between tasks and extracts hidden common features. On structured roads, the distance of a vehicle is strongly related to the properties of the vehicle. Considering the relationship between object distance and detection, we adopt multi-task learning to design a deep neural network that simultaneously predicts object positions, object categories, and object distances. The predicted distance is used to judge the object's danger level and alert the driver.
3. Considering the relationship between object categories and distances, a new multi-task learning method based on Cartesian product combination is proposed for object detection and object distance prediction. We also prove theoretically that the linear multi-task combination is equivalent to the Cartesian product-based multi-task combination if the two tasks are independent. Otherwise, the proposed Cartesian product-based combination outperforms the linear combination because it takes the correlation between tasks into account.
4. In autonomous driving, the target distance is a key piece of information for understanding the environment, and it is also indispensable for judging the danger posed by the target. We use two methods to obtain object distances from monocular images. One is based on geometry, which is simple and fast. The other is a neural network-based method, which predicts object distances directly from images.
5. To address the detection of moving targets in complex scenes and the problem of distance prediction, we propose a method for tracking vehicles. The tracking method takes the characteristics of objects (position, distance, and size) into consideration. It can further correct false and missed detections and improve the accuracy of distance prediction.
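Item 1 of the abstract mentions using the information entropy of the current recognition result as the reward signal for the reinforcement learning agent. The abstract does not give the exact reward formula, so the following is only a minimal sketch under the assumption that the reward is the negative Shannon entropy of the classifier's output distribution; the function names and the example probabilities are hypothetical.

```python
import math

def entropy(probs):
    """Shannon entropy (in nats) of a discrete probability distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0)

def entropy_reward(probs):
    """Assumed reward: higher when the classifier is more certain.

    This sign convention is an illustration, not the thesis's definition.
    """
    return -entropy(probs)

# Hypothetical classifier outputs for two candidate image regions.
uncertain_region = [1 / 3, 1 / 3, 1 / 3]   # classifier cannot decide
confident_region = [0.90, 0.05, 0.05]      # classifier is fairly sure

# Under this assumption, the agent prefers the region with the larger reward.
best = max([uncertain_region, confident_region], key=entropy_reward)
```

With this convention, a region that makes the classifier more confident earns a larger reward, which matches the stated goal of steering the agent toward regions that help classification.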
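Item 3 states that the linear multi-task loss coincides with the Cartesian product-based loss when the tasks are independent. The numeric sketch below illustrates that statement, assuming cross-entropy losses with equal task weights. The function names, the three object classes, the two distance bins, and all probabilities are made up for illustration and are not taken from the thesis.

```python
import math

def cross_entropy(probs, target):
    """Negative log-likelihood of the target index under a probability vector."""
    return -math.log(probs[target])

def linear_loss(p_cls, p_dist, c, d):
    """Linear combination with equal weights: one loss term per task."""
    return cross_entropy(p_cls, c) + cross_entropy(p_dist, d)

def cartesian_loss(p_joint, c, d):
    """Single loss over the joint (class, distance-bin) label."""
    return -math.log(p_joint[c][d])

# Hypothetical per-task predictions: 3 object classes, 2 coarse distance bins.
p_cls = [0.7, 0.2, 0.1]
p_dist = [0.6, 0.4]

# Independent tasks: the joint distribution is the outer product of the marginals.
p_indep = [[pc * pd for pd in p_dist] for pc in p_cls]

# Correlated tasks: same marginals as above, but not an outer product.
p_corr = [[0.50, 0.20],
          [0.05, 0.15],
          [0.05, 0.05]]

c, d = 0, 0  # ground-truth class index and distance-bin index
gap_indep = abs(linear_loss(p_cls, p_dist, c, d) - cartesian_loss(p_indep, c, d))
gap_corr = abs(linear_loss(p_cls, p_dist, c, d) - cartesian_loss(p_corr, c, d))
```

Here `gap_indep` is zero up to floating-point error because log(p_c * p_d) = log(p_c) + log(p_d), while `gap_corr` is not: only the joint formulation can represent the dependence between category and distance.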
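The geometry-based method in item 4 is described only at a high level. One common way to realize such a method is sketched below, assuming a pinhole camera, a flat road, and a known camera height and pitch. The function name, parameters, and numbers are hypothetical, and the thesis's own formulation may differ.

```python
import math

def ground_distance(v_bottom, fy, cy, cam_height, pitch=0.0):
    """Estimate the forward distance to an object on a flat road.

    v_bottom   : image row (pixels) where the object touches the ground
    fy, cy     : vertical focal length and principal point row (pixels)
    cam_height : camera height above the road (meters)
    pitch      : downward tilt of the optical axis (radians)
    All of these are assumed to be known from calibration.
    """
    # Angle of the viewing ray below the horizontal plane.
    angle = pitch + math.atan2(v_bottom - cy, fy)
    if angle <= 0:
        raise ValueError("contact line is at or above the horizon")
    # Right triangle formed by the camera height and the ground distance.
    return cam_height / math.tan(angle)

# Hypothetical calibration: fy = 1000 px, cy = 360 px, camera 1.5 m above the road.
far = ground_distance(v_bottom=410, fy=1000.0, cy=360.0, cam_height=1.5)
near = ground_distance(v_bottom=460, fy=1000.0, cy=360.0, cam_height=1.5)
```

With these numbers `far` is roughly 30 m and `near` roughly 15 m. As the contact line approaches the horizon row, a one-pixel change produces a large change in distance, which is consistent with the abstract's remark that distant objects are harder to measure accurately.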
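Keywords	deep learning; intelligent driving; object detection; multi-task learning; visual attention
Language	Chinese
Document type	Dissertation
Item identifier	http://ir.ia.ac.cn/handle/173211/21021
Collection	Graduates_Doctoral Dissertations
Author affiliations	1. Institute of Automation, Chinese Academy of Sciences 2. University of Chinese Academy of Sciences
First author affiliation	Institute of Automation, Chinese Academy of Sciences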
Recommended citation (GB/T 7714)	陈亚冉. 基于深度学习的智能驾驶危险目标检测[D]. 北京. 中国科学院研究生院,2018.

Files in this item
File name/size	Document type	Version type	Access type	License
Thesis-final_IR.pdf (11155KB)	Dissertation		Restricted access	CC BY-NC-SA