Research on Semantic Segmentation and Its Applications for Intelligent Vehicle Environment Perception

	Research on Semantic Segmentation and Its Applications for Intelligent Vehicle Environment Perception
	范嗣祺
	2022-05-17
Pages	92
Degree type	Master's
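Abstract (translated from the Chinese)	As modern society develops, traffic environments are becoming increasingly complex and variable, and the global rate of traffic accidents remains high, with the vast majority caused by human factors on the part of drivers. Intelligent vehicles can reduce human-caused accidents and effectively improve road traffic safety, and environment perception is one of the key technologies in which they urgently need a breakthrough. This thesis addresses perception of the environment around an intelligent vehicle using two kinds of data, 2D visual images and 3D point clouds. Following an approach that becomes progressively finer from the region level to the pixel level and progressively richer from 2D to 3D, it takes up three concrete tasks (road object perception, 2D environment perception, and 3D environment perception), mines the semantic information in sensor data, and develops three environment perception methods that perform well. The main research work of this thesis is as follows: (1) For road object perception, this thesis proposes a foreground region segmentation method for anchor-free road object detection, which reduces interference from cluttered background information in complex traffic environments at a small additional computational cost, and on this basis develops FII-CenterNet, a road object detection network based on a foreground attention mechanism. To enable supervised training without introducing additional annotation cost, foreground region segmentation labels are generated from bounding-box labels, and a multi-task loss function is designed. Experimental results show that the method effectively segments foreground regions and alleviates interference from cluttered backgrounds, and that the resulting FII-CenterNet offers both good object perception performance and good efficiency. (2) For 2D environment perception, this thesis addresses the very high cost of pixel-level annotation for model training and proposes a semi-supervised semantic segmentation method for images based on conservative-progressive collaborative learning. Inspired by the idea of "seeking common ground while reserving differences", the conservative-progressive collaborative learning method CPCL is proposed, which achieves collaboration between "conservative evolution" and "progressive exploration". A pseudo-label determination method based on a class-wise disagreement indicator is proposed, which determines the pseudo labels of the disagreement region from a relatively macroscopic perspective rather than looking only at individual pixels. For the noisy pseudo labels that are hard to avoid, an adaptive dynamic loss function based on prediction confidence is further designed so that the pseudo labels are used adaptively and fully. Experimental results show that the method mines unlabeled data efficiently, achieves good semantic segmentation results with only a small amount of labeled data, and effectively reduces the training cost of 2D environment perception models for intelligent vehicles while maintaining performance. (3) For 3D environment perception, this thesis exploits the rich spatial position information of 3D point cloud data and proposes a large-scale point cloud semantic segmentation method based on spatial contextual feature learning. Specifically, a systematic method for learning spatial contextual features is proposed, comprising a local spatial contextual information representation method based on local direction, a local spatial contextual feature learning method based on a dual-distance attention mechanism, and a global spatial contextual feature learning method based on the relative volume ratio. The corresponding spatial contextual feature learning module is implemented, and the large-scale point cloud segmentation network SCF-Net is developed on this basis. Experimental results show that the method effectively improves 3D environment perception performance for intelligent vehicles and can also be extended to indoor environment perception for mobile robots. Research on semantic segmentation and its applications for intelligent vehicle environment perception can strengthen an intelligent vehicle's ability to understand its surroundings and provide more detailed and reliable input to the downstream decision and planning system, thereby improving intelligent vehicle safety, which is of great significance for the development of intelligent vehicle technology and the industry.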
English abstract	As modern society develops, the traffic environment is becoming increasingly complex. The global traffic accident rate remains high, and most accidents are caused by driver-related human factors. Intelligent vehicles (IVs) can reduce human-caused accidents and effectively improve traffic safety, and environment perception is one of the key technical problems they have yet to solve. Using 2D images and 3D point clouds, this thesis studies the environment perception of IVs step by step, from region level to pixel level and from 2D to 3D. Specifically, it focuses on three tasks: traffic object perception, 2D environment perception, and 3D environment perception. It proposes three well-performing methods that exploit the semantic information in sensor data. The main research work is as follows: (1) For traffic object perception, a foreground segmentation approach for anchor-free object detection is proposed, which alleviates background interference in complex traffic environments at little additional computational cost. On this basis, a novel traffic object detection network with foreground attention, called FII-CenterNet, is developed. To enable supervised learning without additional labeling cost, the foreground segmentation labels are generated from the input bounding-box labels, and a multi-task loss function is designed. Experimental results show that the proposed method effectively segments the foreground and alleviates background interference in traffic object detection. As a result, FII-CenterNet achieves good traffic object perception performance in both accuracy and efficiency. (2) For 2D environment perception, a semi-supervised semantic segmentation approach based on conservative-progressive collaborative learning (CPCL) is proposed, which reduces the high cost of pixel-level labeling.
Inspired by the idea of “seeking common ground while reserving differences”, CPCL achieves collaboration between conservative evolution and progressive exploration. To generate pseudo labels for the disagreement region, a pseudo-labeling method based on a class-wise disagreement indicator is proposed, which works from a macro point of view instead of focusing on individual pixels. In addition, an adaptive dynamic loss function based on prediction confidence is designed to deal with noisy pseudo labels. Experimental results show that the proposed method effectively mines unlabeled data and performs well with only a small amount of labeled data. As a result, the training cost of the model for 2D environment perception can be effectively reduced while maintaining performance. (3) For 3D environment perception, a large-scale point cloud segmentation approach based on spatial contextual feature learning is proposed, which makes full use of the rich spatial information provided by point clouds. Specifically, a systematic method for learning spatial contextual features is proposed, comprising a local spatial contextual information representation method based on local direction, a local spatial contextual feature learning method based on a dual-distance attention mechanism, and a global spatial contextual feature learning method based on the relative volume ratio. On this basis, a corresponding module for spatial contextual feature learning is designed, and a large-scale point cloud segmentation network, called SCF-Net, is developed. Experimental results show that the proposed method improves 3D environment perception performance for intelligent vehicles. It can also be extended to indoor environment perception for mobile robots, where it performs well.
Research on semantic segmentation for the environment perception of intelligent vehicles can help them understand their surroundings better and provide more detailed and reliable information to the decision-planning system, which can improve the safety of IVs and further promote the development of both the technology and the industry.
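Keywords	image semantic segmentation; point cloud semantic segmentation; semi-supervised learning; environment perception; intelligent vehicle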
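Language	Chinese
Document type	Thesis
Item identifier	http://ir.ia.ac.cn/handle/173211/48715
Collection	Graduates_Master's theses
Recommended citation (GB/T 7714)	范嗣祺. Research on Semantic Segmentation and Its Applications for Intelligent Vehicle Environment Perception[D]. Institute of Automation, Chinese Academy of Sciences. Institute of Automation, Chinese Academy of Sciences, 2022.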

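Files in this item
File name/size	Document type	Version type	Access type	License
Research on Semantic Segmentation and Its Applications for Intelligent Vehicle Environment Perception (8690KB)	Thesis		Restricted access	CC BY-NC-SA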