三维点云语义分割方法研究 (Research on 3D Point Cloud Semantic Segmentation Methods)
Author: 邓爽
Date: 2022-08
Pages: 146
Degree type: Doctoral
Abstract (Chinese)

3D point cloud semantic segmentation is an important research topic in computer vision; it aims to assign a semantic class label to every point in a 3D point cloud. In recent years, with the continuous development of 3D data acquisition technology, 3D point cloud semantic segmentation has shown great application potential in fields such as autonomous driving, robotics, remote sensing, and healthcare. This thesis explores several problems in 3D point cloud semantic segmentation; the main work is as follows:

1. To address the problem that segmentation accuracy in point cloud segmentation is easily affected by object pose, an adaptive rotation transformation network for point clouds is proposed. It transforms input 3D point clouds of objects with different poses into 3D point clouds of objects in a single standard pose, thereby achieving a spatially invariant representation of 3D point clouds. The network first represents 3D rotations with Z-Y-Z Euler angles and discretizes them so that 3D rotations can be recognized by classification. It then adopts a novel two-branch structure, where one branch extracts global features and the other extracts local features; by fusing the two, it learns the 3D spatial transformation between the input point cloud and the point cloud of the object in the standard pose, and thus transforms the point cloud of an object in an arbitrary pose into the standard pose. Experimental results show that using this network together with several mainstream point cloud segmentation methods significantly improves their segmentation performance.

2. To address the high computational cost of extracting global features in point cloud segmentation, a global attention network for point cloud semantic segmentation is proposed. The network consists of a point-independent global attention module and a point-dependent global attention module, which efficiently capture the global context of 3D point clouds. Specifically, the point-independent global attention module shares a single global attention map across all 3D points. The point-dependent global attention module has three main steps: point cloud reordering, two random cross-attention computations, and point-adaptive feature aggregation. The random cross-attention computation lets every 3D point learn global attention features using only two subsets of the point cloud. Experimental results show that, without introducing much additional computation, the network outperforms several mainstream point cloud semantic segmentation methods in the literature.

3. To address the difficulty of data annotation in point cloud segmentation, a superpoint-based semi-supervised semantic segmentation network for point clouds is proposed. First, the network introduces a superpoint generation module that combines geometry-based and color-based region growing algorithms to generate superpoints of a point cloud. Then, a pseudo-label optimization module is introduced to revise or remove low-confidence pseudo-labels within superpoints. Furthermore, an edge prediction module is introduced to constrain the features of edge points, and a superpoint feature aggregation module and a superpoint feature consistency loss are introduced to smooth superpoint features. Experimental results show that, when the training set contains only a small number of labeled point clouds, the method outperforms six mainstream methods in the literature.

4. To further improve semi-supervised point cloud semantic segmentation, an image-guided semi-supervised semantic segmentation network for point clouds is proposed. The network consists of several dual-model alignment modules, a consistency loss function, and a pseudo-label fusion module. The dual-model alignment module uses a multi-head attention mechanism to fuse 3D and 2D intermediate-layer features. The consistency loss function further constrains the 3D and 2D output features. The pseudo-label fusion module mutually refines the pseudo-labels predicted from the point cloud and image data to improve their confidence. Experimental results show that, compared with the aforementioned superpoint-based semi-supervised segmentation network and seven mainstream methods in the literature, the method achieves better segmentation performance.

Abstract (English)

3D point cloud semantic segmentation is an important research topic in computer vision; it aims to assign a semantic class label to each point in a 3D point cloud. In recent years, with the development of 3D data acquisition technology, 3D point cloud semantic segmentation has shown great application potential in many fields such as autonomous driving, robotics, remote sensing, and healthcare. This thesis investigates several problems in 3D point cloud semantic segmentation; the main contributions are as follows:

1. To address the problem that segmentation accuracy is easily affected by object pose, an adaptive rotation transformation network for point clouds is proposed. It transforms input 3D point clouds of objects with arbitrary poses into 3D point clouds of objects with a standard pose, thereby achieving a spatially invariant representation of 3D point clouds. The network first represents 3D rotations with Z-Y-Z Euler angles and discretizes them so that rotation estimation can be cast as a classification problem. It then adopts a novel two-branch structure, in which one branch extracts global features and the other extracts local features; by fusing these features, it learns the 3D spatial transformation between the input point cloud and the point cloud of the object in the standard pose, and thus transforms point clouds of objects with arbitrary poses into the standard pose. Experimental results show that, when this network is used jointly with several typical point cloud segmentation methods, the segmentation performance of these methods is significantly improved (see the first sketch after this list).

2. To address the high computational cost of extracting global features in point cloud segmentation, a global attention network for point cloud semantic segmentation is proposed. The network consists of a point-independent global attention module and a point-dependent global attention module, which together capture the global contextual information of 3D point clouds efficiently. Specifically, the point-independent global attention module shares a single global attention map across all 3D points. The point-dependent global attention module comprises three main steps: point cloud reordering, two random cross-attention computations, and point-adaptive feature aggregation. The random cross-attention computation lets every 3D point learn global attention features using only two subsets of the point cloud. Experimental results show that, without introducing much additional computation, the proposed network outperforms several typical point cloud semantic segmentation methods (see the second sketch after this list).

3. To address the difficulty of data annotation in point cloud segmentation, a superpoint-guided semi-supervised semantic segmentation network for point clouds is proposed. First, the network introduces a superpoint generation module that combines geometry-based and color-based region growing algorithms to produce superpoints of a point cloud. Then, a pseudo-label optimization module is introduced to revise or remove low-confidence pseudo-labels within superpoints. Furthermore, an edge prediction module is introduced to constrain the features of edge points, and a superpoint feature aggregation module together with a superpoint feature consistency loss is introduced to smooth superpoint features. Experimental results show that the method outperforms six typical methods when only a small number of labeled point clouds are available in the training set (see the third sketch after this list).

4. To further improve semi-supervised point cloud semantic segmentation, an image-guided semi-supervised semantic segmentation network for point clouds is proposed. The network consists of several dual-model alignment modules, a consistency loss function, and a pseudo-label fusion module. The dual-model alignment module uses a multi-head attention mechanism to fuse 3D and 2D intermediate-layer features. The consistency loss function further constrains the 3D and 2D output features. The pseudo-label fusion module mutually refines the pseudo-labels predicted from the point cloud and image branches to improve their confidence. Experimental results show that the method achieves better segmentation performance than the aforementioned superpoint-guided semi-supervised segmentation network and seven typical methods (see the fourth sketch after this list).
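
To make contribution 1 more concrete, the following minimal Python sketch shows one way the Z-Y-Z Euler-angle discretization could look. It is not the thesis implementation: the bin counts N_ALPHA, N_BETA, N_GAMMA and the bin-center decoding are assumptions used only to illustrate mapping a rotation to a class index and back to a rotation matrix.

    import numpy as np

    # Hypothetical bin counts for the discretized Z-Y-Z Euler angles (placeholders,
    # not values from the thesis). alpha, gamma lie in [0, 2*pi); beta lies in [0, pi].
    N_ALPHA, N_BETA, N_GAMMA = 12, 6, 12

    def euler_zyz_to_class(alpha, beta, gamma):
        """Map a Z-Y-Z Euler-angle triple to a single rotation-class index."""
        ia = int(alpha / (2 * np.pi) * N_ALPHA) % N_ALPHA
        ib = min(int(beta / np.pi * N_BETA), N_BETA - 1)
        ig = int(gamma / (2 * np.pi) * N_GAMMA) % N_GAMMA
        return (ia * N_BETA + ib) * N_GAMMA + ig

    def class_to_rotation_matrix(idx):
        """Decode a class index to its bin-center rotation R = Rz(alpha) @ Ry(beta) @ Rz(gamma)."""
        ia, rem = divmod(idx, N_BETA * N_GAMMA)
        ib, ig = divmod(rem, N_GAMMA)
        alpha = (ia + 0.5) * 2 * np.pi / N_ALPHA
        beta = (ib + 0.5) * np.pi / N_BETA
        gamma = (ig + 0.5) * 2 * np.pi / N_GAMMA
        rz = lambda t: np.array([[np.cos(t), -np.sin(t), 0],
                                 [np.sin(t),  np.cos(t), 0],
                                 [0, 0, 1]])
        ry = lambda t: np.array([[np.cos(t), 0, np.sin(t)],
                                 [0, 1, 0],
                                 [-np.sin(t), 0, np.cos(t)]])
        return rz(alpha) @ ry(beta) @ rz(gamma)

    # Applying R to an (N, 3) point array: rotated = points @ R.T
    # Undoing the same rotation (e.g. to restore a canonical pose): restored = rotated @ R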
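
For contribution 2, the sketch below illustrates the random cross-attention idea with standard PyTorch operations: the points are shuffled, split into two subsets, and each subset attends to the other, so every point gathers global context without forming an N x N attention map. The class name, feature widths, and the half-and-half split are assumptions for illustration, not the thesis design.

    import torch
    from torch import nn

    class RandomCrossAttention(nn.Module):
        """Illustrative sketch, not the thesis implementation."""
        def __init__(self, in_dim, dim=64):
            super().__init__()
            self.wq = nn.Linear(in_dim, dim)
            self.wk = nn.Linear(in_dim, dim)
            self.wv = nn.Linear(in_dim, dim)
            self.dim = dim

        def cross(self, q_src, kv_src):
            # Attention between two subsets: the map is (n/2, n/2) rather than (n, n).
            q, k, v = self.wq(q_src), self.wk(kv_src), self.wv(kv_src)
            attn = torch.softmax(q @ k.T / self.dim ** 0.5, dim=-1)
            return attn @ v

        def forward(self, feats):                    # feats: (N, in_dim), one point cloud
            n = feats.shape[0]
            perm = torch.randperm(n)                 # point cloud reordering
            ia, ib = perm[: n // 2], perm[n // 2:]
            out = feats.new_zeros(n, self.dim)
            out[ia] = self.cross(feats[ia], feats[ib])   # subset A queries subset B
            out[ib] = self.cross(feats[ib], feats[ia])   # subset B queries subset A
            return out

    # Example usage: ctx = RandomCrossAttention(32)(torch.randn(1024, 32))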
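
For contribution 3, the sketch below gives a rough, assumed form of superpoint-level pseudo-label cleaning: within each superpoint, low-confidence predictions are either replaced by the superpoint's dominant label or discarded. The thresholds and the majority-vote rule are placeholders; the thesis's pseudo-label optimization module may differ.

    import numpy as np

    def refine_pseudo_labels(probs, superpoint_ids, conf_thresh=0.9, purity_thresh=0.7):
        """probs: (N, C) softmax scores; superpoint_ids: (N,) superpoint index per point.
        Returns per-point pseudo-labels, with -1 marking points ignored during training."""
        labels = probs.argmax(axis=1)
        conf = probs.max(axis=1)
        refined = labels.copy()
        for sp in np.unique(superpoint_ids):
            mask = superpoint_ids == sp
            votes = np.bincount(labels[mask], minlength=probs.shape[1])
            dominant = votes.argmax()
            purity = votes[dominant] / mask.sum()
            low_conf = mask & (conf < conf_thresh)
            # Replace low-confidence labels with the dominant label only if it clearly
            # dominates the superpoint; otherwise drop those pseudo-labels.
            refined[low_conf] = dominant if purity >= purity_thresh else -1
        return refined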
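
For contribution 4, the sketch below shows one plausible shape of a dual-model alignment module built on standard multi-head attention, with point features acting as queries over projected image features, plus a simple consistency loss. Feature widths, the head count, and the residual fusion are assumptions for illustration only.

    import torch
    from torch import nn

    class DualModelAlignment(nn.Module):
        """Illustrative sketch, not the thesis implementation."""
        def __init__(self, dim3d, dim2d, dim=128, heads=4):
            super().__init__()
            self.proj3d = nn.Linear(dim3d, dim)
            self.proj2d = nn.Linear(dim2d, dim)
            self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)

        def forward(self, point_feats, image_feats):
            # point_feats: (B, N, dim3d); image_feats: (B, H*W, dim2d)
            q = self.proj3d(point_feats)
            kv = self.proj2d(image_feats)
            fused, _ = self.attn(q, kv, kv)          # points attend to image features
            return q + fused                         # residual fusion of 2D context

    def consistency_loss(feat3d, feat2d_on_points):
        """L2 consistency between 3D outputs and 2D features sampled at the projected
        point locations; the actual loss used in the thesis may differ."""
        return torch.mean((feat3d - feat2d_on_points) ** 2)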

Keywords: 3D point cloud; semantic segmentation; rotation invariance; global attention; semi-supervised segmentation
Language: Chinese
Document type: Doctoral thesis
Identifier: http://ir.ia.ac.cn/handle/173211/49694
Collection: Graduates / Doctoral theses
Corresponding author: 邓爽
Recommended citation (GB/T 7714):
邓爽. 三维点云语义分割方法研究[D]. 北京: 中国科学院自动化研究所, 2022.
File: 三维点云语义分割方法研究.pdf (20341 KB); document type: thesis; access: restricted; license: CC BY-NC-SA