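|Place of Conferral||Institute of Automation, Chinese Academy of Sciences (中国科学院自动化研究所)|
|Keyword||deep learning; semantic mapping; depth map; super-resolution; residual network|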
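1. To address the real-time performance and accuracy of robotic indoor semantic mapping, an indoor semantic map construction method is proposed that combines a deep neural network for 3D point clouds with a real-time 3D reconstruction system. PointNet++ semantically segments the point clouds that ElasticFusion generates from RGB-D images during real-time 3D reconstruction, and a Bayesian update, using point positions computed from the camera pose, updates the global semantic map of the indoor scene. This achieves semantic mapping of indoor scenes, moves beyond the traditional form of semantic mapping based on image semantic segmentation, and generates the 3D scene point cloud and its semantic segmentation simultaneously. The method reaches a pixel-level classification accuracy of 60%~70% on semantic classes with relatively distinct geometric structure, and on some semantic classes it improves pixel-level classification accuracy by about 5% or more over traditional methods.
2. To address blurred details in depth map super-resolution, a depth map super-resolution reconstruction technique based on a dual-branch residual network is proposed. A parallel, nested structure of residual blocks, groups, and layers performs multi-scale channel feature extraction, interaction, and upsampling on the color image and the depth map, generating the high-resolution depth map end to end. This overcomes the problem that, when a high-resolution color image guides the upsampling of a low-resolution depth map, channel feature fusion is simplistic and artifacts are introduced into the depth map. Tests on the Middlebury dataset show that the proposed method reduces the average root mean square error by about 20% at each sampling factor compared with traditional methods.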
As robots are applied ever more widely in production and daily life, expectations for how well they interact with their environment are rising. A robot is expected to understand and execute natural language instructions from humans and to carry out human-computer interaction tasks such as fetching and placing objects or answering questions about its surroundings. To do so, the robot must understand the three-dimensional (3D) scene it operates in: while reconstructing the environment in 3D, it must also label the categories of the objects in the model, producing a map with semantic information that it can query and retrieve. Building a semantic map depends on the indoor depth maps supplied by a depth camera, so accurate depth acquisition, the basis of 3D techniques, is essential. Low-cost depth cameras now make depth maps easy and cheap to obtain, but the depth maps produced by such hardware usually have low resolution and must be converted to high resolution through super-resolution processing before they can be used in 3D applications. This thesis applies deep learning methods to robotic indoor semantic mapping and depth map super-resolution. Its main contributions are:
1. To address the real-time performance and accuracy of robotic indoor semantic mapping, an indoor semantic mapping method is proposed that combines a deep neural network for 3D point clouds with a real-time 3D reconstruction system. The method uses PointNet++ to semantically segment the point cloud that ElasticFusion generates from RGB-D images during real-time 3D reconstruction, then applies a Bayesian update, with point positions computed from the camera pose, to update the global semantic map of the indoor scene. This moves beyond traditional semantic mapping based on image semantic segmentation and generates the 3D scene point cloud and its semantic segmentation simultaneously. The method achieves a pixel-level classification accuracy of 60%~70% on semantic classes with distinct geometric structure, and on several semantic classes it improves pixel-level classification accuracy by about 5% or more over traditional methods.
2. To address blurred details in depth map super-resolution, a novel depth map super-resolution reconstruction technique based on a dual-branch residual network is proposed. The technique generates the high-resolution depth map end to end by performing multi-scale feature extraction, interaction, and upsampling on the depth map and the color image with a parallel, nested structure of residual blocks, groups, and levels. It overcomes the problem that, when a high-resolution color image guides the upsampling of a low-resolution depth map, channel-wise feature fusion is simplistic and artifacts are introduced into the depth map. Experiments on the Middlebury dataset show that the technique reduces the average root mean square error by about 20% at each sampling factor compared with traditional methods.
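The Bayesian update mentioned in the first contribution can be illustrated with a minimal sketch. It assumes that each map point stores a class probability distribution and that every new per-point prediction from the segmentation network is fused by multiplying it with the stored distribution and renormalizing. The names `bayes_update`, `SemanticMap`, and `point_id` are hypothetical, and the abstract does not specify the exact update rule or how points are associated across frames, so this is not necessarily the thesis implementation.

```python
from typing import Dict, List, Sequence


def bayes_update(prior: Sequence[float], prediction: Sequence[float]) -> List[float]:
    """Fuse a stored class distribution with a new prediction (assumed rule)."""
    if len(prior) != len(prediction):
        raise ValueError("prior and prediction must have the same length")
    # Recursive Bayes: posterior is proportional to prior times new evidence.
    unnormalized = [p * q for p, q in zip(prior, prediction)]
    total = sum(unnormalized)
    if total <= 0.0:
        # Degenerate case (no overlapping support): keep the prior unchanged.
        return list(prior)
    return [u / total for u in unnormalized]


class SemanticMap:
    """Hypothetical global map: one class distribution per map point."""

    def __init__(self, num_classes: int) -> None:
        self.num_classes = num_classes
        self.probs: Dict[int, List[float]] = {}

    def integrate(self, point_id: int, prediction: Sequence[float]) -> None:
        # Points seen for the first time start from a uniform prior.
        uniform = [1.0 / self.num_classes] * self.num_classes
        prior = self.probs.get(point_id, uniform)
        self.probs[point_id] = bayes_update(prior, prediction)

    def label(self, point_id: int) -> int:
        # The reported label is the most probable class for the point.
        probs = self.probs[point_id]
        return max(range(len(probs)), key=probs.__getitem__)


semantic_map = SemanticMap(num_classes=3)
semantic_map.integrate(0, [0.6, 0.3, 0.1])  # first observation of point 0
semantic_map.integrate(0, [0.5, 0.4, 0.1])  # second observation of point 0
print(semantic_map.label(0), semantic_map.probs[0])
```

In this sketch, repeated consistent observations sharpen a point's distribution, while a single conflicting prediction only shifts it partially. That is the usual motivation for fusing labels over time rather than keeping only the latest prediction.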
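|Chen Ruijin (陈睿进). Research on Deep Learning-Based Indoor Scene Semantic Mapping and Super-Resolution Techniques (基于深度学习的室内场景语义建图与超分辨率技术研究) [D]. Institute of Automation, Chinese Academy of Sciences; University of Chinese Academy of Sciences, 2020.|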
|Files in This Item:|
|Deep Learning-Based Indoor Scene Semantic Mapping and Super-Resolution (基于深度学习的室内场景语义建图与超分辨率) (5489KB)||Dissertation||Restricted access||CC BY-NC-SA||Application Full Text|