移动设备上的室内场景在线三维重建研究 (Online 3D Reconstruction of Indoor Scenes on Mobile Devices)
刘养东
Subtype: 博士 (doctoral dissertation)
Thesis Advisor: 胡占义; 高伟
2019-05-22
Degree Grantor: 中国科学院自动化研究所
Place of Conferral: 中国科学院自动化研究所
Degree Discipline: 计算机应用技术 (Computer Application Technology)
Keyword: online 3D reconstruction, mobile devices, ICP algorithm, plane priors, dynamic scenes
Abstract

      Online 3D reconstruction of indoor scenes on mobile devices is an important research direction in computer vision. This thesis presents a systematic study of RGB-D based online 3D reconstruction on mobile devices, with an in-depth investigation of robust camera tracking, fast indoor 3D reconstruction, and system implementation. The main work and contributions are as follows:

      1. To address the instability of point-to-plane ICP based camera tracking in scenes with few geometric features, a geometrically stable camera tracking and 3D reconstruction method is proposed. The main contributions are three-fold: the causes of the ICP solver's instability are analyzed and a geometric-stability-based 3D point sampling method is proposed, enabling robust camera tracking in scenes with few geometric features; an ICP-IMU fusion strategy based on the condition number of the ICP covariance matrix is proposed, which effectively improves camera tracking accuracy; and the truncation interval for volumetric fusion is determined adaptively from the standard deviation of the depth noise, so that the reconstructed model preserves geometric details. The method runs at 20 Hz on an iPad Air 2. On the ICL-NUIM dataset it achieves high camera tracking and 3D reconstruction accuracy; in particular, on sequences with few geometric features the tracking accuracy is at least 35% better than that of other mainstream methods.

      2. To address the excessive memory footprint of existing 3D reconstruction methods on mobile devices, a memory-efficient, high-accuracy 3D reconstruction method based on plane priors is proposed. Its main features are: a fast depth-gradient-based plane detection method that also denoises planar regions of the depth map, improving reconstruction accuracy on planar regions; an adaptive weighting of camera tracking residuals based on the spatial relationship between pixels and planes, improving tracking robustness; and an adaptive setting of the truncation interval and maximum fusion distance based on plane priors, effectively reducing the memory consumed by depth noise. The method runs at 8 Hz on an iPad Air 2. Compared with other mainstream methods, it saves on average 30% of the memory footprint in real indoor scenes while achieving higher 3D reconstruction accuracy.

      3. To overcome the limitation that existing online 3D reconstruction systems on mobile devices can only reconstruct static scenes, a method and system are proposed for online reconstruction of the static background of dynamic scenes on mobile devices. The main contributions are: a foreground/background segmentation method based on a histogram of feature-point-compensated depth differences, with camera poses estimated from the static background only, enabling efficient segmentation and robust camera tracking on mobile devices; a method for removing voxels corresponding to the dynamic foreground, reducing distortions in the reconstructed model caused by object motion; and the first 3D reconstruction system on mobile devices able to handle dynamic scenes. The method runs at 8 Hz on an iPad Air 2. On the TUM RGB-D dataset, with essentially the same tracking accuracy, it is 30% more computationally efficient than other mainstream dynamic-scene methods and produces higher-quality 3D models.

Other Abstract

    On-line 3D reconstruction of indoor scenes on mobile devices is an important research direction in computer vision. This thesis focuses on RGB-D based indoor 3D reconstruction on mobile devices, in particular on such key issues as robust camera tracking, fast indoor 3D reconstruction, and system implementation. The main work and contributions are summarized as follows:

    1. As camera tracking based on an ICP tracker with a point-to-plane distance metric is not robust in scenes with insufficient geometric information, we propose a geometrically stable camera tracking and 3D reconstruction method. The main contributions are three-fold. Firstly, based on a thorough analysis of the possible causes of the ICP tracker's instability, a geometric-stability-based sampling method is proposed such that the camera pose can be robustly estimated even in scenes with insufficient geometric information. Secondly, the ICP tracker is adaptively fused with the IMU output based on the condition number of the ICP tracker's covariance matrix, in order to improve the camera tracking accuracy. Thirdly, during the volumetric integration of depth images, an adaptive truncation distance, dependent on the standard deviation of the depth noise, is introduced to preserve fine details of the reconstructed models. Experimental results show that our method achieves frame rates of 20 Hz on an Apple iPad Air 2 and 200 Hz on an Nvidia GeForce GTX 1060 GPU. Systematic qualitative and quantitative evaluations of tracking and reconstruction show that our method outperforms current state-of-the-art systems on the ICL-NUIM dataset; notably, the tracking accuracy is at least 35% better in scenes lacking sufficient geometric information.
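    The condition-number criterion can be made concrete with a small numerical sketch. The Python/NumPy snippet below builds the linearized point-to-plane ICP normal equations and blends in an IMU-predicted pose increment when the system is badly conditioned; the function names, the threshold, and the linear blending rule are illustrative assumptions, not the thesis' exact fusion strategy.

```python
import numpy as np

def point_to_plane_system(src_pts, dst_pts, dst_normals):
    """Build the 6x6 Gauss-Newton normal equations for point-to-plane
    ICP, linearized with a small-angle rotation
    (state x = [rx, ry, rz, tx, ty, tz])."""
    # Each correspondence i contributes a row J_i = [p_i x n_i, n_i]
    # and a residual r_i = n_i . (q_i - p_i).
    J = np.hstack([np.cross(src_pts, dst_normals), dst_normals])   # (N, 6)
    r = np.einsum('ij,ij->i', dst_normals, dst_pts - src_pts)      # (N,)
    return J.T @ J, J.T @ r

def fuse_with_imu(AtA, Atb, imu_delta, cond_thresh=100.0):
    """Hypothetical fusion rule: when the ICP normal matrix is badly
    conditioned (degenerate geometry), blend the ICP increment with an
    IMU-predicted increment; threshold and weights are illustrative."""
    icp_delta = np.linalg.solve(AtA + 1e-9 * np.eye(6), Atb)
    cond = np.linalg.cond(AtA)
    if cond <= cond_thresh:
        return icp_delta                    # geometry is well constrained
    w = cond_thresh / cond                  # in (0, 1): trust ICP less
    return w * icp_delta + (1.0 - w) * imu_delta
```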

    2. To deal with the large memory footprint commonly encountered in current reconstruction methods on mobile devices, a high-quality and memory-efficient 3D reconstruction method is proposed that fully exploits plane priors. The method has the following key features. First, in order to improve the reconstruction accuracy on planar regions, a depth-gradient-based method is proposed for efficient planar-region detection and de-noising in depth images. Then, tracking residuals are adaptively weighted by exploiting the spatial relationships between pixels and planes, improving the tracking accuracy. Finally, the truncation distance and the largest valid distance are adaptively determined during volumetric integration to reduce the memory occupied by depth noise. Experimental results show that our method achieves a frame rate of 8 Hz on an Apple iPad Air 2. Compared with current state-of-the-art methods, it reduces memory occupancy by 30% while preserving fine details in the reconstructed scenes.
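    As a rough illustration of the adaptive integration idea, the sketch below chooses the TSDF truncation band from a quadratic depth-noise model and tightens it on detected planar regions; the noise coefficients follow a commonly used structured-light axial-noise model, and the plane-dependent scaling and the maximum-fusion-distance rule are hypothetical stand-ins for the thesis' plane-prior scheme.

```python
import numpy as np

def depth_noise_std(z, sigma0=0.0012, k=0.0019, z0=0.4):
    """Quadratic axial-noise model often used for structured-light
    depth sensors: sigma(z) = sigma0 + k * (z - z0)^2 (meters)."""
    return sigma0 + k * (z - z0) ** 2

def truncation_distance(z, on_plane, n_sigma=3.0, mu_min=0.01):
    """Hypothetical adaptive TSDF truncation band: a few noise standard
    deviations wide in general, tightened on detected planar regions
    where the depth map has already been denoised."""
    mu = n_sigma * depth_noise_std(z)
    if on_plane:
        mu *= 0.5            # plane prior: keep the band thin near planes
    return max(mu, mu_min)

def max_fusion_distance(on_plane, d_default=3.0, d_plane=4.0):
    """Hypothetical largest valid integration distance (meters): fuse
    farther on denoised planar regions, cut off noisy far-range depth
    elsewhere to save memory."""
    return d_plane if on_plane else d_default
```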

    3. In order to overcome the limitation that current online 3D reconstruction systems on mobile devices can only work on static scenes, a novel method is proposed to reconstruct the static background of a dynamic scene on mobile devices. The main contributions include the following. First, a static/dynamic scene segmentation method is proposed based on the histogram of compensated depth differences, and only the segmented static pixels are used for camera tracking; the method is efficient on mobile devices and estimates camera poses robustly in dynamic scenes. Then, a voxel-based representation is adopted for integrating depth images and a dynamic-voxel removal approach is proposed, so that reconstruction artifacts caused by dynamic elements are filtered out. To our knowledge, this is the first voxel-based 3D reconstruction system for highly dynamic scenes implemented on mobile devices. Experimental results show that our method achieves a frame rate of 8 Hz on an Apple iPad Air 2. On the TUM RGB-D dataset, it reduces runtime by 30% compared with other state-of-the-art static/dynamic segmentation methods and reconstructs high-quality models with comparable tracking accuracy.
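    The histogram-based segmentation can be sketched as follows. The snippet assumes the previous depth map has already been warped into the current view with the predicted camera pose, then thresholds the per-pixel depth differences using a simple histogram analysis; this is a hypothetical simplification of the feature-point-compensated histogram described above, not the thesis' exact procedure.

```python
import numpy as np

def dynamic_mask(depth_cur, depth_prev_warped, bins=64, max_diff=0.5):
    """Label pixels whose compensated depth difference falls in the high
    tail of the difference histogram as dynamic foreground.
    `depth_prev_warped` is the previous depth map rendered into the
    current view with the predicted camera pose (meters; 0 = invalid)."""
    diff = np.abs(depth_cur - depth_prev_warped)
    valid = (depth_cur > 0) & (depth_prev_warped > 0)
    hist, edges = np.histogram(np.clip(diff[valid], 0.0, max_diff),
                               bins=bins, range=(0.0, max_diff))
    # Threshold at the first empty bin after the dominant (static
    # background) mode -- a crude stand-in for the thesis' analysis of
    # the feature-point-compensated depth-difference histogram.
    peak = int(np.argmax(hist))
    empty = np.nonzero(hist[peak:] == 0)[0]
    thresh = edges[peak + empty[0]] if empty.size else max_diff
    return valid & (diff > thresh)   # True = likely dynamic foreground
```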

Pages: 145
Language: 中文 (Chinese)
Document Type: 学位论文 (doctoral thesis)
Identifier: http://ir.ia.ac.cn/handle/173211/25770
Collection: 模式识别国家重点实验室_机器人视觉
Recommended Citation (GB/T 7714):
刘养东. 移动设备上的室内场景在线三维重建研究[D]. 中国科学院自动化研究所, 2019.
Files in This Item:
File Name/Size: Thesis-刘养东-final.pdf (22948 KB)
DocType: 学位论文 (thesis)
Access: 开放获取 (open access)
License: CC BY-NC-SA
