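Research on Indoor 3D Reconstruction Methods Based on Neural Radiance Fields (基于神经辐射场的室内三维重建方法研究)
刘湘龙 (Liu Xianglong)
2024-05-15
Pages: 66
Degree type: Master's
Chinese Abstract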

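In recent years, with the rapid development of computer vision and computer graphics, indoor 3D reconstruction has become an active research topic. Indoor 3D reconstruction recovers the three-dimensional structure of an indoor environment from 2D images or point cloud data, and plays an important role in virtual reality, smart homes, cultural heritage preservation, indoor navigation, and other fields. Traditional geometry-based 3D reconstruction methods detect and match features across multiple views, estimate camera poses with Structure from Motion (SfM), densify the point cloud obtained from SfM, and then reconstruct the surface. In such methods, unreliable feature matching can cause pose estimation, and hence the reconstruction, to fail, and the reconstructed meshes may contain holes and large numbers of outliers.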

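Recently proposed reconstruction methods based on neural radiance fields can produce smooth, watertight object surfaces. In indoor scenes, however, the shape-radiance ambiguity prevents these methods from recovering the true surface shape. Moreover, because indoor scenes contain large repetitive or uniformly colored regions, the poses produced by traditional SfM may be inaccurate, which degrades the quality of indoor scene reconstruction.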

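This thesis investigates two problems that arise when reconstructing indoor scenes with neural radiance fields: improving the quality of indoor surface reconstruction, and reconstructing indoor scenes from inaccurate poses. The main contributions are as follows: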

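(1) This thesis proposes a reconstruction method based on multi-resolution voxel grids that uses geometric priors. Depth and normal information predicted by a monocular geometric depth estimation model serves as a geometric prior to supervise indoor scene reconstruction and eliminate the shape-radiance ambiguity. On the network side, the multi-resolution voxel grid structure is improved with a coarse-to-fine voxel grid optimization strategy and a cross-voxel-layer gradient computation scheme. Experiments show that the method significantly improves the quality of indoor surface reconstruction and of view synthesis.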

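(2) This thesis proposes a method that optimizes camera poses with a camera pose residual field to improve the quality of indoor scene reconstruction. The camera pose residual field learns residuals of the camera poses, which are then used to refine the poses. In addition, feature points are matched between the target image and its neighboring images, and a projected ray distance loss computed between the corresponding pixels supervises the pose optimization. Experiments show that the method effectively improves the accuracy of indoor camera pose estimation and the precision of indoor scene reconstruction.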
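
As an illustration of how the depth and normal priors in contribution (1) could enter training, the following is a minimal sketch, not the thesis's implementation. It assumes that monocular depth is known only up to an unknown scale and shift, so it aligns the monocular depth to the rendered depth by least squares before comparing them, and it assumes an L1-plus-cosine penalty for normals. The function names (`align_scale_shift`, `depth_prior_loss`, `normal_prior_loss`) and the specific loss forms are assumptions; the abstract does not state them.

```python
import math

def align_scale_shift(mono, rendered):
    """Closed-form least-squares (w, q) minimizing sum((w*m + q - r)^2)."""
    n = len(mono)
    mean_m = sum(mono) / n
    mean_r = sum(rendered) / n
    var_m = sum((m - mean_m) ** 2 for m in mono)
    cov = sum((m - mean_m) * (r - mean_r) for m, r in zip(mono, rendered))
    # A constant monocular depth carries no scale information: fall back to w = 0.
    w = cov / var_m if var_m > 0 else 0.0
    q = mean_r - w * mean_m
    return w, q

def depth_prior_loss(rendered, mono):
    """Mean squared error between rendered depth and the aligned monocular depth."""
    w, q = align_scale_shift(mono, rendered)
    return sum((w * m + q - r) ** 2 for m, r in zip(mono, rendered)) / len(mono)

def _unit(v):
    """Normalize a 3-vector to unit length."""
    length = math.sqrt(sum(c * c for c in v))
    return [c / length for c in v]

def normal_prior_loss(rendered_normals, mono_normals):
    """Mean of (L1 difference + 1 - cosine similarity) over unit normals."""
    total = 0.0
    for a, b in zip(rendered_normals, mono_normals):
        a, b = _unit(a), _unit(b)
        l1 = sum(abs(x - y) for x, y in zip(a, b))
        cos = sum(x * y for x, y in zip(a, b))
        total += l1 + (1.0 - cos)
    return total / len(rendered_normals)
```

In this sketch both terms are zero when the rendered geometry agrees with the prior (up to scale and shift for depth), so they penalize surfaces that explain the colors but contradict the predicted geometry.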

English Abstract

In recent years, with the rapid development of computer vision and computer graphics, indoor 3D reconstruction has become an active research topic. Indoor 3D reconstruction recovers the three-dimensional structure of indoor environments from two-dimensional images or point cloud data, and plays an important role in virtual reality, smart homes, cultural heritage preservation, indoor navigation, and other fields. Traditional geometry-based 3D reconstruction methods detect and match features across multiple views, estimate camera poses through Structure from Motion (SfM), densify the resulting point cloud, and then reconstruct the surface. Pose estimation in such methods may fail when feature matching is unreliable, causing the reconstruction to fail, and the reconstructed meshes may contain holes and many outliers.

Recently proposed reconstruction methods based on neural radiance fields can generate smooth, watertight object surfaces. In indoor reconstruction, however, the shape-radiance ambiguity of radiance fields prevents recovery of the true surface shape. Moreover, because indoor scenes contain large repetitive or solid-color regions, the poses obtained from traditional SfM may be inaccurate, which degrades the quality of indoor scene reconstruction.


This thesis explores the use of neural radiance fields for indoor scene reconstruction, aiming to improve the quality of indoor surface reconstruction and to reconstruct indoor scenes under inaccurate poses. The main contributions are as follows:

(1) This thesis presents a reconstruction method based on multi-resolution voxel grids with geometric priors. Depth and normal information predicted by a monocular geometric depth estimation model serves as a geometric prior that supervises indoor scene reconstruction, with the aim of eliminating the shape-radiance ambiguity of radiance fields. In the reconstruction network, the multi-resolution voxel grid structure is improved with a coarse-to-fine voxel grid optimization strategy and a cross-voxel-layer gradient computation approach. Experimental results show that the method significantly improves the quality of indoor surface reconstruction and of view synthesis.

(2) To address inaccurate camera pose estimation in indoor scenes, this thesis employs a residual neural network that learns pose residuals to refine the initial camera poses. During training, a projection ray distance loss measures the distance between the projection rays of corresponding pixels in adjacent images, and thereby supervises the pose optimization with local information. This improves the accuracy of camera pose estimation. By jointly optimizing the camera poses and the neural radiance field, the proposed method achieves surface reconstruction of indoor scenes under inaccurate poses.
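
The ray distance supervision in contribution (2) can be sketched as follows. This is a simplified illustration under stated assumptions rather than the thesis's formulation: it assumes a pinhole camera, camera-to-world poses, and the plain 3D distance between the two lines through a pair of matched pixels. All names (`pixel_ray`, `ray_distance`, `ray_distance_loss`) and the camera tuple layout `(fx, fy, cx, cy, R, t)` are hypothetical.

```python
import math

def cross(a, b):
    """Cross product of two 3-vectors."""
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def norm(a):
    return math.sqrt(dot(a, a))

def pixel_ray(u, v, fx, fy, cx, cy, R, t):
    """World-space ray through pixel (u, v).

    R (3x3, list of rows) and t (3-vector) are assumed to be camera-to-world.
    """
    d_cam = [(u - cx) / fx, (v - cy) / fy, 1.0]
    d_world = [dot(row, d_cam) for row in R]
    return list(t), d_world

def ray_distance(o1, d1, o2, d2, eps=1e-9):
    """Shortest distance between the two infinite lines carrying the rays."""
    w = [b - a for a, b in zip(o1, o2)]
    n = cross(d1, d2)
    n_len = norm(n)
    if n_len < eps * norm(d1) * norm(d2):
        # Near-parallel rays: use the point-to-line distance instead.
        return norm(cross(w, d1)) / norm(d1)
    return abs(dot(w, n)) / n_len

def ray_distance_loss(matches, cam_a, cam_b):
    """Mean ray distance over matched pixel pairs ((u_a, v_a), (u_b, v_b))."""
    total = 0.0
    for (ua, va), (ub, vb) in matches:
        oa, da = pixel_ray(ua, va, *cam_a)
        ob, db = pixel_ray(ub, vb, *cam_b)
        total += ray_distance(oa, da, ob, db)
    return total / len(matches)
```

With correct poses, the rays through two matched pixels pass through the same scene point and the distance is zero; a pose error pulls the rays apart. In this sketch the resulting loss depends only on the poses and the matches, which is why it can act as a geometric signal for pose refinement alongside the rendering loss.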

Keywords: Neural implicit representation; Indoor reconstruction; Geometric priors; Multi-resolution voxel grid; Pose optimization
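Subject area: Computer science and technology; Computer applications; Computer graphics
Discipline category: Engineering::Computer Science and Technology (engineering or science degrees may be awarded)
Language: Chinese
Representative thesis:
Seven major directions (sub-direction classification): 3D vision
State Key Laboratory planned direction classification: Multimodal collaborative cognition
Associated dataset requiring deposit:
Document type: Thesis
Item identifier: http://ir.ia.ac.cn/handle/173211/56595
Collection: Graduates_Master's Theses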
Recommended citation
GB/T 7714
刘湘龙. 基于神经辐射场的室内三维重建方法研究[D],2024.
Files in this item
File name/size | Document type | Version type | Access type | License
刘湘龙毕业论文-答辩后修改版.pdf (8855KB) | Thesis | | Restricted access | CC BY-NC-SA

Unless otherwise stated, all content in this system is protected by copyright, with all rights reserved.