Research on Challenging Problems in Online Real-Time Visual Localization and Mapping


	Research on Challenging Problems in Online Real-Time Visual Localization and Mapping
	Tang Fulin (唐付林)
	2020-05-27
Pages	120
Degree type	Doctoral
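Abstract (translated from Chinese)	In 3D computer vision, online real-time visual localization and mapping is a very important research direction. Its task is to build a 3D model of the environment with visual sensors and to determine the spatial position of those sensors within the environment. Because it is widely applied in robot localization and navigation, augmented reality, and autonomous driving, it has attracted considerable attention from researchers. The direction comprises many research tasks, including simultaneous localization and mapping (SLAM), online real-time visual localization and modeling of dynamic objects, and marker-based online real-time 6-DOF camera pose tracking. These tasks still face a series of challenging problems: SLAM cannot deliver both accuracy and speed at the same time; online real-time localization and modeling works poorly when both the target and the camera move; and existing markers and online real-time camera localization methods are not robust to image noise, image blur, or a camera far from the marker, and localize poorly under those conditions. This thesis studies these problems in depth. The main contributions are as follows:
• A new stereo visual SLAM framework that fuses the feature-based method and the direct method is proposed, achieving high accuracy together with high speed. In the front end, the direct method and a constant-velocity motion model predict a robust initial camera pose; the direct method is then used to project the local map and obtain 3D-2D correspondences; finally, the camera pose is optimized by minimizing the reprojection error over those correspondences. The front end makes the framework faster. In the back end, structure from motion (SFM) is used to compute the 3D map. When a new keyframe is inserted, new map points are generated by triangulation. To improve accuracy, the global map is optimized with bundle adjustment, which in particular also uses a stereo constraint. The back end makes the framework more accurate. Experiments show that, compared with ORBSLAM2 and SVO, the most popular methods internationally, the proposed framework improves accuracy and runs at more than 100 FPS on average using only a CPU.
• An online real-time visual localization and modeling method for dynamic cylinders is proposed. First, images containing a cylindrical object are captured, and the 3D model of the cylinder is reconstructed from its contour in the images and its projective invariance. Second, based on the reconstructed 3D model, the relative 6-DOF pose between the camera and the cylinder is tracked online in real time. During tracking, a linear P3P RANSAC method is proposed to reject outliers. Finally, to verify the accuracy of the 6-DOF camera poses computed online in real time, the computed poses are used to project a virtual 3D computer model into the real world and align it with the real cylinder, achieving augmented reality. Experiments show that, compared with the most advanced methods internationally, the proposed method improves both accuracy and speed, and its augmented reality results represent the advanced level of the field.
• A class of circular markers is designed, and an online real-time 6-DOF camera pose tracking method based on them is proposed. Using projective invariance on the designed markers, the 6-DOF camera pose is expressed analytically in a very compact form. A point-to-conic bundle adjustment method is then proposed to further optimize the analytically expressed pose; its objective function is built on a point-to-conic geometric distance. Because the method uses the edges of circles in the image and does not need a perspective-n-point (PnP) algorithm, the proposed tracking is more robust and accurate with respect to image noise, image blur, and a camera far from the marker. Experiments show that its localization accuracy surpasses the most popular methods internationally in the field (ARToolkitPlus, AprilTag2, and RUNETag), and that it reaches an average localization speed of 100 FPS on a CPU.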
English abstract	In 3D computer vision, online real-time visual localization and mapping is an important research direction whose task is to compute a 3D map of the environment and the spatial positions of the visual sensors. Because it has wide applications in robot localization and navigation, augmented reality, and autonomous driving, it has attracted great attention from researchers. This research direction includes simultaneous localization and mapping (SLAM), online real-time localization and modeling of dynamic objects, and online real-time 6-DOF camera pose tracking from markers. Challenging problems remain. For example, existing SLAM methods cannot achieve high accuracy and high speed at the same time, existing monocular SLAM does not perform well when both the camera and the objects are moving, and 6-DOF camera pose tracking from markers is neither robust nor accurate under noise, blur, and long distance. This thesis studies these problems, and the main contributions are as follows.
• A novel stereo visual SLAM framework that addresses accuracy and speed together is proposed. The framework makes full use of the advantages of key-feature-based multiple view geometry (MVG) and the direct formulation. At the front end, the system uses the direct formulation and a constant-velocity motion model to predict a robust initial pose, reprojects the local map to find 3D-2D correspondences by the direct formulation, and finally refines the pose by minimizing the reprojection error. This front-end process makes the system faster. At the back end, structure from motion (SFM) is used to estimate the 3D structure. When a new keyframe is inserted, new map points are generated by triangulation. To improve the accuracy of the proposed system, the global map is optimized by bundle adjustment, which in particular applies a stereo constraint. This back-end process makes the system more accurate. Experimental results show that, compared with the most popular methods, ORBSLAM2 and SVO, the proposed framework is more accurate and runs at more than 100 FPS on a CPU.
• An online real-time visual localization and modeling method for dynamic cylinders is proposed. First, images containing a cylindrical object are captured, and the 3D model of the object is reconstructed from its contour in the images and its projective invariance. Second, based on the reconstructed 3D model, the relative 6-DOF poses between the camera and the cylindrical object are tracked online. During tracking, a linear P3P RANSAC method is proposed to remove outliers. Finally, to verify the accuracy of the 6-DOF camera poses calculated online in real time, the calculated poses are used to project a virtual 3D computer model into the real world and align it with the real cylindrical object, achieving augmented reality (AR). Experimental results show that, compared with state-of-the-art methods, the proposed method achieves higher accuracy and higher speed, and its augmented reality results represent the advanced level of the field.
• A class of circular markers is designed, and an online real-time 6-DOF camera pose tracking method based on them is proposed. Using projective invariance on the designed circular markers, the 6-DOF camera pose is expressed analytically in a very compact form. The pose is then further optimized by a novel point-conic bundle adjustment based on a polar-n-direction geometric distance. The proposed method works from imaged circle edges and does not use PnP, which makes camera pose tracking robust and accurate with respect to image noise, image blur, and the distance from the camera to the marker. Experimental results show that the localization accuracy of the proposed method exceeds that of the most popular methods, such as ARToolkitPlus, AprilTag2, and RUNETag, and that the localization speed reaches an average of 100 FPS on a CPU.
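Keywords	online real-time localization and mapping; simultaneous localization and mapping; cylinder pose tracking; circular markers; 6-DOF camera pose
Language	Chinese
Seven major directions: sub-direction category	3D vision
Document type	Thesis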
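Item identifier	http://ir.ia.ac.cn/handle/173211/39211
Collection	Graduates_Doctoral dissertations
Corresponding author	Tang Fulin (唐付林)
Recommended citation (GB/T 7714)	Tang Fulin. Research on Challenging Problems in Online Real-Time Visual Localization and Mapping[D]. Institute of Automation, Chinese Academy of Sciences. University of Chinese Academy of Sciences, 2020.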
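
The abstract describes a front end that refines the camera pose by minimizing the reprojection error over 3D-2D correspondences. The following is a minimal sketch of that error term, assuming an undistorted pinhole camera. The function names, the intrinsics (`fx`, `fy`, `cx`, `cy`), and the sum-of-squares cost are illustrative assumptions and are not taken from the thesis.

```python
def project(point_w, R, t, fx, fy, cx, cy):
    """Project a 3D world point to pixel coordinates (assumed pinhole model)."""
    # Transform the point into the camera frame: X_c = R * X_w + t
    xc = [sum(R[i][j] * point_w[j] for j in range(3)) + t[i] for i in range(3)]
    if xc[2] <= 0:
        raise ValueError("point is not in front of the camera")
    # Perspective division followed by the intrinsic mapping
    u = fx * xc[0] / xc[2] + cx
    v = fy * xc[1] / xc[2] + cy
    return (u, v)


def total_reprojection_error(points_w, pixels, R, t, fx, fy, cx, cy):
    """Sum of squared pixel distances between observed and projected points."""
    total = 0.0
    for point_w, (u_obs, v_obs) in zip(points_w, pixels):
        u, v = project(point_w, R, t, fx, fy, cx, cy)
        total += (u - u_obs) ** 2 + (v - v_obs) ** 2
    return total
```

A pose refinement step would search for the `R` and `t` that make this cost as small as possible; the optimizer itself is outside the scope of this sketch.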

Files in this item
File name/size	Document type	Version type	Access type	License
论文-完整版-签名.pdf (12829 KB)	Thesis		Restricted access	CC BY-NC-SA