Knowledge Commons of Institute of Automation,CAS
Research on Deep-Learning-Based Visual Odometry and Visual Localization (基于深度学习的视觉里程计与视觉定位技术研究) | |
Wan Yiming (万一鸣) | |
2020-05-26 | |
Pages | 80 |
Degree type | Master's |
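Chinese abstract | Camera pose estimation is an important component of mobile robotics, autonomous navigation, and augmented reality. Pose estimation … |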
English abstract | Camera pose estimation is essential for robotics, autonomous driving, and augmented reality. Pose estimation can generally be divided into two kinds: absolute pose estimation and relative pose estimation. Given an RGB image, absolute pose estimation computes the camera pose in the global coordinate system; this is often called visual localization. Relative pose estimation estimates the pose between two consecutive … For visual odometry, to address the poor generalization of current deep neural networks, a novel odometry model based on multi-task learning is proposed. This model learns to estimate relative poses with optical flow prediction as an auxiliary task. Learning the two tasks simultaneously forces the network to exploit the relationship between them and helps it learn better motion features, which reduces the risk of overfitting. Experimental results indicate that the proposed method effectively improves generalization. Visual odometry is also easily disturbed by dynamic objects in the scene. To solve this problem, a novel odometry model that can perceive dynamic objects is proposed. This model estimates masks of dynamic objects via the epipolar constraint and reduces the weight of the photometric error in those areas. To address the overly smooth output of the recurrent neural network, a module named LCGR (Local Convolution and Global RNN) is proposed to enhance the local information of the image sequence while capturing global information. Experiments demonstrate that the proposed method improves the accuracy of relative pose estimation and makes the network more robust to scenes containing many dynamic objects. For visual localization, the network overfits easily because the training data are sparse. To solve this problem, a geometric data augmentation method is proposed. The proposed method first predicts depth maps for the input images in a semi-supervised way and then randomly synthesizes new views using the predicted depth maps. The training … |
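Language | Chinese |
Seven major research directions: sub-direction classification | 3D vision |
Document type | Thesis |
Item identifier | http://ir.ia.ac.cn/handle/173211/39139 |
Collection | State Key Laboratory of Multimodal Artificial Intelligence Systems_Robot Vision |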
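Recommended citation (GB/T 7714) | Wan Yiming. Research on Deep-Learning-Based Visual Odometry and Visual Localization [D]. Institute of Automation, Chinese Academy of Sciences. University of Chinese Academy of Sciences, 2020. |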
Files in this item |
File name/size | Document type | Version type | Access type | License |
基于深度学习的视觉里程计与视觉定位技术研 (11938KB) | Thesis | | Open access | CC BY-NC-SA |
Unless otherwise stated, all content in this system is protected by copyright, with all rights reserved.