BEVBert: Multimodal Map Pre-training for Language-guided Navigation
Dong An; Yuankai Qi; Yangguang Li; Yan Huang; Liang Wang; Tieniu Tan; Jing Shao
2023-10
Conference name: IEEE International Conference on Computer Vision
Proceedings title: Proceedings of the IEEE International Conference on Computer Vision
Conference date: 2023-10-2
Conference location: Paris, France
Abstract

Large-scale pre-training has shown promising results on the vision-and-language navigation (VLN) task. However, most existing pre-training methods employ discrete panoramas to learn visual-textual associations. This requires the model to implicitly correlate incomplete, duplicate observations within the panoramas, which may impair an agent’s spatial understanding. Thus, we propose a new map-based pre-training paradigm that is spatial-aware for use in VLN. Concretely, we build a local metric map to explicitly aggregate incomplete observations and remove duplicates, while modeling navigation dependency in a global topological map. This hybrid design can balance the demand of VLN for both short-term reasoning and long-term planning. Then, based on the hybrid map, we devise a pre-training framework to learn a multimodal map representation, which enhances spatial-aware cross-modal reasoning, thereby facilitating the language-guided navigation goal. Extensive experiments demonstrate the effectiveness of the map-based pre-training route for VLN, and the proposed method achieves state-of-the-art results on four VLN benchmarks.

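Indexed by: EI
Language: English
Representative paper:
Seven major directions, sub-direction category: Robot Perception and Decision-Making
State Key Laboratory planning direction category: Multimodal Collaborative Cognition
Associated dataset requiring deposit:
Document type: Conference paper
Item identifier: http://ir.ia.ac.cn/handle/173211/56611
Collection: Pattern Recognition Laboratory
Author affiliations: 1. Institute of Automation, Chinese Academy of Sciences
2. School of Future Technology, UCAS
3. Australian Institute for Machine Learning, University of Adelaide
4. SenseTime Research
5. Nanjing University
6. Shanghai AI Laboratory
Recommended citation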
GB/T 7714
Dong An, Yuankai Qi, Yangguang Li, et al. BEVBert: Multimodal Map Pre-training for Language-guided Navigation[C], 2023.
Files in this item
File name/size: bevbert.pdf (1722KB)
Document type: Conference paper
Version type:
Access type: Open access
License: CC BY-NC-SA
File name: bevbert.pdf
Format: Adobe PDF