Bi-Directional and Early Interaction Transformers for Bird’s Eye View Semantic Segmentation
Pan Cong1,2; He Yonghao3; Peng Junran4; Zhang Qian3; Sui Wei3; Zhang Zhaoxiang1,2,5
2023-06
Conference Name: Computer Vision and Pattern Recognition
Conference Date: June 18 – June 22, 2023
Conference Place: Vancouver Convention Center
Publisher: IEEE/CVF
Abstract

Bird's Eye View (BEV) semantic segmentation is a critical task in autonomous driving. However, existing Transformer-based methods have difficulty transforming Perspective View (PV) features to BEV because their interaction mechanisms are unidirectional and posterior. To address this issue, we propose a novel Bi-directional and Early Interaction Transformers framework named BAEFormer, consisting of (i) an early-interaction PV-BEV pipeline and (ii) a bi-directional cross-attention mechanism. Moreover, we find that the resolution of the image feature maps in the cross-attention module has a limited effect on the final performance. Based on this observation, we propose to enlarge the input images and downsample the multi-view image features for cross-interaction, further improving accuracy while keeping the amount of computation controllable. Our proposed method for BEV semantic segmentation achieves state-of-the-art performance at real-time inference speed on the nuScenes dataset, i.e., 38.9 mIoU at 45 FPS on a single A100 GPU.
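The compute argument in the abstract can be illustrated with a rough back-of-the-envelope sketch. The image sizes, strides, and BEV query grid below are assumptions chosen for illustration, not the paper's configuration, and the helper names are hypothetical. The sketch counts only the query-key pairs of one dense cross-attention layer, so it ignores the backbone, whose cost still grows with the input size.

```python
def image_tokens(num_cams, img_h, img_w, stride):
    """Image feature tokens over all cameras at a given total stride."""
    return num_cams * (img_h // stride) * (img_w // stride)


def cross_attention_pairs(num_bev_queries, num_image_tokens):
    """Entries in the query-key score matrix of one dense cross-attention layer."""
    return num_bev_queries * num_image_tokens


# Assumed values for illustration only (not taken from the paper).
NUM_CAMS = 6             # nuScenes provides six surround-view cameras
BEV_QUERIES = 25 * 25    # hypothetical coarse BEV query grid

# Baseline: small input, features at stride 16.
base = image_tokens(NUM_CAMS, 224, 480, stride=16)

# Input enlarged 2x per side, features kept at stride 16: 4x the tokens.
large_same_stride = image_tokens(NUM_CAMS, 448, 960, stride=16)

# Input enlarged 2x per side, features downsampled to stride 32 before
# cross-attention: the token count matches the baseline again.
large_downsampled = image_tokens(NUM_CAMS, 448, 960, stride=32)

for name, tokens in [
    ("base", base),
    ("large, same stride", large_same_stride),
    ("large, downsampled", large_downsampled),
]:
    print(name, tokens, cross_attention_pairs(BEV_QUERIES, tokens))
```

Under these assumptions, downsampling the features used for cross-interaction cancels the growth in attention cost caused by the larger input, which is one way to read "keeping the amount of computation controllable".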

Indexed By: EI
Sub direction classification: 3D Vision
Planning direction of the State Key Laboratory: Multi-dimensional environment perception
Document Type: Conference paper
Identifier: http://ir.ia.ac.cn/handle/173211/57377
Collection: Pattern Recognition Laboratory
Corresponding Author: Zhang Zhaoxiang
Affiliation:
1. National Laboratory of Pattern Recognition, Institute of Automation, Chinese Academy of Sciences
2. School of Future Technology, University of Chinese Academy of Sciences
3. Horizon Robotics
4. Huawei Inc.
5. Center for Artificial Intelligence and Robotics, HKISI CAS
First Author Affiliation: Chinese Acad Sci, Inst Automat, Natl Lab Pattern Recognit, Beijing 100190, Peoples R China
Corresponding Author Affiliation: Chinese Acad Sci, Inst Automat, Natl Lab Pattern Recognit, Beijing 100190, Peoples R China
Recommended Citation
GB/T 7714
Pan Cong, He Yonghao, Peng Junran, et al. Bi-Directional and Early Interaction Transformers for Bird’s Eye View Semantic Segmentation[C]: IEEE/CVF, 2023.
Files in This Item:
File Name/Size: Pan_BAEFormer_Bi-Dir (2215KB)
DocType: Conference paper
Access: Open access
License: CC BY-NC-SA
File name: Pan_BAEFormer_Bi-Directional_and_Early_Interaction_Transformers_for_Birds_Eye_View_CVPR_2023_paper.pdf
Format: Adobe PDF