PolarFormer: Multi-Camera 3D Object Detection with Polar Transformer
Jiang, Yanqin1,4; Zhang, Li2; Miao, Zhenwei5; Zhu, Xiatian6; Gao, Jin1,4; Hu, Weiming1,4,7; Jiang, Yu-Gang3
2023
Conference Name: 37th AAAI Conference on Artificial Intelligence, AAAI 2023
Conference Date: February 7, 2023 - February 14, 2023
Conference Place: Washington, DC, United States
Abstract

3D object detection in autonomous driving aims to reason about "what" and "where" the objects of interest are in a 3D world. Following the conventional wisdom of previous 2D object detection, existing methods often adopt the canonical Cartesian coordinate system with perpendicular axes. However, we conjecture that this does not fit the nature of the ego car's perspective, as each onboard camera perceives the world in the shape of a wedge intrinsic to the imaging geometry, with radial (non-perpendicular) axes. Hence, in this paper we advocate the exploitation of the Polar coordinate system and propose a new Polar Transformer (PolarFormer) for more accurate 3D object detection in the bird's-eye-view (BEV), taking as input only multi-camera 2D images. Specifically, we design a cross-attention based Polar detection head, without restriction on the shape of the input structure, to deal with irregular Polar grids. To tackle the unconstrained object scale variations along the Polar distance dimension, we further introduce a multi-scale Polar representation learning strategy. As a result, our model can make the best use of the Polar representation, rasterized by attending to the corresponding image observation in a sequence-to-sequence fashion subject to the geometric constraints. Thorough experiments on the nuScenes dataset demonstrate that our PolarFormer significantly outperforms state-of-the-art 3D object detection alternatives.
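
To make the coordinate change concrete, the following is a minimal sketch of mapping an ego-centred BEV point from Cartesian (x, y) to Polar (radius, azimuth) and locating it in a Polar grid. It is an illustration only, not the authors' implementation: the function names and the grid parameters `max_radius`, `num_radial_bins`, and `num_azimuth_bins` are assumptions chosen for the example.

```python
import math


def cartesian_to_polar(x, y):
    """Convert an ego-centred BEV point (x, y) to (radius, azimuth in radians)."""
    radius = math.hypot(x, y)
    azimuth = math.atan2(y, x)  # in (-pi, pi]
    return radius, azimuth


def polar_cell(x, y, max_radius=50.0, num_radial_bins=10, num_azimuth_bins=8):
    """Return (radial_index, azimuth_index) in a hypothetical Polar BEV grid.

    Returns None when the point lies at or beyond max_radius.
    All grid parameters are illustrative assumptions.
    """
    radius, azimuth = cartesian_to_polar(x, y)
    if radius >= max_radius:
        return None
    radial_index = int(radius / max_radius * num_radial_bins)
    # Shift the azimuth from (-pi, pi] into [0, 2*pi) before binning.
    fraction = (azimuth % (2 * math.pi)) / (2 * math.pi)
    azimuth_index = int(fraction * num_azimuth_bins) % num_azimuth_bins
    return radial_index, azimuth_index


if __name__ == "__main__":
    for point in [(12.0, 5.0), (-12.0, 5.0), (3.0, -20.0), (60.0, 0.0)]:
        print(point, polar_cell(*point))
```

With equal azimuth bins, a cell's arc length grows in proportion to its radius, so cells far from the ego car cover more ground than nearby ones. This is one way to picture the scale variation along the distance dimension that the abstract mentions.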

Indexed By: EI
Sub direction classification: Object detection, tracking, and recognition
Planning direction of the State Key Laboratory: Perception and cognition for embodied artificial intelligence systems
Document Type: Conference paper
Identifier: http://ir.ia.ac.cn/handle/173211/57503
Collection: State Key Laboratory of Multimodal Artificial Intelligence Systems_Video Content Security
Affiliation:
1. NLPR, Institute of Automation, Chinese Academy of Sciences, China
2. School of Data Science, Fudan University, China
3. School of Computer Science, Fudan University, China
4. School of Artificial Intelligence, University of Chinese Academy of Sciences, China
5. Alibaba DAMO Academy
6. Surrey Institute for People-Centred Artificial Intelligence, CVSSP, University of Surrey, United Kingdom
7. School of Information Science and Technology, ShanghaiTech University, China
First Author Affiliation: Chinese Acad Sci, Inst Automat, Natl Lab Pattern Recognit, Beijing 100190, Peoples R China
Recommended Citation
GB/T 7714
Jiang, Yanqin,Zhang, Li,Miao, Zhenwei,et al. PolarFormer: Multi-Camera 3D Object Detection with Polar Transformer[C],2023.
Files in This Item:
File Name/Size | DocType | Version | Access | License
25185-Article Text-2 (14499KB) | Conference paper | | Open access | CC BY-NC-SA
File name: 25185-Article Text-29248-1-2-20230626.pdf
Format: Adobe PDF
Items in the repository are protected by copyright, with all rights reserved, unless otherwise indicated.