CANet: Co-attention network for RGB-D semantic segmentation | |
Zhou, Hao1,3,4; Qi, Lu2 | |
Source Publication | PATTERN RECOGNITION
ISSN | 0031-3203 |
Publication Date | 2022-04-01 |
Volume | 124 |
Pages | 11 |
Corresponding Author | Huang, Hai(haihus@163.com) |
Abstract | Incorporating depth (D) information into RGB images has proven effective and robust for semantic segmentation. However, fusing the two modalities is not trivial because of their inherent discrepancy in physical meaning: RGB encodes color information, whereas D encodes depth information. In this paper, we propose a co-attention network (CANet) to build sound interaction between RGB and depth features. The key part of CANet is the co-attention fusion part, which comprises three modules. Specifically, the position and channel co-attention fusion modules adaptively fuse RGB and depth features in the spatial and channel dimensions, respectively. An additional fusion co-attention module further integrates the outputs of the position and channel co-attention fusion modules to obtain a more representative feature, which is used for the final semantic segmentation. Extensive experiments demonstrate the effectiveness of CANet in fusing RGB and depth features, achieving state-of-the-art performance on two challenging RGB-D semantic segmentation datasets, i.e., NYUDv2 and SUN-RGBD. (c) 2021 Elsevier Ltd. All rights reserved. |
Keyword | RGB-D ; Multi-modal fusion ; Co-attention ; Semantic segmentation |
DOI | 10.1016/j.patcog.2021.108468 |
WOS Keyword | FEATURES |
Indexed By | SCI |
Language | English |
Funding Project | National Natural Science Foundation (NSFC) of China[61633009] ; National Natural Science Foundation (NSFC) of China[61973301] ; National Natural Science Foundation (NSFC) of China[61972020] ; National Natural Science Foundation (NSFC) of China[51579053] ; National Natural Science Foundation (NSFC) of China[51779058] ; Beijing Science and Technology Plan Project[Z181100008918018] ; National Key R&D Program of China[2016YFC0300801] ; National Key R&D Program of China[2017YFB1300202] ; National Key R&D Program of China[2020AAA0108902] |
Funding Organization | National Natural Science Foundation (NSFC) of China ; Beijing Science and Technology Plan Project ; National Key R&D Program of China |
WOS Research Area | Computer Science ; Engineering |
WOS Subject | Computer Science, Artificial Intelligence ; Engineering, Electrical & Electronic |
WOS ID | WOS:000736972200013 |
Publisher | ELSEVIER SCI LTD |
Sub direction classification | Image and video processing and analysis |
Document Type | Journal article |
Identifier | http://ir.ia.ac.cn/handle/173211/47131 |
Collection | State Key Laboratory of Management and Control for Complex Systems_Robot Theory and Application |
Corresponding Author | Huang, Hai |
Affiliation | 1.Harbin Engn Univ, Natl Key Lab Sci & Technol Underwater Vehicle, Harbin, Peoples R China 2.Chinese Univ Hong Kong, Hong Kong, Peoples R China 3.Chinese Acad Sci, Inst Automat, State Key Lab Management & Control Complex Syst, Beijing, Peoples R China 4.Univ Chinese Acad Sci, Beijing, Peoples R China 5.Jihua Lab, Foshan, Peoples R China |
First Author Affiliation | Institute of Automation, Chinese Academy of Sciences |
Recommended Citation GB/T 7714 | Zhou, Hao, Qi, Lu, Huang, Hai, et al. CANet: Co-attention network for RGB-D semantic segmentation[J]. PATTERN RECOGNITION, 2022, 124: 11. |
APA | Zhou, Hao, Qi, Lu, Huang, Hai, Yang, Xu, Wan, Zhaoliang, & Wen, Xianglong. (2022). CANet: Co-attention network for RGB-D semantic segmentation. PATTERN RECOGNITION, 124, 11. |
MLA | Zhou, Hao, et al. "CANet: Co-attention network for RGB-D semantic segmentation". PATTERN RECOGNITION 124 (2022): 11. |
Files in This Item: | There are no files associated with this item. |
Items in the repository are protected by copyright, with all rights reserved, unless otherwise indicated.