Cross-modal Learning for Event-based Semantic Segmentation via Attention Soft Alignment
Chuyun Xie1,2; Wei Gao1,2; Ren Guo1,2
Journal: IEEE ROBOTICS AND AUTOMATION LETTERS
ISSN: 2377-3766
Year: 2024
Volume: 9, Issue: 3, Pages: 2359-2366
Corresponding Author: Gao, Wei (wei.gao@ia.ac.cn)
Article Type: Journal Article
Abstract

By demonstrating robustness in scenarios characterized by high-speed motion and extreme lighting changes, event cameras hold great potential for enhancing the perception reliability of autonomous driving systems. However, because event data are novel and sparse, progress on event-based algorithms is hindered by the scarcity of high-quality labeled datasets. In this work, we propose CMESS (Cross-Modal learning for Event-based Semantic Segmentation), which eliminates the need for event labels by transferring a model from labeled image datasets (the source domain) to unlabeled event datasets (the target domain) via unsupervised domain adaptation (UDA). Whereas existing UDA methods require hard alignment of visually consistent embeddings, our approach achieves soft alignment via cross-attention and then augments it with knowledge distillation to convey fine-grained source knowledge to the target domain. Additionally, we introduce an event-driven bidirectional self-labeling method to generate weakly supervised signals for event-only datasets. These designs enable cross-modal learning without per-pixel paired frames or online reconstruction. Experimental results show that our method outperforms existing state-of-the-art methods in both UDA and supervised settings on common evaluation benchmarks, making it a universal framework for further unlabeled event-related visual tasks.
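As a rough illustration of the abstract's central idea, the sketch below shows how cross-attention can softly align target-domain (event) embeddings against source-domain (image) embeddings, with a distillation-style L2 penalty pulling target features toward their soft-aligned source counterparts. All function names, shapes, and the loss form here are hypothetical illustrations of the general technique, not the authors' actual CMESS implementation.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax over the given axis.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def cross_attention_align(event_feats, image_feats):
    """Softly align event-domain queries to image-domain keys/values.

    event_feats: (N_e, d) target-domain token embeddings (queries)
    image_feats: (N_i, d) source-domain token embeddings (keys/values)
    Returns (N_e, d): for each event token, a convex combination of
    image features weighted by scaled dot-product attention.
    """
    d = event_feats.shape[-1]
    attn = softmax(event_feats @ image_feats.T / np.sqrt(d), axis=-1)  # (N_e, N_i)
    return attn @ image_feats  # soft (not one-to-one) alignment

# Toy example: 4 event tokens softly matched against 6 image tokens.
rng = np.random.default_rng(0)
ev = rng.standard_normal((4, 8))
im = rng.standard_normal((6, 8))
aligned = cross_attention_align(ev, im)

# A distillation-style objective would then penalize the gap between
# target features and their soft-aligned source counterparts:
distill_loss = np.mean((ev - aligned) ** 2)
```

Unlike hard alignment, no event token is forced to match a single image embedding; the attention weights distribute each match over all source tokens, which is the "soft alignment" the abstract contrasts with prior UDA methods.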

Keywords: Deep Learning for Visual Perception, Transfer Learning, Semantic Scene Understanding
Discipline: Engineering
DOI: 10.1109/LRA.2024.3355648
Indexed In: SCI
Language: English
Funding Project: National Key R&D Program of China
Funder: National Key R&D Program of China
WOS Research Area: Robotics
WOS Category: Robotics
WOS Accession Number: WOS:001167554600009
Publisher: IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC
Seven Major Directions (Sub-direction): Multimodal Intelligence
State Key Laboratory Planning Direction: Virtual-Real Fusion and Transfer Learning
Identifier: http://ir.ia.ac.cn/handle/173211/56682
Collection: State Key Laboratory of Multimodal Artificial Intelligence Systems - 3D Visual Computing
Affiliations: 1. Institute of Automation, Chinese Academy of Sciences
2. School of Artificial Intelligence, University of Chinese Academy of Sciences
First Author Affiliation: Institute of Automation, Chinese Academy of Sciences
Corresponding Author Affiliation: Institute of Automation, Chinese Academy of Sciences
Recommended Citation:
GB/T 7714: Chuyun Xie, Wei Gao, Ren Guo. Cross-modal Learning for Event-based Semantic Segmentation via Attention Soft Alignment[J]. IEEE ROBOTICS AND AUTOMATION LETTERS, 2024, 9(3): 2359-2366.
APA: Chuyun Xie, Wei Gao, & Ren Guo. (2024). Cross-modal Learning for Event-based Semantic Segmentation via Attention Soft Alignment. IEEE ROBOTICS AND AUTOMATION LETTERS, 9(3), 2359-2366.
MLA: Chuyun Xie, et al. "Cross-modal Learning for Event-based Semantic Segmentation via Attention Soft Alignment". IEEE ROBOTICS AND AUTOMATION LETTERS 9.3 (2024): 2359-2366.
Files in This Item:
File Name/Size | Document Type | Version | Access | License
P1mgp.pdf (5267 KB) | Journal Article | Author's Accepted Manuscript | Open Access | CC BY-NC-SA