Dense Attention: A Densely Connected Attention Mechanism for Vision Transformer
Nannan Li 1,2; Yaran Chen 1,2; Dongbin Zhao 1,2
2023-05
Conference: 2023 International Joint Conference on Neural Networks (IJCNN)
Conference dates: June 18-23, 2023
Conference location: Queensland, Australia
Abstract

Recently, the Vision Transformer has demonstrated impressive capability in image understanding, and the multi-head self-attention mechanism is fundamental to its formidable performance. However, self-attention incurs a high computational cost, so training the model requires powerful computational resources or extra time. This paper designs a novel and efficient attention mechanism, Dense Attention, to overcome this problem. Dense attention attends to features from multiple views through a dense connection paradigm. Benefiting from this attention over comprehensive features, dense attention can i) remarkably strengthen the image representation of the model, and ii) partially replace the multi-head self-attention mechanism to allow model slimming. To verify its effectiveness, we implement dense attention in prevalent Vision Transformer models, including the non-pyramid architecture DeiT and the pyramid architecture Swin Transformer. Experimental results on ImageNet classification show that dense attention indeed improves performance: +1.8%/+1.3% for DeiT-T/S and +0.7%/+1.2% for Swin-T/S, respectively. Dense attention also demonstrates its transferability on the CIFAR10 and CIFAR100 recognition benchmarks, with classification accuracies of 98.9% and 89.6%, respectively. Furthermore, dense attention can mitigate the performance loss caused by pruning attention heads. Code and pre-trained models will be available.
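For readers unfamiliar with the dense connection paradigm the abstract refers to, below is a minimal PyTorch sketch of what a densely connected attention block could look like. It is an illustrative assumption, not the paper's actual Dense Attention: the class name DenseAttention, the num_layers parameter, and the sigmoid gating are hypothetical choices; only the DenseNet-style concatenation of features from all preceding stages reflects the paradigm described in the abstract.

# Hypothetical sketch of a densely connected attention block.
# Not the paper's method: the gating and layer structure are assumptions;
# the dense concatenation of all earlier feature "views" is the key idea.
import torch
import torch.nn as nn


class DenseAttention(nn.Module):
    """Attend over token features aggregated from all preceding stages."""

    def __init__(self, dim: int, num_layers: int = 3):
        super().__init__()
        # Stage i sees the channel-wise concatenation of all i earlier
        # outputs plus the input, then projects back to `dim`.
        self.layers = nn.ModuleList(
            [nn.Linear(dim * (i + 1), dim) for i in range(num_layers)]
        )
        self.act = nn.GELU()
        # A lightweight learned gate over the aggregated features stands in
        # for (part of) multi-head self-attention, echoing "model slimming".
        self.gate = nn.Sequential(nn.Linear(dim, dim), nn.Sigmoid())

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        feats = [x]  # (B, N, dim) token features
        for layer in self.layers:
            dense_in = torch.cat(feats, dim=-1)      # concat all previous views
            feats.append(self.act(layer(dense_in)))  # produce a new view
        out = feats[-1]
        return x + self.gate(out) * out              # gated residual update


if __name__ == "__main__":
    tokens = torch.randn(2, 196, 384)  # a batch of ViT-style token sequences
    print(DenseAttention(384)(tokens).shape)  # torch.Size([2, 196, 384])

The block is drop-in shaped like a transformer sub-layer (same input and output dimensions), which is consistent with the abstract's claim that dense attention can partially replace multi-head self-attention in DeiT- and Swin-style backbones.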

Indexed by: EI
Language: English
Sub-direction classification (seven major directions): Computational Intelligence
State Key Laboratory planning direction: Intelligent Computing and Learning
Associated dataset to be deposited:
Document type: Conference paper
Item identifier: http://ir.ia.ac.cn/handle/173211/52213
Collection: State Key Laboratory of Multimodal Artificial Intelligence Systems_Deep Reinforcement Learning
Author affiliations:
1. The State Key Laboratory of Multimodal Artificial Intelligence Systems, Institute of Automation, Chinese Academy of Sciences
2. School of Artificial Intelligence, University of Chinese Academy of Sciences
First author affiliation: Institute of Automation, Chinese Academy of Sciences
Recommended citation (GB/T 7714):
Nannan Li, Yaran Chen, Dongbin Zhao. Dense Attention: A Densely Connected Attention Mechanism for Vision Transformer[C], 2023.
Files in this item:
a861-li final.pdf (3683 KB), conference paper, open access, CC BY-NC-SA