Efficient spatiotemporal context modeling for action recognition
Authors (affiliation numbers in parentheses): Cao, Congqi (1,3,4); Lu, Yue (1); Zhang, Yifan (2,5,6); Jiang, Dongmei (1); Zhang, Yanning (1)
Journal: NEUROCOMPUTING
ISSN: 0925-2312
Publication date: 2023-08-07
Volume: 545; Pages: 13
Corresponding author: Cao, Congqi (congqi.cao@nwpu.edu.cn)
Abstract: Contextual information is essential in action recognition. However, local operations struggle to model relations between distant elements, and directly computing dense relations between every pair of points imposes a heavy computation and memory burden. Inspired by the recurrent 2D criss-cross attention (RCCA-2D) used in image segmentation, we propose a recurrent 3D criss-cross attention (RCCA-3D) that factorizes the global relation map into sparse relation maps, modeling long-range spatiotemporal context at minor cost for video-based action recognition. Specifically, we first propose a 3D criss-cross attention (CCA-3D) module. Unlike CCA-2D, which works only in space, CCA-3D captures the spatiotemporal relationships between points that lie on the same line along the width, height, and time directions. However, simply replacing the two CCA-2Ds in RCCA-2D with CCA-3Ds cannot model the spatiotemporal context in videos. We therefore further duplicate the CCA-3D within a recurrent mechanism that propagates relations from points on a line to a plane and finally to the whole spatiotemporal space. To adapt RCCA-3D to action recognition, we propose a novel recurrent structure rather than directly extending the original 2D structure to 3D. In the experiments, we thoroughly analyze different RCCA-3D structures and verify that the proposed structure is better suited to action recognition. We also compare RCCA-3D with non-local attention, showing that RCCA-3D requires 25% fewer parameters and 30% fewer FLOPs while achieving even higher accuracy. Finally, equipped with RCCA-3D, three networks achieve improved and leading performance on five RGB-based and skeleton-based datasets.
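The factorization described in the abstract can be illustrated with a small NumPy sketch. This is not the authors' implementation or their proposed recurrent structure: it simply stacks the same criss-cross pass several times. The function names (cca3d, rcca3d, influence_count), the projection matrices wq, wk, wv, the shared weights across passes, the residual connection, and the choice to keep the query position only in the time line are all assumptions made for illustration. The helper counts how many output positions react to a change at a single input position, which is one way to see relations spreading from a line to a plane to the whole volume.

```python
import numpy as np


def cca3d(x, wq, wk, wv):
    """One criss-cross pass over a (T, H, W, C) feature map.

    Each position attends only to positions on its time, height and
    width lines: T + H + W - 2 keys instead of T * H * W.
    """
    T, H, W, _ = x.shape
    q, k, v = x @ wq, x @ wk, x @ wv

    # Affinity between each query and the keys on its three axis lines.
    e_t = np.einsum("thwc,shwc->thws", q, k)  # (T, H, W, T)
    e_h = np.einsum("thwc,tgwc->thwg", q, k)  # (T, H, W, H)
    e_w = np.einsum("thwc,thuc->thwu", q, k)  # (T, H, W, W)

    # The query position lies on all three lines; keep it once (time line).
    e_h = np.where(np.eye(H, dtype=bool)[None, :, None, :], -np.inf, e_h)
    e_w = np.where(np.eye(W, dtype=bool)[None, None, :, :], -np.inf, e_w)

    # Joint softmax over the concatenated criss-cross neighbourhood.
    e = np.concatenate([e_t, e_h, e_w], axis=-1)
    e = e - e.max(axis=-1, keepdims=True)
    a = np.exp(e)
    a = a / a.sum(axis=-1, keepdims=True)
    a_t, a_h, a_w = np.split(a, [T, T + H], axis=-1)

    out = (
        np.einsum("thws,shwc->thwc", a_t, v)
        + np.einsum("thwg,tgwc->thwc", a_h, v)
        + np.einsum("thwu,thuc->thwc", a_w, v)
    )
    return x + out  # residual connection (an assumption of this sketch)


def rcca3d(x, wq, wk, wv, passes=3):
    """Plain sequential stacking with shared weights (illustrative only)."""
    for _ in range(passes):
        x = cca3d(x, wq, wk, wv)
    return x


def influence_count(passes, shape=(3, 4, 5), channels=8, seed=0):
    """Count output positions that change when the input at (0, 0, 0) changes."""
    rng = np.random.default_rng(seed)
    x = rng.normal(size=shape + (channels,))
    wq = 0.1 * rng.normal(size=(channels, 4))
    wk = 0.1 * rng.normal(size=(channels, 4))
    wv = 0.3 * rng.normal(size=(channels, channels))
    x_perturbed = x.copy()
    x_perturbed[0, 0, 0] += 1.0
    diff = rcca3d(x_perturbed, wq, wk, wv, passes) - rcca3d(x, wq, wk, wv, passes)
    return int((np.abs(diff).max(axis=-1) > 1e-9).sum())


for r in (1, 2, 3):
    print(r, influence_count(r))
```

For a 3 x 4 x 5 volume, one pass should reach only the T + H + W - 2 = 10 positions on the three lines through the perturbed point, two passes the positions sharing a plane with it, and three passes all 60 positions. In the same setting, a dense relation map would need 60 x 60 = 3600 pairwise affinities per pass, whereas the criss-cross form needs 60 x 10 = 600.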
Keywords: Action recognition; Long-range context modeling; Spatiotemporal feature map; Attention module; Relation
DOI: 10.1016/j.neucom.2023.126289
Indexed by: SCI
Language: English
Funding projects: National Natural Science Foundation of China [U19B2037]; National Natural Science Foundation of China [62273347]; National Natural Science Foundation of China [61906155]; National Key R&D Program of China [2020AAA0106900]; Key R&D Project in Shaanxi Province [2023-YBGY-240]; Young Talent Fund of Association for Science and Technology in Shaanxi, China [20220117]
Funding organizations: National Natural Science Foundation of China; National Key R&D Program of China; Key R&D Project in Shaanxi Province; Young Talent Fund of Association for Science and Technology in Shaanxi, China
WOS research area: Computer Science
WOS subject category: Computer Science, Artificial Intelligence
WOS accession number: WOS:001000901900001
Publisher: ELSEVIER
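Document type: Journal article
Item identifier: http://ir.ia.ac.cn/handle/173211/53433
Collection: Center for Research on Intelligent Perception and Computing (智能感知与计算研究中心)
Corresponding author: Cao, Congqi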
Author affiliations:
1. Northwestern Polytech Univ, Sch Comp Sci, Xian, Peoples R China
2. Chinese Acad Sci, Inst Automat, Natl Lab Pattern Recognit, Beijing, Peoples R China
3. Northwestern Polytech Univ, Natl Engn Lab Integrated Aerosp Ground Ocean Big D, Xian 710129, Peoples R China
4. Northwestern Polytech Univ, Sch Comp Sci, Shaanxi Prov Key Lab Speech & Image Informat Proc, Xian 710129, Peoples R China
5. Chinese Acad Sci, Inst Automat, Natl Lab Pattern Recognit, Beijing 100190, Peoples R China
6. Univ Chinese Acad Sci, Beijing 100190, Peoples R China
Recommended citation:
GB/T 7714: Cao, Congqi, Lu, Yue, Zhang, Yifan, et al. Efficient spatiotemporal context modeling for action recognition[J]. NEUROCOMPUTING, 2023, 545: 13.
APA: Cao, Congqi, Lu, Yue, Zhang, Yifan, Jiang, Dongmei, & Zhang, Yanning. (2023). Efficient spatiotemporal context modeling for action recognition. NEUROCOMPUTING, 545, 13.
MLA: Cao, Congqi, et al. "Efficient spatiotemporal context modeling for action recognition". NEUROCOMPUTING 545 (2023): 13.