Knowledge Commons of Institute of Automation,CAS
Attentional Composition Networks for Long-Tailed Human Action Recognition | |
Wang, Haoran1; Wang, Yajie1; Yu, Baosheng2; Zhan, Yibing3; Yuan, Chunfeng4 | |
Journal | ACM TRANSACTIONS ON MULTIMEDIA COMPUTING COMMUNICATIONS AND APPLICATIONS |
ISSN | 1551-6857 |
Publication year | 2024 |
Volume | 20, Issue: 1, Pages: 18 |
Corresponding author | Wang, Haoran (wanghaoran@ise.neu.edu.cn) |
Abstract | The problem of long-tailed visual recognition has been receiving increasing research attention. However, the long-tailed distribution problem remains underexplored for video-based visual recognition. To address this issue, in this article we propose a compositional learning based solution for video-based human action recognition. Our method, named Attentional Composition Networks (ACN), first learns verb-like and preposition-like components, then shuffles these components in the feature space to generate samples that augment the data for the tail classes. Specifically, during training, we represent each action video by a graph that captures the spatial-temporal relations (edges) among detected human/object instances (nodes). Then, ACN utilizes the position information to decompose each action into a set of verb and preposition representations using the edge features in the graph. After that, the verb and preposition features from different videos are combined via an attention structure to synthesize feature representations for tail classes. This way, we can enrich the data for the tail classes and consequently improve the action recognition for these classes. To evaluate compositional human action recognition, we further contribute a new human action recognition dataset, namely NEU-Interaction (NEU-I). Experimental results on both Something-Something V2 and the proposed NEU-I demonstrate the effectiveness of the proposed method for long-tailed, few-shot, and zero-shot problems in human action recognition. Source code and the NEU-I dataset are available at https://github.com/YajieW99/ACN. |
Keywords | Compositional learning; long tail; few-shot; zero-shot; action recognition |
DOI | 10.1145/3603253 |
Indexed by | SCI |
Language | English |
Funding projects | Major Science and Technology Innovation 2030 New Generation Artificial Intelligence key project [2021ZD0111700] ; Fundamental Research Funds for the Central Universities of China [N2304012] ; National Natural Science Foundation of China [61773117] ; National Natural Science Foundation of China [61972397] ; National Natural Science Foundation of China [62276061] ; National Natural Science Foundation of China [62002090] |
Funding organizations | Major Science and Technology Innovation 2030 New Generation Artificial Intelligence key project ; Fundamental Research Funds for the Central Universities of China ; National Natural Science Foundation of China |
WOS research area | Computer Science |
WOS subject categories | Computer Science, Information Systems ; Computer Science, Software Engineering ; Computer Science, Theory & Methods |
WOS accession number | WOS:001080441800008 |
Publisher | ASSOC COMPUTING MACHINERY |
Document type | Journal article |
Item identifier | http://ir.ia.ac.cn/handle/173211/52979 |
Collection | State Key Laboratory of Multimodal Artificial Intelligence Systems |
Corresponding author | Wang, Haoran |
Author affiliations | 1. Northeastern Univ, Coll Informat Sci & Engn, Shenyang 110819, Peoples R China; 2. Univ Sydney, Sch Comp Sci, Fac Engn, Darlington, NSW 2008, Australia; 3. JD Explore Acad, Beijing 100176, Peoples R China; 4. Chinese Acad Sci, Inst Automat, Natl Lab Pattern Recognit, Beijing 100190, Peoples R China; 5. Southeast Univ, Sch Automat, Nanjing, Peoples R China |
Recommended citation (GB/T 7714) | Wang, Haoran, Wang, Yajie, Yu, Baosheng, et al. Attentional Composition Networks for Long-Tailed Human Action Recognition[J]. ACM TRANSACTIONS ON MULTIMEDIA COMPUTING COMMUNICATIONS AND APPLICATIONS, 2024, 20(1): 18. |
APA | Wang, Haoran, Wang, Yajie, Yu, Baosheng, Zhan, Yibing, Yuan, Chunfeng, & Yang, Wankou. (2024). Attentional Composition Networks for Long-Tailed Human Action Recognition. ACM TRANSACTIONS ON MULTIMEDIA COMPUTING COMMUNICATIONS AND APPLICATIONS, 20(1), 18. |
MLA | Wang, Haoran, et al. "Attentional Composition Networks for Long-Tailed Human Action Recognition". ACM TRANSACTIONS ON MULTIMEDIA COMPUTING COMMUNICATIONS AND APPLICATIONS 20.1 (2024): 18. |
Files in this item | There are no files associated with this item. |
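
The abstract describes combining verb and preposition features from different videos through an attention structure to synthesize tail-class features. This record gives no architectural details, so the following is only a minimal sketch of that general idea using generic scaled dot-product attention, not the ACN implementation. The names `attend`, `compose`, `query`, the feature dimension, and the random stand-in features are all hypothetical; the authors' actual code is in the repository linked in the abstract.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def attend(query, candidates):
    """Scaled dot-product attention of one query over candidate vectors.

    Returns the attention-weighted average of the candidates and the weights.
    """
    scale = math.sqrt(len(query))
    weights = softmax([dot(query, c) / scale for c in candidates])
    pooled = [
        sum(w * c[i] for w, c in zip(weights, candidates))
        for i in range(len(query))
    ]
    return pooled, weights


def compose(query, verb_feats, prep_feats):
    """Hypothetical composition step (an assumption, not the paper's design):
    pool verb and preposition features separately, then concatenate them."""
    verb, verb_w = attend(query, verb_feats)
    prep, prep_w = attend(query, prep_feats)
    return verb + prep, verb_w, prep_w  # list concatenation


# Random stand-ins for features that would come from different videos.
random.seed(0)
DIM = 8


def rand_vec(d):
    return [random.gauss(0.0, 1.0) for _ in range(d)]


verb_feats = [rand_vec(DIM) for _ in range(4)]  # verb-like components
prep_feats = [rand_vec(DIM) for _ in range(3)]  # preposition-like components
query = rand_vec(DIM)  # e.g. a tail-class prototype (assumed)

synthetic, verb_w, prep_w = compose(query, verb_feats, prep_feats)
```

Here `synthetic` is a single feature vector of length `2 * DIM` that could serve as an extra training sample for a tail class. Concatenation is just one simple way to merge the two pooled components.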