Knowledge Commons of Institute of Automation, CAS
SlowFastFormer for 3D human pose estimation | |
Zhou Lu1; Chen Yingying1; Wang Jinqiao1,2,3,4 | |
Journal | Computer Vision and Image Understanding |
ISSN | 1077-3142 |
Publication year | 2024 |
Volume | 243, Issue: 243, Pages: 103992 |
Corresponding author | Chen, Yingying (yingying.chen@nlpr.ia.ac.cn) |
Abstract | 3D human pose estimation in videos aims to locate human joints in 3D space given a temporal
sequence. Motion information and skeleton context are two significant elements for pose estimation in videos.
In this paper, we propose SlowFastFormer (slow-fast transformer), a network in which two branches with different
input rates encode these two kinds of context. In the slow branch, skeleton context
is learned at a lower frame rate. In the fast branch, motion information is captured at a higher frame rate.
Through these two branches, the two kinds of context are encoded separately. We fuse the two branches
at a later stage to fully utilize the skeleton context and motion information. A blending module is then
developed to promote message exchange among the branches. In the blending stage, the different kinds of
context information are exchanged, which enhances the feature representation. Lastly, a hierarchical
supervision scheme is tailored in which predictions at different levels are inferred in a progressive manner. Our
approach achieves competitive performance with lower computational complexity on several benchmarks, i.e.,
Human3.6M, MPI-INF-3DHP and HumanEva-I. |
Keywords | SlowFastFormer ; Transformer ; Blending ; 3D human pose estimation ; Hierarchical supervision |
DOI | 10.1016/j.cviu.2024.103992 |
Indexed by | SCI |
Language | English |
Funding projects | National Key Research and Development Program of China[2022ZD0160601] ; National Natural Science Foundation of China[62206283] ; National Natural Science Foundation of China[62276260] ; National Natural Science Foundation of China[62076235] ; National Natural Science Foundation of China[62176254] |
Funding organizations | National Key Research and Development Program of China ; National Natural Science Foundation of China |
WOS research areas | Computer Science ; Engineering |
WOS categories | Computer Science, Artificial Intelligence ; Engineering, Electrical & Electronic |
WOS accession number | WOS:001218515300001 |
Publisher | ACADEMIC PRESS INC ELSEVIER SCIENCE |
Seven major directions: sub-direction classification | Image and video processing and analysis |
State Key Laboratory planned direction classification | Visual information processing |
Paper-associated dataset to be deposited | No |
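Document type | Journal article |
Item identifier | http://ir.ia.ac.cn/handle/173211/57150 |
Collection | Zidong Taichu Foundation Model Research Center |
Corresponding author | Chen Yingying |
Author affiliations | 1. Foundation Model Research Center, Institute of Automation, Chinese Academy of Sciences; 2. School of Artificial Intelligence, University of Chinese Academy of Sciences; 3. Wuhan AI Research; 4. Peng Cheng Laboratory |
First author affiliation | Institute of Automation, Chinese Academy of Sciences |
Corresponding author affiliation | Institute of Automation, Chinese Academy of Sciences |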
Recommended citation GB/T 7714 | Zhou Lu, Chen Yingying, Wang Jinqiao. SlowFastFormer for 3D human pose estimation[J]. Computer Vision and Image Understanding, 2024, 243(243): 103992. |
APA | Zhou Lu, Chen Yingying, & Wang Jinqiao. (2024). SlowFastFormer for 3D human pose estimation. Computer Vision and Image Understanding, 243(243), 103992. |
MLA | Zhou Lu, et al. "SlowFastFormer for 3D human pose estimation." Computer Vision and Image Understanding 243.243 (2024): 103992. |
Files in this item |
File name/size | Document type | Version | Access type | License |
SlowFastFormer for 3(989KB) | Journal article | Author accepted manuscript | Open access | CC BY-NC-SA |
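
The abstract describes two branches that read the same sequence at different input rates. As a minimal sketch of how two such inputs could be produced, assuming plain strided subsampling, the following may help; the function name, the stride values, and the clip shape are all hypothetical and are not taken from the paper or the authors' implementation.

```python
def two_rate_inputs(poses_2d, low_rate_stride=4, high_rate_stride=1):
    """Derive two input streams from one 2D pose sequence.

    poses_2d: list of frames, each frame a list of (x, y) joint coordinates.
    Both stride values are illustrative assumptions, not the paper's settings.
    """
    # A larger stride keeps fewer frames, i.e. a lower input frame rate.
    low_rate = poses_2d[::low_rate_stride]
    # A smaller stride keeps more frames, i.e. a higher input frame rate.
    high_rate = poses_2d[::high_rate_stride]
    return low_rate, high_rate


# Hypothetical clip: 27 frames, 17 joints, one (x, y) pair per joint.
clip = [[(0.0, 0.0) for _ in range(17)] for _ in range(27)]
low_rate, high_rate = two_rate_inputs(clip)
print(len(low_rate), len(high_rate))
```

Each stream would then go to its own branch before the later fusion and blending stages mentioned in the abstract; how those stages are built is not specified in this record, so they are left out of the sketch.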