Dual-Path Transformer for 3D Human Pose Estimation

Authors	Zhou Lu (1); Chen Yingying (1); Wang Jinqiao (1,2,3,4)
Journal	IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY
Year	2024
Volume	34; Issue: 5; Pages: 3260-3270
Abstract	Video-based 3D human pose estimation has made great progress, but learning a precise 2D-to-3D projection remains difficult in hard cases. Multi-level human knowledge and motion information are two key elements for addressing these challenges: the former encodes human structure information spatially, and the latter captures motion change temporally. Inspired by this, we propose DualFormer (dual-path transformer), a network that encodes multiple human contexts and motion detail to perform spatial-temporal modeling. First, motion information, which depicts the movement change of the human body, is embedded to provide an explicit motion prior for the transformer module. Second, a dual-path transformer framework is proposed to model long-range dependencies of both the joint sequence and the limb sequence. Parallel context embedding is performed first, and a cross transformer block is then appended to promote interaction between the two paths, which greatly improves feature robustness. Predictions at multiple levels can thus be acquired simultaneously. Lastly, we employ a weighted distillation technique to accelerate the convergence of the dual-path framework. We conduct extensive experiments on three benchmarks: Human3.6M, MPI-INF-3DHP and HumanEva-I. We mainly compute MPJPE, P-MPJPE, PCK and AUC to evaluate the effectiveness of the proposed approach, and our work achieves competitive results compared with state-of-the-art approaches. Specifically, the MPJPE on Human3.6M is reduced to 42.8 mm, which is 1.5 mm lower than PoseFormer, demonstrating the efficacy of the proposed approach.
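Indexed by	SCI
Seven major directions, sub-direction classification	Image and video processing and analysis
State Key Laboratory planned research direction	Visual information processing
Paper-associated dataset requiring deposit	No
Document type	Journal article
Item identifier	http://ir.ia.ac.cn/handle/173211/57148
Collection	Zidong Taichu Foundation Model Research Center (紫东太初大模型研究中心)
Corresponding author	Chen Yingying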
Author affiliations	1. Foundation Model Research Center, Institute of Automation, Chinese Academy of Sciences; 2. School of Artificial Intelligence, University of Chinese Academy of Sciences; 3. Wuhan AI Research; 4. Peng Cheng Laboratory
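First author affiliation	Institute of Automation, Chinese Academy of Sciences
Corresponding author affiliation	Institute of Automation, Chinese Academy of Sciences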
Recommended citation (GB/T 7714)	Zhou Lu, Chen Yingying, Wang Jinqiao. Dual-Path Transformer for 3D Human Pose Estimation[J]. IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, 2024, 34(5): 3260-3270.
APA	Zhou Lu, Chen Yingying, & Wang Jinqiao. (2024). Dual-Path Transformer for 3D Human Pose Estimation. IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, 34(5), 3260-3270.
MLA	Zhou Lu, et al. "Dual-Path Transformer for 3D Human Pose Estimation." IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY 34.5 (2024): 3260-3270.

Files in this item
File name/size	Document type	Version	Access type	License
Dual-Path_Transforme (2410KB)	Journal article	Author accepted manuscript	Open access	CC BY-NC-SA
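
The abstract reports results in MPJPE and PCK without defining them. As a rough illustration only, the sketch below computes both under their commonly used definitions: MPJPE as the mean Euclidean distance between predicted and ground-truth joints, and PCK as the fraction of joints whose error falls within a threshold. The function names, the array layout (frames, joints, 3), the 17-joint toy skeleton, and the 150 mm threshold are assumptions made for this example; they are not taken from the paper or its code.

```python
import numpy as np


def mpjpe(pred, gt):
    """Mean per-joint position error.

    pred, gt: arrays of shape (frames, joints, 3) in the same unit (e.g. mm).
    Returns the mean Euclidean distance over all frames and joints.
    """
    return float(np.linalg.norm(pred - gt, axis=-1).mean())


def pck(pred, gt, threshold=150.0):
    """Fraction of joints whose error is at most `threshold`.

    The 150.0 default is an assumed value for illustration.
    """
    errors = np.linalg.norm(pred - gt, axis=-1)
    return float((errors <= threshold).mean())


# Toy data (hypothetical): 2 frames, 17 joints, every joint off by 30 mm on x.
gt = np.zeros((2, 17, 3))
pred = gt.copy()
pred[..., 0] += 30.0

print(mpjpe(pred, gt))  # 30.0
print(pck(pred, gt))    # 1.0
```

P-MPJPE follows the same idea but first rigidly aligns the prediction to the ground truth, and AUC integrates PCK over a range of thresholds; both are omitted here to keep the sketch short.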