Knowledge Commons of Institute of Automation, CAS
BSTG-Trans: A Bayesian Spatial-Temporal Graph Transformer for Long-term Pose Forecasting | |
Shentong Mo [2]; Xin M (辛淼) [1] | |
Journal | IEEE Transactions on Multimedia |
ISSN | 1520-9210 |
Publication year | 2023 |
Volume | Early Access; Issue: Early Access; Pages: Early Access |
Corresponding author | Xin, Miao (miao.xin@ia.ac.cn) |
Abstract | Human pose forecasting, which aims to predict future body poses, is an important task in computer vision. However, long-term pose forecasting is particularly challenging because modeling long-range dependencies across the spatial-temporal level is hard for joint-based representations. Another challenge is uncertainty prediction, since future prediction is not a deterministic process. In this work, we present a novel Bayesian Spatial-Temporal Graph Transformer (BSTG-Trans) for predicting accurate, diverse, and uncertain future poses. First, we apply a spatial-temporal graph transformer as an encoder and a temporal-spatial graph transformer as a decoder to model the long-range spatial-temporal dependencies across pose joints and generate long-term future body poses. Furthermore, we propose a Bayesian sampling module for uncertainty quantization of diverse future poses. Finally, a novel uncertainty estimation metric, namely Uncertainty Absolute Error, is introduced for measuring both the accuracy and the uncertainty of each predicted future pose. We achieve state-of-the-art performance against other baselines on Human3.6M and HumanEva-I in terms of accuracy, diversity, and uncertainty for long-term pose forecasting. Moreover, our comprehensive ablation studies demonstrate the effectiveness and generalization of each module proposed in BSTG-Trans. Code and models are available at https://github.com/stoneMo/BSTG-Trans. |
Keywords | long-term forecasting ; spatial-temporal graph transformer ; Bayesian transformer ; uncertainty estimation |
DOI | 10.1109/TMM.2023.3269219 |
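Indexed by | SCI |
Language | English |
Funding project | National Natural Science Foundation of China |
Funding organization | National Natural Science Foundation of China |
WOS research area | Computer Science ; Telecommunications |
WOS subject category | Computer Science, Information Systems ; Computer Science, Software Engineering ; Telecommunications |
WOS accession number | WOS:001157873000019 |
Publisher | IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC |
Seven major directions: sub-direction classification | Intelligent interaction |
State Key Laboratory planned direction classification | Human-machine hybrid intelligence |
Paper-associated dataset requiring deposit | No |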
Document type | Journal article |
Item identifier | http://ir.ia.ac.cn/handle/173211/51503 |
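Collection | Laboratory of Cognition and Decision-Making for Complex Systems ; Institute of Automation, Chinese Academy of Sciences ; Laboratory of Cognition and Decision-Making for Complex Systems_Efficient Intelligent Computing and Learning |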
Corresponding author | Xin M (辛淼) |
Author affiliations | 1. Institute of Automation, Chinese Academy of Sciences 2. Carnegie Mellon University |
Corresponding author affiliation | Institute of Automation, Chinese Academy of Sciences |
Recommended citation GB/T 7714 | Shentong Mo, Xin M. BSTG-Trans: A Bayesian Spatial-Temporal Graph Transformer for Long-term Pose Forecasting[J]. IEEE Transactions on Multimedia, 2023, Early Access(Early Access): Early Access. |
APA | Shentong Mo, & Xin M. (2023). BSTG-Trans: A Bayesian Spatial-Temporal Graph Transformer for Long-term Pose Forecasting. IEEE Transactions on Multimedia, Early Access(Early Access), Early Access. |
MLA | Shentong Mo, et al. "BSTG-Trans: A Bayesian Spatial-Temporal Graph Transformer for Long-term Pose Forecasting". IEEE Transactions on Multimedia Early Access.Early Access (2023): Early Access. |
Files in this item |
File name/size | Document type | Version type | Access type | License |
BSTG-Trans_A_Bayesia(2209KB) | Journal article | Author accepted manuscript | Open access | CC BY-NC-SA |