Gesture recognition based on deep deformable 3D convolutional neural networks
Zhang, Yifan1,2,3; Shi, Lei1,2,3; Wu, Yi4; Cheng, Ke1,2,3; Cheng, Jian1,2,3,5; Lu, Hanqing1,2,3
Journal: PATTERN RECOGNITION
ISSN: 0031-3203
2020-11-01
Issue: 107; Pages: 12
Abstract

Dynamic gesture recognition, which plays an essential role in human-computer interaction, has been widely investigated but not yet fully addressed. The challenge is threefold: 1) modeling the spatial appearance and the temporal evolution simultaneously; 2) handling interference from varied and complex backgrounds; 3) meeting the requirement of real-time processing. In this paper, we address these challenges by proposing a novel deep deformable 3D convolutional neural network for end-to-end learning, which not only achieves impressive accuracy on challenging datasets but also meets the requirement of real-time processing. We propose three types of very deep 3D CNNs for gesture recognition, which directly model spatiotemporal information through their inherent hierarchical structure. To eliminate background interference, a light-weight spatiotemporal deformable convolutional module is specially designed to augment the spatiotemporal sampling locations of the 3D convolution by learning additional offsets from the preceding feature map. It not only diversifies the shape of the convolution kernel to better fit the appearance of the hands and arms, but also helps the models pay more attention to the discriminative frames in the video sequence. The proposed method is evaluated on three challenging datasets, EgoGesture, Jester and Chalearn-IsoGD, and achieves state-of-the-art performance on all of them. Our model ranked first on Jester's official leaderboard at the time of submission. The code and the trained models are released to facilitate communication and future work(1). (C) 2020 Elsevier Ltd. All rights reserved.
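The idea of augmenting the sampling locations of a 3D convolution with additional offsets can be illustrated with a minimal sketch. This is not the authors' released code: the function names `trilinear_sample` and `deformable_conv3d_at`, the single-channel nested-list volume, and the zero padding outside the volume are all assumptions made for illustration. In the paper the offsets are learned from the preceding feature map; here they are simply passed in as given values.

```python
import math

def trilinear_sample(volume, t, y, x):
    """Read volume[t][y][x] at fractional coordinates by trilinear
    interpolation. Positions outside the volume contribute zero."""
    T, H, W = len(volume), len(volume[0]), len(volume[0][0])
    t0, y0, x0 = math.floor(t), math.floor(y), math.floor(x)
    value = 0.0
    for dt in (0, 1):
        for dy in (0, 1):
            for dx in (0, 1):
                ti, yi, xi = t0 + dt, y0 + dy, x0 + dx
                if 0 <= ti < T and 0 <= yi < H and 0 <= xi < W:
                    # Weight falls off linearly with distance on each axis.
                    w = (1 - abs(t - ti)) * (1 - abs(y - yi)) * (1 - abs(x - xi))
                    value += w * volume[ti][yi][xi]
    return value

def deformable_conv3d_at(volume, weights, offsets, center):
    """One output value of a deformable 3D convolution at `center`.

    weights: {(dt, dy, dx): weight} for each kernel tap (hypothetical layout).
    offsets: {(dt, dy, dx): (ot, oy, ox)} extra displacement per tap;
             missing taps fall back to the regular grid position.
    """
    ct, cy, cx = center
    out = 0.0
    for (dt, dy, dx), w in weights.items():
        ot, oy, ox = offsets.get((dt, dy, dx), (0.0, 0.0, 0.0))
        # Sampling location = regular grid position + offset.
        out += w * trilinear_sample(volume, ct + dt + ot, cy + dy + oy, cx + dx + ox)
    return out

# Toy volume whose value encodes its own coordinates: 100*t + 10*y + x.
volume = [[[100 * t + 10 * y + x for x in range(4)] for y in range(4)] for t in range(4)]
weights = {(0, 0, 0): 1.0}  # a single centre tap keeps the arithmetic easy to follow

print(deformable_conv3d_at(volume, weights, {}, (1, 1, 1)))  # 111.0
print(deformable_conv3d_at(volume, weights, {(0, 0, 0): (0.5, 0.0, 0.25)}, (1, 1, 1)))  # 161.25
```

With zero offsets the tap reads the regular grid position, as in an ordinary 3D convolution. A temporal offset of 0.5 and a horizontal offset of 0.25 move the tap between frames and pixels, which is how offsets along the time axis could let a kernel attend to different frames.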

Keywords: Gesture recognition; Spatiotemporal deformable convolution; Spatiotemporal convolutional neural network
DOI: 10.1016/j.patcog.2020.107416
Keywords [WOS]: DATASET ; FUSION ; TIME
Indexed by: SCI
Language: English
Funding projects: NSFC[61876182] ; NSFC[61872364] ; NSFC[61876086] ; Jiangsu Frontier Technology Basic Research Project[BK20192004]
Funding organizations: NSFC ; Jiangsu Frontier Technology Basic Research Project
WOS research areas: Computer Science ; Engineering
WOS categories: Computer Science, Artificial Intelligence ; Engineering, Electrical & Electronic
WOS accession number: WOS:000552866000006
Publisher: ELSEVIER SCI LTD
Seven major directions, sub-direction classification: Image and video processing and analysis
Citation statistics
Times cited: 30 [WOS]
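Document type: Journal article
Identifier: http://ir.ia.ac.cn/handle/173211/40290
Collections: Zidong Taichu Large Model Research Center_Image and Video Analysis
Laboratory of Cognition and Decision Intelligence for Complex Systems_Efficient Intelligent Computing and Learning
Corresponding author: Zhang, Yifan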
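Affiliations:
1. Chinese Acad Sci, Inst Automat, NLPR, Beijing, Peoples R China
2. Chinese Acad Sci, Inst Automat, AIRIA, Beijing, Peoples R China
3. Univ Chinese Acad Sci, Sch Artificial Intelligence, Beijing, Peoples R China
4. Wormpex AI Res, Bellevue, WA USA
5. CAS Ctr Excellence Brain Sci & Intelligence Techn, Beijing, Peoples R China
First author affiliation: National Laboratory of Pattern Recognition; Institute of Automation, Chinese Academy of Sciences
Corresponding author affiliation: National Laboratory of Pattern Recognition; Institute of Automation, Chinese Academy of Sciences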
Recommended citation
GB/T 7714
Zhang, Yifan,Shi, Lei,Wu, Yi,et al. Gesture recognition based on deep deformable 3D convolutional neural networks[J]. PATTERN RECOGNITION,2020(107):12.
APA Zhang, Yifan,Shi, Lei,Wu, Yi,Cheng, Ke,Cheng, Jian,&Lu, Hanqing.(2020).Gesture recognition based on deep deformable 3D convolutional neural networks.PATTERN RECOGNITION(107),12.
MLA Zhang, Yifan,et al."Gesture recognition based on deep deformable 3D convolutional neural networks".PATTERN RECOGNITION .107(2020):12.
Files in this item
File name/size; Document type; Version type; Access type; License
Online Version.pdf (1310KB); Journal article; Author accepted manuscript; Open access; CC BY-NC-SA