Spatiotemporal Distilled Dense-Connectivity Network for Video Action Recognition
Wangli Hao; Zhaoxiang Zhang
Source Publication: Pattern Recognition
2019
Volume: 10, Issue: 20, Pages: 100-130
Abstract

Two-stream convolutional neural networks show great promise for action recognition tasks. However, most two-stream approaches train the appearance and motion subnetworks independently, which may degrade performance because the two streams do not interact. To overcome this limitation, we propose a Spatiotemporal Distilled Dense-Connectivity Network (STDDCN) for video action recognition. The network combines knowledge distillation with dense connectivity (adapted from DenseNet). With this architecture, we aim to explore interaction strategies between the appearance and motion streams at different levels of the hierarchy. Specifically, block-level dense connections between the appearance and motion pathways enable spatiotemporal interaction at the feature representation layers. In addition, knowledge distillation between the two streams (each treated as a student) and their final fusion (treated as the teacher) allows both streams to interact at the high-level layers. This architecture allows STDDCN to gradually obtain effective hierarchical spatiotemporal features, and it can be trained end-to-end. Finally, extensive ablation studies on two benchmark datasets, UCF101 and HMDB51, validate the effectiveness and generalization of our model, which also achieves promising performance.
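The distillation idea in the abstract (two student streams learning from their fused prediction) can be illustrated with a minimal sketch. This is not the authors' implementation: the averaging fusion in `fuse_logits`, the temperature value, the KL-based loss, and all function names are assumptions chosen for illustration only, following the commonly used soft-target formulation of knowledge distillation.

```python
import math


def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax over a list of class logits."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions of equal length."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))


def fuse_logits(appearance_logits, motion_logits):
    """Hypothetical late fusion: element-wise average of the two streams."""
    return [(a + m) / 2.0 for a, m in zip(appearance_logits, motion_logits)]


def distillation_losses(appearance_logits, motion_logits, temperature=2.0):
    """Return (appearance_loss, motion_loss).

    The fused prediction acts as the teacher and each stream as a student.
    Each loss is T^2 * KL(teacher || student) on temperature-softened outputs.
    """
    teacher = softmax(fuse_logits(appearance_logits, motion_logits), temperature)
    scale = temperature ** 2
    loss_appearance = scale * kl_divergence(
        teacher, softmax(appearance_logits, temperature))
    loss_motion = scale * kl_divergence(
        teacher, softmax(motion_logits, temperature))
    return loss_appearance, loss_motion


if __name__ == "__main__":
    # Made-up logits for a 3-class toy problem.
    app_loss, mot_loss = distillation_losses([2.0, 0.5, 0.1], [0.2, 1.5, 0.3])
    print(app_loss, mot_loss)
```

In a setup like this, each stream's distillation term would typically be added to its ordinary classification loss, so that both streams are pulled toward the fused prediction during end-to-end training. How the paper weights or combines these terms is not stated in the abstract.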

 

Keywords: Two-stream; Action Recognition; Dense-connectivity; Knowledge Distillation
Indexed By: SCI
Document Type: Journal article
Identifier: http://ir.ia.ac.cn/handle/173211/23347
Collection: National Laboratory of Pattern Recognition; Center of Research on Intelligent Perception and Computing
Corresponding Author: Zhaoxiang Zhang
Affiliation: 1. Center of Research on Intelligent Perception and Computing (CRIPAC), National Laboratory of Pattern Recognition (NLPR), Institute of Automation, Chinese
2.Center for Excellence in Brain Science and Intelligence Technology (CEBSIT)
3.University of Chinese Academy of Sciences (UCAS)
Recommended Citation
GB/T 7714: Wangli Hao, Zhaoxiang Zhang. Spatiotemporal Distilled Dense-Connectivity Network for Video Action Recognition[J]. Pattern Recognition, 2019, 10(20): 100-130.
APA: Wangli Hao, & Zhaoxiang Zhang. (2019). Spatiotemporal Distilled Dense-Connectivity Network for Video Action Recognition. Pattern Recognition, 10(20), 100-130.
MLA: Wangli Hao, et al. "Spatiotemporal Distilled Dense-Connectivity Network for Video Action Recognition". Pattern Recognition 10.20 (2019): 100-130.
Files in This Item:
File Name/Size | DocType | Version | Access | License
Spatiotemporal disti(2496KB) | Journal article | Author accepted manuscript | Open access | CC BY-NC-SA
File name: Spatiotemporal distilled denseConnectivity network for video action recognition.pdf
Format: Adobe PDF

Items in the repository are protected by copyright, with all rights reserved, unless otherwise indicated.