Spatiotemporal Distilled Dense-Connectivity Network for Video Action Recognition

Authors	Wangli Hao; Zhaoxiang Zhang
Journal	Pattern Recognition
Year	2019
Volume	10 Issue: 20 Pages: 100-130
Abstract	Two-stream convolutional neural networks show great promise for action recognition. However, most two-stream approaches train the appearance and motion subnetworks independently, which may degrade performance because the two streams do not interact. To overcome this limitation, we propose a Spatiotemporal Distilled Dense-Connectivity Network (STDDCN) for video action recognition. The network combines knowledge distillation with dense connectivity (adapted from DenseNet). With this architecture, we explore interaction strategies between the appearance and motion streams at different levels of the hierarchy. Specifically, block-level dense connections between the appearance and motion pathways enable spatiotemporal interaction at the feature representation layers. In addition, knowledge distillation between the two streams (each treated as a student) and their final fusion (treated as the teacher) lets the streams interact at the high-level layers. This architecture allows STDDCN to gradually obtain effective hierarchical spatiotemporal features, and the network can be trained end-to-end. Extensive ablation studies on two benchmark datasets, UCF101 and HMDB51, validate the effectiveness and generalization of our model, which also achieves promising performance.
Keywords	Two-stream Action Recognition Dense-connectivity Knowledge Distillation
Indexed By	SCI
Seven Major Directions: Sub-direction Classification	Multimodal Intelligence
Document Type	Journal article
Identifier	http://ir.ia.ac.cn/handle/173211/23347
Collection	National Laboratory of Pattern Recognition, Pattern Recognition Laboratory
Corresponding Author	Zhaoxiang Zhang
Affiliations	1. Center of Research on Intelligent Perception and Computing (CRIPAC), National Laboratory of Pattern Recognition (NLPR), Institute of Automation, Chinese 2. Center for Excellence in Brain Science and Intelligence Technology (CEBSIT) 3. University of Chinese Academy of Sciences (UCAS)
Recommended Citation (GB/T 7714)	Wangli Hao, Zhaoxiang Zhang. Spatiotemporal Distilled Dense-Connectivity Network for Video Action Recognition[J]. Pattern Recognition, 2019, 10(20): 100-130.
APA	Wangli Hao, & Zhaoxiang Zhang. (2019). Spatiotemporal Distilled Dense-Connectivity Network for Video Action Recognition. Pattern Recognition, 10(20), 100-130.
MLA	Wangli Hao, et al. "Spatiotemporal Distilled Dense-Connectivity Network for Video Action Recognition". Pattern Recognition 10.20 (2019): 100-130.

Files in This Item
File Name/Size	Document Type	Version	Access	License
Spatiotemporal disti (2496KB)	Journal article	Author accepted manuscript	Open access	CC BY-NC-SA
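
The abstract describes knowledge distillation in which each stream acts as a student and the fusion of the two streams acts as the teacher. The following is a minimal sketch of that idea in plain Python, not the authors' implementation. It assumes a standard temperature-scaled soft-target loss (KL divergence between teacher and student distributions) and a simple averaging fusion; the record does not state the paper's actual loss, fusion rule, or temperature. The function names `fuse_logits` and `distillation_loss` and all numbers are hypothetical.

```python
import math


def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax over a list of class scores."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def kl_divergence(p, q):
    """KL(p || q) for two discrete distributions with q > 0 everywhere."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def fuse_logits(appearance_logits, motion_logits):
    """Assumed late fusion: average the two streams' class scores."""
    return [(a + m) / 2.0 for a, m in zip(appearance_logits, motion_logits)]


def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """Soft-target loss: KL(teacher || student) at a shared temperature.

    The T^2 factor is a common convention that keeps gradient magnitudes
    comparable across temperatures; the paper's weighting may differ.
    """
    teacher_probs = softmax(teacher_logits, temperature)
    student_probs = softmax(student_logits, temperature)
    return temperature ** 2 * kl_divergence(teacher_probs, student_probs)


# Toy class scores for one clip over three action classes (made-up numbers).
appearance = [2.0, 0.5, -1.0]
motion = [0.5, 2.5, -0.5]
teacher = fuse_logits(appearance, motion)

# Each stream is a student that is pulled toward the fused teacher.
loss_appearance = distillation_loss(appearance, teacher)
loss_motion = distillation_loss(motion, teacher)
print(loss_appearance, loss_motion)
```

In a real training setup, these two terms would typically be added to each stream's ordinary classification loss, so that the fused prediction guides both streams while all parts are trained jointly.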