Stacked Memory Network for Video Summarization
Wang, Junbo; Wang, Wei; Wang, Zhiyong; Wang, Liang; Feng, Dagan; Tan, Tieniu
Conference Name: ACM International Conference on Multimedia
Conference Date: 2019-10
Conference Place: Nice, France

In recent years, supervised video summarization has made promising progress with methods based on recurrent neural networks (RNNs), which treat video summarization as a sequence-to-sequence learning problem and exploit temporal dependencies among video frames across variable ranges. However, RNNs are limited in modelling long-term temporal dependencies when summarizing videos with thousands of frames, because their memory storage units are restricted. In this paper, we therefore propose a stacked memory network (SMN) that explicitly models long-range dependencies among video frames so that redundancy in the generated summaries can be minimized. SMN consists of two key components, Long Short-Term Memory (LSTM) layers and memory layers, where each LSTM layer is augmented with an external memory layer. In particular, we stack multiple LSTM layers and memory layers hierarchically to integrate the representations learned by prior layers. By combining the hidden states of the LSTM layers with the read representations of the memory layers, SMN derives more accurate video summaries for individual video frames. Compared with existing RNN-based methods, SMN is particularly good at capturing long temporal dependencies among frames while adding few training parameters. Experimental results on two widely used public benchmark datasets, SumMe and TVSum, demonstrate that the proposed model clearly outperforms a number of state-of-the-art methods under various settings.
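
The stacking described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. The abstract does not say how the memory is written or read, so the sketch makes several assumptions: each layer's memory simply holds that layer's hidden states, reads use dot-product attention, the hidden state and read vector are combined by concatenation, biases are omitted, and the weights are random and untrained. All names (`LSTMCell`, `memory_read`, `smn_forward`) are hypothetical.

```python
import math
import random

def matvec(W, x):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * v for w, v in zip(row, x)) for row in W]

def rand_matrix(rows, cols, rng, scale=0.1):
    return [[rng.uniform(-scale, scale) for _ in range(cols)] for _ in range(rows)]

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def softmax(scores):
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

class LSTMCell:
    """Standard LSTM cell without bias terms (omitted for brevity)."""
    def __init__(self, input_dim, hidden_dim, rng):
        # One weight matrix per gate, applied to the concatenation [x; h]:
        # i = input gate, f = forget gate, o = output gate, g = candidate.
        self.W = {g: rand_matrix(hidden_dim, input_dim + hidden_dim, rng) for g in "ifog"}

    def step(self, x, h, c):
        xh = x + h  # list concatenation
        i = [sigmoid(z) for z in matvec(self.W["i"], xh)]
        f = [sigmoid(z) for z in matvec(self.W["f"], xh)]
        o = [sigmoid(z) for z in matvec(self.W["o"], xh)]
        g = [math.tanh(z) for z in matvec(self.W["g"], xh)]
        c_new = [fk * ck + ik * gk for fk, ck, ik, gk in zip(f, c, i, g)]
        h_new = [ok * math.tanh(ck) for ok, ck in zip(o, c_new)]
        return h_new, c_new

def memory_read(query, memory):
    """Dot-product attention read over memory slots (an assumed read rule)."""
    weights = softmax([sum(q * m for q, m in zip(query, slot)) for slot in memory])
    return [sum(w * slot[k] for w, slot in zip(weights, memory)) for k in range(len(query))]

def smn_forward(frames, num_layers=2, hidden_dim=4, seed=0):
    """Return one score in (0, 1) per frame from stacked LSTM + memory layers."""
    rng = random.Random(seed)
    inputs = frames
    for _ in range(num_layers):
        cell = LSTMCell(len(inputs[0]), hidden_dim, rng)
        h, c = [0.0] * hidden_dim, [0.0] * hidden_dim
        hidden = []
        for x in inputs:
            h, c = cell.step(x, h, c)
            hidden.append(h)
        memory = hidden  # assumption: memory stores this layer's hidden states
        # Combine hidden state and read vector; the result feeds the next layer.
        inputs = [h_t + memory_read(h_t, memory) for h_t in hidden]
    w_out = [rng.uniform(-0.1, 0.1) for _ in range(len(inputs[0]))]
    return [sigmoid(sum(w * v for w, v in zip(w_out, feat))) for feat in inputs]

# Toy input: 5 frames with 3-dimensional features (made-up data).
demo_rng = random.Random(1)
frames = [[demo_rng.random() for _ in range(3)] for _ in range(5)]
scores = smn_forward(frames)
```

Because every frame can attend to every memory slot in a single read, the read vector gives each layer direct access to distant frames, which is the kind of long-range dependency the abstract says a plain RNN struggles to retain.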

Sub direction classification: Image and Video Processing and Analysis
Document Type: Conference paper
Affiliation:
1. Center for Research on Intelligent Perception and Computing, National Laboratory of Pattern Recognition, Institute of Automation, Chinese Academy of Sciences
2. University of Chinese Academy of Sciences
3. School of Computer Science, The University of Sydney
Recommended Citation
GB/T 7714
Wang, Junbo, Wang, Wei, Wang, Zhiyong, et al. Stacked Memory Network for Video Summarization[C], 2019.
Files in This Item:
File Name/Size: MM2019.pdf (700KB)
DocType: Conference paper
Access: Open access
License: CC BY-NC-SA

Items in the repository are protected by copyright, with all rights reserved, unless otherwise indicated.