Three-Dimensional Attention-Based Deep Ranking Model for Video Highlight Detection
Jiao, Yifan [1]; Li, Zhetao [2]; Huang, Shucheng [1]; Yang, Xiaoshan [3,4]; Liu, Bin [5]; Zhang, Tianzhu [3,4]
Journal: IEEE TRANSACTIONS ON MULTIMEDIA
Publication date: 2018-10
Volume: 20, Issue: 10, Pages: 2693-2705
Abstract
The video highlight detection task is to localize key
elements (moments of major or special interest to the user) in a video.
Most existing highlight detection approaches extract features
from the video segment as a whole, without considering how
local features differ temporally and spatially. Because video
content is complex, such mixed features can degrade
the final highlight prediction. Temporally, not all
frames are worth watching, because some contain only the
background of the environment, without humans or other moving
objects. Spatially, likewise, not all regions in each
frame are highlights, especially when the background is heavily
cluttered. To solve this problem, we propose a novel
three-dimensional (3-D) (spatial+temporal) attention model that
can automatically localize the key elements in a video without any
extra supervised annotations. Specifically, the proposed attention
model produces attention weights of local regions along both the
spatial and temporal dimensions of the video segment. The regions
of key elements in the video are strengthened with large weights,
yielding a more effective feature of the video segment from which to
predict the highlight score. The proposed 3-D attention scheme can
be easily integrated into a conventional end-to-end deep ranking
model that aims to learn a deep neural network to compute the
highlight score of each video segment. Extensive experimental
results on the YouTube and SumMe datasets demonstrate that the
proposed approach achieves significant improvement over state-of-
the-art methods. With the proposed 3-D attention model, video
highlights can be accurately retrieved in spatial and temporal
dimensions without human supervision in several domains, such
as gymnastics, parkour, skating, skiing, surfing, and dog activities,
on the public datasets.
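
The idea described in the abstract (attention weights over local regions in space and time, a weighted segment feature, and a ranking objective over highlight scores) can be illustrated with a minimal sketch. The abstract does not give the attention function, the scoring network, or the loss, so everything below is an assumption made for illustration: a linear attention score normalized by a softmax over all (t, y, x) positions, a linear highlight scorer, and a pairwise hinge ranking loss. The names `attention_pool`, `highlight_score`, `pairwise_ranking_loss`, `w_att`, and `w_score` are hypothetical and do not come from the paper.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a flat list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(features, w_att):
    """Pool local features with weights normalized jointly over time and space.

    features: dict mapping (t, y, x) -> feature vector (list of floats)
    w_att:    hypothetical attention parameters, same length as a feature
    Returns the pooled segment feature and the per-position weights.
    """
    keys = sorted(features)
    # Assumed attention score: a plain dot product with w_att.
    scores = [sum(a * b for a, b in zip(w_att, features[k])) for k in keys]
    weights = softmax(scores)
    pooled = [0.0] * len(w_att)
    for k, w in zip(keys, weights):
        for i, value in enumerate(features[k]):
            pooled[i] += w * value
    return pooled, dict(zip(keys, weights))


def highlight_score(pooled, w_score):
    """Assumed linear scorer standing in for the deep ranking network."""
    return sum(a * b for a, b in zip(w_score, pooled))


def pairwise_ranking_loss(score_highlight, score_other, margin=1.0):
    """Hinge loss that is zero once the highlight outranks the other
    segment by at least `margin` (an assumed choice of ranking loss)."""
    return max(0.0, margin - score_highlight + score_other)


if __name__ == "__main__":
    # Toy segment: 2 frames, each with a 1x2 grid of regions, 2-D features.
    segment = {
        (0, 0, 0): [0.0, 0.0],
        (0, 0, 1): [0.0, 0.0],
        (1, 0, 0): [2.0, 0.0],  # the only region with a strong response
        (1, 0, 1): [0.0, 0.0],
    }
    pooled, weights = attention_pool(segment, w_att=[1.0, 0.0])
    print("most attended position:", max(weights, key=weights.get))
    print("segment score:", highlight_score(pooled, w_score=[1.0, 1.0]))
```

In this sketch the weights are normalized jointly over all positions, so a single informative region in one frame can dominate the pooled feature. A trained model would learn the attention and scoring parameters end to end from highlight and non-highlight segment pairs rather than fixing them by hand.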
Keywords: Video Highlight Detection; Attention Model; Deep Ranking
Indexed by: SCI
Language: English
WOS accession number: WOS:000444903000013
Citation statistics
Times cited: 42 [WOS]
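Document type: Journal article
Item identifier: http://ir.ia.ac.cn/handle/173211/22067
Collection: State Key Laboratory of Multimodal Artificial Intelligence Systems / Multimedia Computing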
Author affiliations
1. Jiangsu University of Science and Technology
2. College of Information Engineering, Xiangtan University
3. National Laboratory of Pattern Recognition, Institute of Automation, Chinese Academy of Sciences
4. University of Chinese Academy of Sciences
5. Moshanghua Tech Company
Recommended citation
GB/T 7714: Jiao, Yifan, Li, Zhetao, Huang, Shucheng, et al. Three-Dimensional Attention-Based Deep Ranking Model for Video Highlight Detection[J]. IEEE TRANSACTIONS ON MULTIMEDIA, 2018, 20(10): 2693-2705.
APA: Jiao, Yifan, Li, Zhetao, Huang, Shucheng, Yang, Xiaoshan, Liu, Bin, & Zhang, Tianzhu. (2018). Three-Dimensional Attention-Based Deep Ranking Model for Video Highlight Detection. IEEE TRANSACTIONS ON MULTIMEDIA, 20(10), 2693-2705.
MLA: Jiao, Yifan, et al. "Three-Dimensional Attention-Based Deep Ranking Model for Video Highlight Detection." IEEE TRANSACTIONS ON MULTIMEDIA 20.10 (2018): 2693-2705.
Files in this item
File name / size: Three-Dimensional At (4692KB)
Document type: Journal article
Version: Author accepted manuscript
Access: Open access
License: CC BY-NC-SA