Long video question answering: A Matching-guided Attention Model
Wang, Weining1,2; Huang, Yan1,2; Wang, Liang1,2,3
Journal: PATTERN RECOGNITION
ISSN: 0031-3203
Publication date: 2020-06-01
Volume: 102; Issue: 1; Pages: 11
Abstract

Existing video question answering methods answer given questions based on short video snippets. The underlying assumption is that the visual content indicating the ground-truth answer is always present in the snippet. This assumption can be problematic for long video applications, since including large numbers of answer-irrelevant snippets dramatically degrades performance. To deal with this issue, we focus on a rarely investigated but practically important problem, namely long video QA, by predicting answers directly from long videos rather than manually pre-extracted short video snippets. We accordingly propose a Matching-guided Attention Model (MAM) which jointly extracts question-related video snippets and predicts answers in a unified framework. To localize questions accurately and efficiently, we calculate corresponding matching scores and boundary regression results for candidate video snippet proposals generated by sliding windows of limited granularity. Guided by the matching scores, the model pays different attention to the extracted video snippet proposals for each question. Finally, we use the attended visual features along with the question to predict the answer in a classification manner. A key obstacle to training our model is that publicly available video QA datasets only contain short videos especially designed for short video QA. Thus, we build two new datasets for this task, called TACoS-QA and MSR-VTT-QA, on top of the TACoS Multi-level dataset and the MSR-VTT dataset by generating QA pairs from the video captions. Experimental results on both datasets show the effectiveness of our proposed method in comparison with two short video QA methods and a baseline method. (C) 2020 Elsevier Ltd. All rights reserved.
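
The pipeline outlined in the abstract (sliding-window proposals, matching scores, score-guided attention, answer classification) can be illustrated with a minimal sketch. This is not the authors' implementation: cosine similarity stands in for the learned matching score, mean pooling stands in for the snippet encoder, boundary regression is omitted, and every function name, dimension, and weight below is a hypothetical placeholder on random toy data.

```python
import math
import random


def sliding_window_proposals(num_frames, window_sizes, stride_ratio=0.5):
    """Return (start, end) frame ranges, end exclusive, for each window size."""
    proposals = []
    for w in window_sizes:
        stride = max(1, int(w * stride_ratio))
        for start in range(0, num_frames - w + 1, stride):
            proposals.append((start, start + w))
    return proposals


def mean_pool(frames):
    """Average a list of frame feature vectors (assumed snippet encoder)."""
    dim = len(frames[0])
    return [sum(f[d] for f in frames) / len(frames) for d in range(dim)]


def cosine(u, v):
    """Cosine similarity, used here as a stand-in matching score."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm > 0 else 0.0


def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def matching_guided_attention(frame_feats, question_vec, proposals):
    """Weight each proposal by its softmax-normalised matching score."""
    prop_feats = [mean_pool(frame_feats[s:e]) for s, e in proposals]
    scores = [cosine(p, question_vec) for p in prop_feats]
    weights = softmax(scores)
    dim = len(question_vec)
    attended = [sum(w * p[d] for w, p in zip(weights, prop_feats))
                for d in range(dim)]
    return attended, weights


def predict_answer(attended, question_vec, W, b):
    """Linear classifier over the concatenated visual and question vectors."""
    fused = attended + question_vec
    logits = [sum(w * x for w, x in zip(row, fused)) + bias
              for row, bias in zip(W, b)]
    return max(range(len(logits)), key=logits.__getitem__)


# Toy data: 64 frames with 16-d features, 5 candidate answers (all hypothetical).
rng = random.Random(0)
frame_feats = [[rng.gauss(0, 1) for _ in range(16)] for _ in range(64)]
question_vec = [rng.gauss(0, 1) for _ in range(16)]
proposals = sliding_window_proposals(64, window_sizes=(8, 16, 32))
attended, weights = matching_guided_attention(frame_feats, question_vec, proposals)
W = [[rng.gauss(0, 1) for _ in range(32)] for _ in range(5)]
b = [0.0] * 5
answer_id = predict_answer(attended, question_vec, W, b)
```

In a trained model the matching score and classifier would be learned jointly; the sketch only shows how the scores can steer attention over proposals before classification.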

Keywords: Long video QA; Matching-guided attention
DOI: 10.1016/j.patcog.2020.107248
Keywords [WOS]: NETWORK ; IMAGE
Indexed by: SCI
Language: English
Funding projects: National Key Research and Development Program of China[2016YFB1001000] ; National Key Research and Development Program of China[2018AAA0100402] ; National Natural Science Foundation of China[61525306] ; National Natural Science Foundation of China[61633021] ; National Natural Science Foundation of China[61721004] ; National Natural Science Foundation of China[61420106015] ; National Natural Science Foundation of China[61806194] ; National Natural Science Foundation of China[U1803261] ; National Natural Science Foundation of China[61976132] ; Capital Science and Technology Leading Talent Training Project[Z181100006318030] ; CAS-AIR ; [HW2019SOW01]
Funding organizations: National Key Research and Development Program of China ; National Natural Science Foundation of China ; Capital Science and Technology Leading Talent Training Project ; CAS-AIR
WOS research areas: Computer Science ; Engineering
WOS categories: Computer Science, Artificial Intelligence ; Engineering, Electrical & Electronic
WOS accession number: WOS:000525825100029
Publisher: ELSEVIER SCI LTD
Seven major directions, sub-direction classification: Multimodal intelligence
Citation statistics
Times cited: 13 [WOS]
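Document type: Journal article
Identifier: http://ir.ia.ac.cn/handle/173211/38878
Collection: Center for Research on Intelligent Perception and Computing
Corresponding author: Wang, Liang
Author affiliations:
1. Chinese Acad Sci, Ctr Res Intelligent Percept & Comp, Natl Lab Pattern Recognit, Inst Automat, Beijing 100190, Peoples R China
2. Univ Chinese Acad Sci, Sch Artificial Intelligence, Beijing 100049, Peoples R China
3. Chinese Acad Sci, Ctr Excellence Brain Sci & Intelligence Technol, Inst Automat, Beijing 100190, Peoples R China
First author affiliation: National Laboratory of Pattern Recognition
Corresponding author affiliation: National Laboratory of Pattern Recognition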
Recommended citation
GB/T 7714: Wang, Weining, Huang, Yan, Wang, Liang. Long video question answering: A Matching-guided Attention Model[J]. PATTERN RECOGNITION, 2020, 102(1): 11.
APA: Wang, Weining, Huang, Yan, & Wang, Liang. (2020). Long video question answering: A Matching-guided Attention Model. PATTERN RECOGNITION, 102(1), 11.
MLA: Wang, Weining, et al. "Long video question answering: A Matching-guided Attention Model". PATTERN RECOGNITION 102.1 (2020): 11.
Files in this item
File name/size: PR.pdf (1963KB)
Document type: Journal article
Version: Author accepted manuscript
Access: Open access
License: CC BY-NC-SA