Knowledge Commons of Institute of Automation, CAS
Densely Connected Attention Flow for Visual Question Answering | |
Liu, Fei [1,2]; Liu, Jing [1,2]; Fang, Zhiwei [1,2]; Hong, Richang [3] | |
2019 | |
Conference name | International Joint Conference on Artificial Intelligence (IJCAI) |
Conference date | 2019-8 |
Conference location | Macao, China |
Publisher | IJCAI |
Abstract | Learning effective interactions between multimodal features is at the heart of visual question answering (VQA). A common shortcoming of existing VQA approaches is that they consider only a very limited number of interactions, which may not be enough to model the latent, complex image-question relations that are necessary for accurately answering questions. Therefore, in this paper, we propose a novel DCAF (Densely Connected Attention Flow) framework for modeling dense interactions. It densely connects all pairwise layers of the network via Attention Connectors, capturing fine-grained interplay between image and question across all hierarchical levels. The proposed Attention Connector efficiently connects the multimodal features at any two layers with symmetric co-attention and produces interaction-aware attention features. Experimental results on three publicly available datasets show that the proposed method achieves state-of-the-art performance. |
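Language | English |
Document type | Conference paper |
Item identifier | http://ir.ia.ac.cn/handle/173211/48557 |
Collection | Zidong Taichu Large Model Research Center_Image and Video Analysis |
Corresponding author | Liu, Jing |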
Author affiliations | 1. National Laboratory of Pattern Recognition, Institute of Automation, Chinese Academy of Sciences 2. University of Chinese Academy of Sciences 3. School of Computer and Information, Hefei University of Technology |
First author affiliation | National Laboratory of Pattern Recognition |
Corresponding author affiliation | National Laboratory of Pattern Recognition |
Recommended citation (GB/T 7714) | Liu, Fei, Liu, Jing, Fang, Zhiwei, et al. Densely Connected Attention Flow for Visual Question Answering[C]: IJCAI, 2019. |
Files in this item |
File name/size | Document type | Version type | Access type | License |
0122.pdf (681KB) | Conference paper | | Open access | CC BY-NC-SA |