Context-Aware Attention Network for Image-Text Retrieval
Qi Zhang1,2; Zhen Lei1,2; Zhaoxiang Zhang1,2; Stan Z. Li3
2020-06-14
Conference Name: IEEE Conference on Computer Vision and Pattern Recognition
Conference Date: 2020-6-14
Conference Place: Seattle, Washington, USA
Abstract

As a typical cross-modal problem, image-text bidirectional retrieval relies heavily on joint embedding learning and on the similarity measure for each image-text pair. It remains challenging because prior works seldom explore semantic correspondences between modalities and semantic correlations within a single modality at the same time. In this work, we propose a unified Context-Aware Attention Network (CAAN), which selectively focuses on critical local fragments (regions and words) by aggregating the global context. Specifically, it simultaneously utilizes global inter-modal alignments and intra-modal correlations to discover latent semantic relations. Considering the interactions between images and sentences in the retrieval process, intra-modal correlations are derived from the second-order attention of region-word alignments instead of intuitively comparing the distance between original features. Our method achieves fairly competitive results on two generic image-text retrieval datasets, Flickr30K and MS-COCO.
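
The attention idea in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the function name `context_aware_pool`, the softmax normalisation, and the additive combination of scores are all assumptions chosen for illustration. It reads "second-order attention" as chaining the region-to-word alignment with the word-to-region alignment, so that region-to-region correlations come from the alignments rather than from distances between the original features.

```python
import math

def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def matmul(a, b):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def transpose(a):
    return [list(col) for col in zip(*a)]

def l2_normalize(v):
    norm = math.sqrt(sum(x * x for x in v)) or 1.0
    return [x / norm for x in v]

def context_aware_pool(regions, words):
    """Hypothetical helper: pool region features into one image vector."""
    regions = [l2_normalize(r) for r in regions]
    words = [l2_normalize(w) for w in words]

    # First-order, inter-modal affinity: cosine similarity of each region-word pair.
    affinity = matmul(regions, transpose(words))                     # n x m
    region_to_word = [softmax(row) for row in affinity]              # n x m
    word_to_region = [softmax(row) for row in transpose(affinity)]   # m x n

    # Second-order, intra-modal correlation (assumed form): region -> word -> region.
    region_to_region = matmul(region_to_word, word_to_region)        # n x n

    # Global context (assumed form): attention each region receives from
    # all words and from all regions, summed and normalised.
    n = len(regions)
    inter_score = [sum(row[i] for row in word_to_region) for i in range(n)]
    intra_score = [sum(row[i] for row in region_to_region) for i in range(n)]
    weights = softmax([a + b for a, b in zip(inter_score, intra_score)])

    dim = len(regions[0])
    pooled = [sum(w * r[k] for w, r in zip(weights, regions)) for k in range(dim)]
    return weights, region_to_region, pooled

# Toy inputs: three 2-D region features and two 2-D word features.
regions = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
words = [[1.0, 0.0], [1.0, 0.1]]
weights, region_corr, pooled = context_aware_pool(regions, words)
```

In this toy input the first region points in nearly the same direction as both words, so it receives a larger weight than the second region, which is almost orthogonal to them. The same construction applied with the roles of regions and words swapped would give word weights for the sentence side.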

Indexed By: EI
Document Type: Conference paper
Identifier: http://ir.ia.ac.cn/handle/173211/39252
Collection: National Laboratory of Pattern Recognition_Biometrics and Security Technology
Corresponding Author: Zhen Lei
Affiliation: 1. NLPR, Institute of Automation, Chinese Academy of Sciences, Beijing, China
2. Center for AI Research and Innovation, Westlake University, Hangzhou, China
3. School of Artificial Intelligence, University of Chinese Academy of Sciences, Beijing, China
First Author Affiliation: Chinese Acad Sci, Inst Automat, Natl Lab Pattern Recognit, Beijing 100190, Peoples R China
Corresponding Author Affiliation: Chinese Acad Sci, Inst Automat, Natl Lab Pattern Recognit, Beijing 100190, Peoples R China
Recommended Citation
GB/T 7714
Qi Zhang, Zhen Lei, Zhaoxiang Zhang, et al. Context-Aware Attention Network for Image-Text Retrieval[C], 2020.
Files in This Item:
File Name/Size: PID6410551.pdf (3229KB)
DocType: Conference paper
Access: Open access
License: CC BY-NC-SA
Items in the repository are protected by copyright, with all rights reserved, unless otherwise indicated.