Adaptive Attention Annotation Model: Optimizing the Prediction Path through Dependency Fusion
Wang, Fangxin1,2; Liu, Jie1; Zhang, Shuwu1,3; Zhang, Guixuan1; Zheng, Yang1; Li, Xiaoqian1,2; Liang, Wei1; Li, Yuejun1,2
Source Publication: KSII TRANSACTIONS ON INTERNET AND INFORMATION SYSTEMS
ISSN: 1976-7277
2019-09-30
Volume: 13 Issue: 9 Pages: 4665-4683
Corresponding Author: Liu, Jie (jie.liu@ia.ac.cn)
Abstract: Previous methods build image annotation models by leveraging three basic dependencies: relations between image and label (image/label), between images (image/image), and between labels (label/label). Although many studies show that multiple dependencies can work jointly to improve annotation performance, the dependencies do not actually "work jointly" in their frameworks, whose performance depends largely on the result predicted by the image/label component. To address this problem, we propose the adaptive attention annotation model (AAAM), which associates these dependencies with the prediction path, a series of labels (tags) in the order they are detected. In particular, we optimize the prediction path by detecting the relevant labels from the easy-to-detect to the hard-to-detect, which are found using Binary Cross-Entropy (BCE) and Triplet Margin (TM) losses, respectively. In addition, to capture the information of each label, instead of explicitly extracting regional features, we propose a self-attention mechanism that implicitly enhances the relevant regions and suppresses the irrelevant ones. To validate the effectiveness of the model, we conduct experiments on three well-known public datasets, COCO 2014, IAPR TC-12 and NUS-WIDE, and achieve better performance than state-of-the-art methods.
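The abstract names two loss functions, Binary Cross-Entropy (BCE) and Triplet Margin (TM), without defining them. As a reading aid, the sketch below shows the textbook definitions of both in plain Python. It is not the authors' implementation: the function names, the Euclidean distance, the default margin of 1.0, and the way inputs are represented are all assumptions made for illustration, and the abstract does not say how AAAM combines the two losses.

```python
import math

def bce_loss(probs, targets, eps=1e-12):
    """Mean binary cross-entropy over a list of per-label probabilities.

    probs: predicted probabilities, one per label (hypothetical input format).
    targets: ground-truth 0/1 indicators, one per label.
    """
    total = 0.0
    for p, t in zip(probs, targets):
        p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
        total += -(t * math.log(p) + (1 - t) * math.log(1.0 - p))
    return total / len(probs)

def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def triplet_margin_loss(anchor, positive, negative, margin=1.0):
    """Standard triplet margin loss for one triplet:
    max(d(anchor, positive) - d(anchor, negative) + margin, 0).
    The margin value is an assumption, not taken from the paper.
    """
    gap = euclidean(anchor, positive) - euclidean(anchor, negative)
    return max(gap + margin, 0.0)

if __name__ == "__main__":
    # Two labels predicted at 0.5 each: the loss equals ln 2.
    print(bce_loss([0.5, 0.5], [1, 0]))
    # The positive is far closer than the negative, so the loss is zero.
    print(triplet_margin_loss([0.0, 0.0], [0.0, 0.0], [3.0, 4.0]))
```

BCE scores each label independently against its 0/1 target, whereas the triplet margin loss only constrains relative distances, pushing a matching item at least `margin` closer to the anchor than a non-matching one.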
Keyword: image annotation; multiple dependencies; self-attention; prediction path; Triplet Margin loss
DOI: 10.3837/tiis.2019.09.019
WOS Keyword: AUTOMATIC IMAGE ANNOTATION
Indexed By: SCI
Language: English
Funding Project: National Key R&D Program of China [2017YFB1401000]; Key Laboratory of Digital Rights Services, one of the National Science and Standardization Key Labs for Press and Publication Industry
Funding Organization: National Key R&D Program of China; Key Laboratory of Digital Rights Services, one of the National Science and Standardization Key Labs for Press and Publication Industry
WOS Research Area: Computer Science; Telecommunications
WOS Subject: Computer Science, Information Systems; Telecommunications
WOS ID: WOS:000488294100019
Publisher: KSII-KOR SOC INTERNET INFORMATION
Document Type: Journal article
Identifier: http://ir.ia.ac.cn/handle/173211/26114
Collection: Research Center for Digital Content Technology and Services_New Media Service and Management Technology
Corresponding Author: Liu, Jie
Affiliation: 1.Chinese Acad Sci, Inst Automat, Beijing 100190, Peoples R China
2.Univ Chinese Acad Sci, Beijing 100049, Peoples R China
3.Beijing Film Acad, AICFVE, Beijing 100088, Peoples R China
First Author Affiliation: Institute of Automation, Chinese Academy of Sciences
Corresponding Author Affiliation: Institute of Automation, Chinese Academy of Sciences
Recommended Citation
GB/T 7714
Wang, Fangxin,Liu, Jie,Zhang, Shuwu,et al. Adaptive Attention Annotation Model: Optimizing the Prediction Path through Dependency Fusion[J]. KSII TRANSACTIONS ON INTERNET AND INFORMATION SYSTEMS,2019,13(9):4665-4683.
APA Wang, Fangxin.,Liu, Jie.,Zhang, Shuwu.,Zhang, Guixuan.,Zheng, Yang.,...&Li, Yuejun.(2019).Adaptive Attention Annotation Model: Optimizing the Prediction Path through Dependency Fusion.KSII TRANSACTIONS ON INTERNET AND INFORMATION SYSTEMS,13(9),4665-4683.
MLA Wang, Fangxin,et al."Adaptive Attention Annotation Model: Optimizing the Prediction Path through Dependency Fusion".KSII TRANSACTIONS ON INTERNET AND INFORMATION SYSTEMS 13.9(2019):4665-4683.
Files in This Item:
File Name/Size | DocType | Version | Access | License
TIIS Vol 13, No 9-19 (1061KB) | Journal article | Author's accepted manuscript | Open access | CC BY-NC-SA
File name: TIIS Vol 13, No 9-19.pdf
Format: Adobe PDF
Items in the repository are protected by copyright, with all rights reserved, unless otherwise indicated.