Adaptive Attention Annotation Model: Optimizing the Prediction Path through Dependency Fusion

doi:10.3837/tiis.2019.09.019

Authors	Wang, Fangxin (1,2); Liu, Jie (1); Zhang, Shuwu (1,3); Zhang, Guixuan (1); Zheng, Yang (1); Li, Xiaoqian (1,2); Liang, Wei (1); Li, Yuejun (1,2)
Journal	KSII TRANSACTIONS ON INTERNET AND INFORMATION SYSTEMS
ISSN	1976-7277
Publication date	2019-09-30
Volume	13 Issue: 9 Pages: 4665-4683
Corresponding author	Liu, Jie (jie.liu@ia.ac.cn)
Abstract	Previous methods build image annotation models by leveraging three basic dependencies: relations between image and label (image/label), between images (image/image), and between labels (label/label). Although many studies show that multiple dependencies can work jointly to improve annotation performance, the dependencies do not actually "work jointly" in those models, whose performance depends largely on the result predicted by the image/label section. To address this problem, we propose the adaptive attention annotation model (AAAM), which associates these dependencies with the prediction path, a series of labels (tags) in the order they are detected. In particular, we optimize the prediction path by detecting the relevant labels from the easy-to-detect to the hard-to-detect, which are found using Binary Cross-Entropy (BCE) and Triplet Margin (TM) losses, respectively. In addition, to capture the information of each label, instead of explicitly extracting regional features, we propose a self-attention mechanism that implicitly enhances the relevant regions and suppresses the irrelevant ones. To validate the effectiveness of the model, we conduct experiments on three well-known public datasets, COCO 2014, IAPR TC-12 and NUS-WIDE, and achieve better performance than the state-of-the-art methods.
Keywords	image annotation; multiple dependencies; self-attention; prediction path; Triplet Margin loss
DOI	10.3837/tiis.2019.09.019
Keywords [WOS]	AUTOMATIC IMAGE ANNOTATION
Indexed in	SCI
Language	English
Funding project	Key Laboratory of Digital Rights Services, one of the National Science and Standardization Key Labs for Press and Publication Industry ; National Key R&D Program of China[2017YFB1401000]
Funder	National Key R&D Program of China ; Key Laboratory of Digital Rights Services, one of the National Science and Standardization Key Labs for Press and Publication Industry
WOS research area	Computer Science ; Telecommunications
WOS category	Computer Science, Information Systems ; Telecommunications
WOS accession number	WOS:000488294100019
Publisher	KSII-KOR SOC INTERNET INFORMATION
Seven major directions: sub-direction classification	Multimodal intelligence
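Document type	Journal article
Item identifier	http://ir.ia.ac.cn/handle/173211/26114
Collection	Research Center for Digital Content Technology and Services_Copyright Intelligence and Cultural Computing
Corresponding author	Liu, Jie
Author affiliations	1. Chinese Acad Sci, Inst Automat, Beijing 100190, Peoples R China; 2. Univ Chinese Acad Sci, Beijing 100049, Peoples R China; 3. Beijing Film Acad, AICFVE, Beijing 100088, Peoples R China
First author affiliation	Institute of Automation, Chinese Academy of Sciences
Corresponding author affiliation	Institute of Automation, Chinese Academy of Sciences
Recommended citation (GB/T 7714)	Wang, Fangxin,Liu, Jie,Zhang, Shuwu,et al. Adaptive Attention Annotation Model: Optimizing the Prediction Path through Dependency Fusion[J]. KSII TRANSACTIONS ON INTERNET AND INFORMATION SYSTEMS,2019,13(9):4665-4683.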
APA	Wang, Fangxin.,Liu, Jie.,Zhang, Shuwu.,Zhang, Guixuan.,Zheng, Yang.,...&Li, Yuejun.(2019).Adaptive Attention Annotation Model: Optimizing the Prediction Path through Dependency Fusion.KSII TRANSACTIONS ON INTERNET AND INFORMATION SYSTEMS,13(9),4665-4683.
MLA	Wang, Fangxin,et al."Adaptive Attention Annotation Model: Optimizing the Prediction Path through Dependency Fusion".KSII TRANSACTIONS ON INTERNET AND INFORMATION SYSTEMS 13.9(2019):4665-4683.

Files in this item
File name/size	Document type	Version type	Access type	License
TIIS Vol 13, No 9-19 (1061KB)	Journal article	Author accepted manuscript	Open access	CC BY-NC-SA
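
Illustrative note on the two losses named in the abstract. The record does not describe how AAAM computes or combines them, so the following is only a minimal sketch of the standard textbook definitions of Binary Cross-Entropy and Triplet Margin loss, written in plain Python. The function names, the Euclidean distance, and the margin of 1.0 are assumptions made for illustration and are not taken from the paper.

```python
import math


def bce_loss(probs, targets, eps=1e-12):
    """Mean binary cross-entropy over per-label probabilities.

    probs   -- predicted probability for each label, in (0, 1)
    targets -- ground-truth indicator for each label, 0 or 1
    """
    total = 0.0
    for p, t in zip(probs, targets):
        # Clamp to avoid log(0) when a prediction saturates.
        p = min(max(p, eps), 1.0 - eps)
        total += -(t * math.log(p) + (1 - t) * math.log(1.0 - p))
    return total / len(probs)


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def triplet_margin_loss(anchor, positive, negative, margin=1.0):
    """Hinge on d(anchor, positive) - d(anchor, negative) + margin.

    The loss is zero once the negative is at least `margin` farther
    from the anchor than the positive is.
    """
    gap = euclidean(anchor, positive) - euclidean(anchor, negative) + margin
    return max(gap, 0.0)


if __name__ == "__main__":
    # Hypothetical values: three labels, two of them present in the image.
    print(bce_loss([0.9, 0.2, 0.6], [1, 0, 1]))
    # Hypothetical 2-D embeddings for an anchor, a relevant label, and an irrelevant label.
    print(triplet_margin_loss([0.0, 0.0], [0.0, 1.0], [1.0, 0.0]))
```

In this sketch, BCE scores each label independently against its ground truth, while the triplet loss only compares relative distances, which is one common reason to pair the two when some labels are harder to score directly.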