Non-Autoregressive Image Captioning with Counterfactuals-Critical Multi-Agent Learning
Guo LT(郭龙腾)1,2; Liu J(刘静)1; Zhu XX(朱欣鑫)1; He XJ(何兴建)1,2; Jiang J(江洁)1,2; Lu HQ(卢汉清)1
2020
Conference name: IJCAI
Conference date: 2021.01.07
Conference location: Yokohama, Japan
Abstract

Most image captioning models are autoregressive, i.e. they generate each word by conditioning on previously generated words, which leads to heavy latency during inference. Recently, non-autoregressive decoding has been proposed in machine translation to speed up inference by generating all words in parallel. Typically, these models use the word-level cross-entropy loss to optimize each word independently. However, such a learning process fails to consider sentence-level consistency, which results in inferior generation quality for these non-autoregressive models. In this paper, we propose a Non-Autoregressive Image Captioning (NAIC) model with a novel training paradigm: Counterfactuals-critical Multi-Agent Learning (CMAL). CMAL formulates NAIC as a multi-agent reinforcement learning system where positions in the target sequence are viewed as agents that learn to cooperatively maximize a sentence-level reward. In addition, we propose to utilize massive unlabeled images to boost captioning performance. Extensive experiments on the MSCOCO image captioning benchmark show that our NAIC model achieves performance comparable to state-of-the-art autoregressive models while bringing a 13.9x decoding speedup.
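
The contrast between the two decoding styles described in the abstract can be illustrated with a minimal sketch. It is not the authors' model: `toy_scores`, `VOCAB`, and the integer `image_id` are hypothetical stand-ins for a real captioning decoder, and the returned step count only tallies sequential model calls rather than measuring wall-clock speedup.

```python
import random

# Hypothetical toy vocabulary; a real captioner would use a learned vocabulary.
VOCAB = ["a", "dog", "runs", "on", "grass", "<eos>"]


def toy_scores(image_id, position, prev_index=-1):
    """Hypothetical stand-in for a decoder: one score per vocabulary word.

    prev_index = -1 means "no previous word is used as input".
    """
    seed = image_id * 10007 + position * 101 + (prev_index + 1)
    rng = random.Random(seed)  # integer seed keeps the toy deterministic
    return [rng.random() for _ in VOCAB]


def argmax(scores):
    """Index of the highest score."""
    return max(range(len(scores)), key=scores.__getitem__)


def decode_autoregressive(image_id, max_len):
    """Greedy decoding where each word is conditioned on the previous word."""
    words, prev, sequential_steps = [], -1, 0
    for pos in range(max_len):
        prev = argmax(toy_scores(image_id, pos, prev))
        words.append(VOCAB[prev])
        sequential_steps += 1  # step `pos` cannot start before step `pos - 1` ends
    return words, sequential_steps


def decode_non_autoregressive(image_id, max_len):
    """Every position is predicted without looking at the other positions."""
    # The per-position calls are independent, so they could be batched into a
    # single parallel step; the list comprehension only emulates that batch.
    indices = [argmax(toy_scores(image_id, pos)) for pos in range(max_len)]
    return [VOCAB[i] for i in indices], 1


ar_words, ar_steps = decode_autoregressive(image_id=3, max_len=5)
nar_words, nar_steps = decode_non_autoregressive(image_id=3, max_len=5)
print("autoregressive:    ", ar_words, "sequential steps:", ar_steps)
print("non-autoregressive:", nar_words, "sequential steps:", nar_steps)
```

Because each position in the non-autoregressive branch is chosen independently, nothing in the sketch ties the words together into a consistent sentence, which is the gap the abstract says a sentence-level reward is meant to address.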

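Seven major directions, sub-direction classification: Multimodal intelligence
Document type: Conference paper
Item identifier: http://ir.ia.ac.cn/handle/173211/44986
Collection: Zidong Taichu (紫东太初) Large Model Research Center_Image and Video Analysis
Institute of Automation, Chinese Academy of Sciences
Corresponding author: Liu J(刘静)
Author affiliations: 1. National Laboratory of Pattern Recognition, Institute of Automation, Chinese Academy of Sciences
2. School of Artificial Intelligence, University of Chinese Academy of Sciences
First author affiliation: National Laboratory of Pattern Recognition
Corresponding author affiliation: National Laboratory of Pattern Recognition
Recommended citation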
GB/T 7714
Guo LT,Liu J,Zhu XX,et al. Non-Autoregressive Image Captioning with Counterfactuals-Critical Multi-Agent Learning[C],2020.
Files in this item
File name/size: [IJCAI 2020] Non-Aut(434KB)
Document type: Conference paper
Access type: Open access
License: CC BY-NC-SA
File name: [IJCAI 2020] Non-Autoregressive Image Captioning with Counterfactuals-Critical Multi-Agent Learning.pdf
Format: Adobe PDF