Multi-modal Sentence Summarization with Modality Attention and Image Filtering
Li, Haoran; Zhu, Junnan; Liu, Tianshang; Zhang, Jiajun; Zong, Chengqing
2018
Conference name: IJCAI
Conference date: 2018-7
Conference location: Stockholm, Sweden
Abstract

In this paper, we introduce a multi-modal sentence summarization task that produces a short summary from a sentence-image pair. This task is more challenging than text-only sentence summarization: a model must not only incorporate visual features effectively into a standard text summarization framework, but also avoid the noise introduced by the image. To this end, we propose a modality-based attention mechanism that pays different attention to image patches and text units, and we design image filters that selectively use visual information to enhance the semantics of the input sentence. We construct a multi-modal sentence summarization dataset, and extensive experiments on this dataset demonstrate that our models significantly outperform conventional models that employ only text as input. Further analyses suggest that the sentence summarization task can benefit from visually grounded representations in a variety of ways.

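Document type: Conference paper
Item identifier: http://ir.ia.ac.cn/handle/173211/23203
Collection: National Laboratory of Pattern Recognition_Natural Language Processing
Author affiliation: Institute of Automation, Chinese Academy of Sciences
First author affiliation: Institute of Automation, Chinese Academy of Sciences
Recommended citation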
GB/T 7714
Li, Haoran,Zhu, Junnan,Liu, Tianshang,et al. Multi-modal Sentence Summarization with Modality Attention and Image Filtering[C],2018.