Image-Text Cross-Modal Retrieval via Modality-Specific Feature Learning
Jian Wang; Yonghao He; Cuicui Kang; Shiming Xiang; Chunhong Pan
2015
Conference name: ACM International Conference on Multimedia Retrieval
Conference date: 2015-6
Conference location: Shanghai, China
Abstract: Cross-modal retrieval extends the ability of search engines to handle massive cross-modal data. The goal of image-text cross-modal retrieval is to search images (texts) using text (image) queries by computing the similarities between images and texts directly. Many existing methods rely on low-level visual and textual features for cross-modal retrieval, ignoring the characteristics present in the raw data of the different modalities. In this paper, a novel model based on modality-specific feature learning is proposed. Considering the characteristics of the different modalities, the model uses two types of convolutional neural networks to map the raw data to latent space representations for images and texts, respectively. In particular, the convolution-based network used for texts involves word embedding learning, which has proved effective at extracting meaningful textual features for text classification. In the latent space, the mapped features of images and texts form relevant and irrelevant image-text pairs, which are used by the one-vs-more learning scheme. This learning scheme achieves ranking functionality by considering one relevant pair together with multiple irrelevant pairs. Standard back-propagation is employed to update the parameters of the two convolutional networks. Extensive cross-modal retrieval experiments are carried out on three challenging datasets that consist of image-document pairs or image-query click-through data from a search engine, and the results demonstrate that the proposed model is much more effective.
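The abstract does not state the objective behind the one-vs-more scheme, so the following is only a minimal sketch of how one relevant pair and several irrelevant pairs could be turned into a ranking loss. It assumes cosine similarity in the latent space and a softmax cross-entropy over the candidates; both choices, and the function names `cosine_similarity`, `one_vs_more_loss`, and `rank_candidates`, are illustrative assumptions and are not taken from the paper. The latent features are plain lists standing in for the outputs of the two convolutional networks.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two latent feature vectors (assumed metric)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def one_vs_more_loss(query, relevant, irrelevant):
    """Softmax cross-entropy with the relevant item as the correct class.

    query:      latent feature of the query (e.g. an image)
    relevant:   latent feature of the one relevant item (e.g. its text)
    irrelevant: list of latent features of the irrelevant items
    This loss is an assumption; the paper's exact objective may differ.
    """
    # The relevant item is placed at index 0, followed by the irrelevant ones.
    scores = [cosine_similarity(query, relevant)]
    scores += [cosine_similarity(query, item) for item in irrelevant]
    # Log-sum-exp with the maximum subtracted for numerical stability.
    top = max(scores)
    log_norm = top + math.log(sum(math.exp(s - top) for s in scores))
    # Negative log-probability assigned to the relevant item.
    return log_norm - scores[0]


def rank_candidates(query, candidates):
    """Return candidate indices ordered from most to least similar."""
    sims = [cosine_similarity(query, c) for c in candidates]
    return sorted(range(len(candidates)), key=lambda i: -sims[i])
```

Under these assumptions, the loss is small when the relevant item scores higher than every irrelevant one, which is what lets the same similarity be used for ranking at retrieval time via `rank_candidates`.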
Document type: Conference paper
Item identifier: http://ir.ia.ac.cn/handle/173211/20369
Collection: National Laboratory of Pattern Recognition_Advanced Data Analysis and Learning
Author affiliation: National Laboratory of Pattern Recognition, Institute of Automation, Chinese Academy of Sciences
Recommended citation
GB/T 7714
Jian Wang, Yonghao He, Cuicui Kang, et al. Image-Text Cross-Modal Retrieval via Modality-Specific Feature Learning[C], 2015.
Files in this item
File name/size: p347-wang.pdf (1411KB)
Document type: Conference paper
Access type: Open access
License: CC BY-NC-SA