Zero-Shot Predicate Prediction for Scene Graph Parsing
Li, Yiming (1); Yang, Xiaoshan (2,3,4); Huang, Xuhui (5); Ma, Zhe (5); Xu, Changsheng (2,3,4)
Journal: IEEE TRANSACTIONS ON MULTIMEDIA
ISSN: 1520-9210
Publication year: 2023
Volume: 25; Pages: 3140-3153
Corresponding author: Xu, Changsheng (csxu@nlpr.ia.ac.cn)
Abstract: A scene graph is a structured semantic representation of an image in which vertices represent objects and edges represent the relationships between them. Because it is impossible to manually label every potential relationship in the real world, some previous methods apply zero-shot learning to scene graph generation. However, these methods take the triplet (subject, predicate, object) as the basic unit of a relationship, and each element of an unseen relationship (subject, predicate, or object) has actually been seen in the training data. They therefore ignore unseen predicates. To predict unseen predicates, we introduce a novel task named zero-shot predicate prediction, which is crucial for extending existing scene graph generation methods to recognize more relationship classes. The new task is challenging and cannot be simply resolved by conventional zero-shot learning methods because each predicate has a large intra-class variation. First, this variation makes it difficult to compute discriminative instance-level features for a predicate class. Second, it makes transferring knowledge from seen classes to unseen classes more difficult. To address the first challenge, we propose distilling lexical knowledge of different objects and constructing multi-modal representations of pairwise objects to reduce the intra-class variation of the predicate. To address the second challenge, we build a compact semantic space in which the representations of unseen classes are reconstructed from the seen classes for zero-shot predicate classification. We evaluate the proposed method on the public Visual Genome dataset. Extensive experimental results under zero-shot, few-shot, and supervised settings demonstrate the effectiveness of the proposed method.
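The abstract does not say how unseen-class representations are reconstructed from seen classes, so the following is only a minimal sketch of one common way such a step could look, not the authors' method. It assumes each predicate has a word embedding and each seen predicate has a learned prototype vector; the unseen prototype is then a softmax-weighted combination of seen prototypes, with weights taken from word-embedding similarity. All names (`reconstruct_unseen`, `classify`, the toy vectors, and the `temperature` parameter) are hypothetical.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def softmax(scores, temperature=1.0):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp((s - m) / temperature) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def reconstruct_unseen(unseen_word_vec, seen_word_vecs, seen_prototypes, temperature=0.1):
    """Build a prototype for an unseen predicate as a weighted sum of seen prototypes.

    Assumption: weights come from word-embedding similarity between the unseen
    predicate and each seen predicate. This is an illustrative choice only.
    """
    names = list(seen_word_vecs)
    sims = [cosine(unseen_word_vec, seen_word_vecs[n]) for n in names]
    weights = softmax(sims, temperature)
    dim = len(next(iter(seen_prototypes.values())))
    proto = [0.0] * dim
    for w, n in zip(weights, names):
        for i, v in enumerate(seen_prototypes[n]):
            proto[i] += w * v
    return proto, dict(zip(names, weights))

def classify(feature, prototypes):
    """Assign a pairwise-object feature to the predicate with the most similar prototype."""
    return max(prototypes, key=lambda name: cosine(feature, prototypes[name]))

# Toy data (hypothetical): two seen predicates and one unseen predicate.
seen_word_vecs = {"on": [1.0, 0.0, 0.0], "holding": [0.0, 1.0, 0.1]}
seen_prototypes = {"on": [1.0, 0.0], "holding": [0.0, 1.0]}
unseen_word_vec = [0.1, 0.9, 0.2]  # stands in for an unseen predicate such as "carrying"

carrying_proto, weights = reconstruct_unseen(unseen_word_vec, seen_word_vecs, seen_prototypes)
label = classify([0.2, 0.8], {"on": seen_prototypes["on"], "carrying": carrying_proto})
```

In this toy setup the unseen predicate borrows most of its prototype from the seen predicate whose word embedding is closest to it, which is the general idea behind transferring knowledge from seen to unseen classes in a shared semantic space.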
Keywords: Deep learning; zero-shot; scene graph
DOI: 10.1109/TMM.2022.3155928
Indexed by: SCI
Language: English
Funding projects: National Key Research and Development Program of China [2018AAA0100604]; National Natural Science Foundation of China [61720106006]; National Natural Science Foundation of China [62036012]; National Natural Science Foundation of China [61721004]; National Natural Science Foundation of China [62072455]; National Natural Science Foundation of China [U1836220]; National Natural Science Foundation of China [U1705262]; National Natural Science Foundation of China [61872424]; Key Research Program of Frontier Sciences of CAS [QYZDJ-SSW-JSC039]; Beijing Natural Science Foundation [L201001]
Funders: National Key Research and Development Program of China; National Natural Science Foundation of China; Key Research Program of Frontier Sciences of CAS; Beijing Natural Science Foundation
WOS research areas: Computer Science; Telecommunications
WOS categories: Computer Science, Information Systems; Computer Science, Software Engineering; Telecommunications
WOS accession number: WOS:001045742200015
Publisher: IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC
Document type: Journal article
Item identifier: http://ir.ia.ac.cn/handle/173211/54027
Collection: State Key Laboratory of Multimodal Artificial Intelligence Systems
Author affiliations:
1. Zhengzhou Univ, Sch Informat Engn, Zhengzhou 450001, Peoples R China
2. Chinese Acad Sci, Inst Automat, Natl Lab Pattern Recognit, Beijing 100190, Peoples R China
3. Univ Chinese Acad Sci, Sch Artificial Intelligence, Beijing 100049, Peoples R China
4. PengCheng Lab, Shenzhen 518066, Peoples R China
5. CASIC, Acad 2, Lab 10, Beijing 100854, Peoples R China
Corresponding author affiliation: National Laboratory of Pattern Recognition
Recommended citation
GB/T 7714: Li, Yiming, Yang, Xiaoshan, Huang, Xuhui, et al. Zero-Shot Predicate Prediction for Scene Graph Parsing[J]. IEEE TRANSACTIONS ON MULTIMEDIA, 2023, 25: 3140-3153.
APA: Li, Yiming, Yang, Xiaoshan, Huang, Xuhui, Ma, Zhe, & Xu, Changsheng. (2023). Zero-Shot Predicate Prediction for Scene Graph Parsing. IEEE TRANSACTIONS ON MULTIMEDIA, 25, 3140-3153.
MLA: Li, Yiming, et al. "Zero-Shot Predicate Prediction for Scene Graph Parsing". IEEE TRANSACTIONS ON MULTIMEDIA 25 (2023): 3140-3153.