Joint Token and Feature Alignment Framework for Text-Based Person Search
Li, Shangze (1); Lu, Andong (1); Huang, Yan (3); Li, Chenglong (2); Wang, Liang (3)
Journal: IEEE SIGNAL PROCESSING LETTERS
ISSN: 1070-9908
Publication year: 2022
Volume: 29; Pages: 2238-2242
Corresponding author: Li, Chenglong (lcl1314@foxmail.com)
Abstract: Text-based person search is a challenging cross-modal retrieval task. Existing works reduce the inter-modality and intra-class gaps by aligning local features extracted from the image and text modalities, which easily leads to mismatching problems because of the lack of annotation information. Moreover, reducing the two gaps simultaneously in the same feature space is sub-optimal. This work proposes a novel joint token and feature alignment framework that reduces the inter-modality and intra-class gaps progressively. Specifically, we first build a dual-path feature learning network to extract features and conduct feature alignment to reduce the inter-modality gap. Second, we design a text generation module that generates token sequences from visual features, and then perform token alignment to reduce the intra-class gap. Last, we introduce a fusion interaction module that further eliminates modality heterogeneity through multi-stage feature fusion. Extensive experiments on the CUHK-PEDES dataset demonstrate the effectiveness of our model, which significantly outperforms previous state-of-the-art methods.
Keywords: Feature extraction; Visualization; Representation learning; Logic gates; Image reconstruction; Transformers; Training; Cross-modal generation; feature alignment; text-based person search; token alignment; transformer
DOI: 10.1109/LSP.2022.3217682
Indexed by: SCI
Language: English
Funding projects: National Natural Science Foundation of China [61976003]; National Natural Science Foundation of China [62076003]; Anhui Provincial Key Research and Development Program [202104d07020008]; Open Project Program of the National Laboratory of Pattern Recognition (NLPR); Gaofeng Discipline Construction Project (Computer Science and Technology) [Z010111016]
Funders: National Natural Science Foundation of China; Anhui Provincial Key Research and Development Program; Open Project Program of the National Laboratory of Pattern Recognition (NLPR); Gaofeng Discipline Construction Project (Computer Science and Technology)
WOS research area: Engineering
WOS category: Engineering, Electrical & Electronic
WOS accession number: WOS:000880641600004
Publisher: IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC
Citation statistics
Times cited: 2 [WOS]
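Document type: Journal article
Item identifier: http://ir.ia.ac.cn/handle/173211/50679
Collection: Center for Research on Intelligent Perception and Computing (智能感知与计算研究中心)
Corresponding author: Li, Chenglong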
Author affiliations:
1. Anhui Univ, Sch Comp Sci & Technol, Anhui Prov Key Lab Multimodal Cognit Computat, Hefei 230601, Peoples R China
2. Anhui Univ, Sch Artificial Intelligence, Informat Mat & Intelligent Sensing Lab Anhui Prov, Anhui Prov Key Lab Multimodal Cognit Computat, Hefei 230601, Peoples R China
3. Chinese Acad Sci, Inst Automat, Natl Lab Pattern Recognit, Beijing 100190, Peoples R China
Recommended citation
GB/T 7714: Li, Shangze, Lu, Andong, Huang, Yan, et al. Joint Token and Feature Alignment Framework for Text-Based Person Search[J]. IEEE SIGNAL PROCESSING LETTERS, 2022, 29: 2238-2242.
APA: Li, Shangze, Lu, Andong, Huang, Yan, Li, Chenglong, & Wang, Liang. (2022). Joint Token and Feature Alignment Framework for Text-Based Person Search. IEEE SIGNAL PROCESSING LETTERS, 29, 2238-2242.
MLA: Li, Shangze, et al. "Joint Token and Feature Alignment Framework for Text-Based Person Search". IEEE SIGNAL PROCESSING LETTERS 29 (2022): 2238-2242.
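
Illustrative note: the abstract describes matching text descriptions to person images after their features have been aligned. As a rough sketch only, and not the authors' implementation, the snippet below shows the generic retrieval step that such a shared feature space enables: ranking gallery image embeddings against a text-query embedding by cosine similarity. The function names `cosine` and `rank_gallery` and all vector values are hypothetical and chosen purely for illustration.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def rank_gallery(text_feat, image_feats):
    """Return gallery indices sorted from most to least similar to the query."""
    scores = [cosine(text_feat, img) for img in image_feats]
    return sorted(range(len(image_feats)), key=lambda i: scores[i], reverse=True)


# Toy embeddings (assumed values; real features would come from trained
# text and image encoders that project into the same space).
text_query = [1.0, 0.0, 1.0]
gallery = [
    [0.0, 1.0, 0.0],  # orthogonal to the query
    [0.9, 0.1, 1.1],  # nearly parallel to the query
    [1.0, 1.0, 0.0],  # partially overlapping
]

ranking = rank_gallery(text_query, gallery)
print(ranking)  # [1, 2, 0]
```

In this toy setup the second gallery entry ranks first because it is nearly parallel to the query. Rank-k accuracy, a common metric for this kind of retrieval, would check whether the correct identity appears among the first k indices of such a ranking.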