A Two-Stream Hybrid Convolution-Transformer Network Architecture for Clothing-Change Person Re-Identification | |
Wu, Junyi1; Huang, Yan2; Gao, Min3; Gao, Zhipeng1; Zhao, Jianqiang1; Zhang, Huiji1; Zhang, Anguo4,5,6,7 | |
Journal | IEEE TRANSACTIONS ON MULTIMEDIA |
ISSN | 1520-9210 |
2024 | |
Volume | 26; Pages: 5326-5339 |
Corresponding author | Huang, Yan(yan.huang@cripac.ia.ac.cn) ; Zhang, Anguo(anguo.zhang@hotmail.com) |
Abstract | Long-term (also called clothing-change) person re-identification (CC-reID) aims to confirm the identity of pedestrians captured at diverse locations and/or times. Current CC-reID methods rely heavily on ID features learned by CNN architectures. However, with their limited receptive fields, CNNs struggle to capture unique but discriminative ID features (e.g., hairstyle, tattoos, and accessories) from small body regions. Compared with CNNs, Transformers have certain merits in exploring more diverse ID-unique features and retaining more detail, owing to the multi-head self-attention design and the removal of down-sampling operations. In this paper, a two-stream hybrid Convolution-Transformer Network (CT-Net) is proposed for CC-reID that combines a CNN and a Transformer in parallel in an end-to-end learning scheme. Specifically, CT-Net contains a CNN-based stream (C-Stream) and a Transformer-based stream (T-Stream). Rather than relying on the C-Stream alone, the T-Stream is used to encourage the C-Stream to explore more detailed ID-unique features when clothing information is not reliable in CC-reID. To this end, a Feature Supplement Module (FSM) is proposed to transfer features learned by the T-Stream to the C-Stream, from low level to high level, for mining more ID-unique features. To further enhance the discriminability and complementarity of the ID features learned by CT-Net, we also introduce hierarchical supervision with bilinear pooling (HSBP). Experimental results demonstrate that CT-Net performs favorably against state-of-the-art methods on three CC-reID benchmarks. CT-Net also demonstrates good generalization ability by achieving comparable performance on traditional person re-ID datasets such as Market-1501 and DukeMTMC-reID. |
Keywords | Clothing-change person re-identification ; ID-unique feature ; feature supplement module ; hierarchical supervision |
DOI | 10.1109/TMM.2023.3331569 |
Keywords [WOS] | RECOGNITION ; ATTENTION |
Indexed by | SCI |
Language | English |
Funding project | National Natural Science Foundation of China |
Funding organization | National Natural Science Foundation of China |
WOS research area | Computer Science ; Telecommunications |
WOS subject category | Computer Science, Information Systems ; Computer Science, Software Engineering ; Telecommunications |
WOS accession number | WOS:001189435600030 |
Publisher | IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC |
Document type | Journal article |
Item identifier | http://ir.ia.ac.cn/handle/173211/58121 |
Collection | Pattern Recognition Laboratory |
Corresponding author | Huang, Yan; Zhang, Anguo |
Author affiliations | 1.Xiamen Meiya Pico Informat Co Ltd, Xiamen Meiya Informat Secur Res Inst Co Ltd, AI Res Ctr, Xiamen 361008, Peoples R China; 2.Chinese Acad Sci, Inst Automat, Ctr Res Intelligent Percept & Comp, Natl Lab Pattern Recognit, Beijing 100045, Peoples R China; 3.Fuzhou Univ, Coll Phys & Informat Engn, Fujian Key Lab Intelligent Proc & Wireless Transmi, Fuzhou 350025, Peoples R China; 4.Anhui Univ, Sch Artificial Intelligence, Hefei 230039, Peoples R China; 5.Minist Educ, Res Ctr Autonomous Unmanned Syst Technol, Hefei 230039, Peoples R China; 6.Anhui Prov Engn Res Ctr Unmanned Syst & Intelligen, Hefei 230039, Peoples R China; 7.Univ Macau, Inst Microelect, Taipa 999078, Macao, Peoples R China |
Corresponding author affiliation | National Laboratory of Pattern Recognition |
Recommended citation GB/T 7714 | Wu, Junyi,Huang, Yan,Gao, Min,et al. A Two-Stream Hybrid Convolution-Transformer Network Architecture for Clothing-Change Person Re-Identification[J]. IEEE TRANSACTIONS ON MULTIMEDIA,2024,26:5326-5339. |
APA | Wu, Junyi.,Huang, Yan.,Gao, Min.,Gao, Zhipeng.,Zhao, Jianqiang.,...&Zhang, Anguo.(2024).A Two-Stream Hybrid Convolution-Transformer Network Architecture for Clothing-Change Person Re-Identification.IEEE TRANSACTIONS ON MULTIMEDIA,26,5326-5339. |
MLA | Wu, Junyi,et al."A Two-Stream Hybrid Convolution-Transformer Network Architecture for Clothing-Change Person Re-Identification".IEEE TRANSACTIONS ON MULTIMEDIA 26(2024):5326-5339. |