Attribute-Guided Cross-Modal Interaction and Enhancement for Audio-Visual Matching
Wang, Jiaxiang1; Zheng, Aihua2,3; Yan, Yan4; He, Ran5,6,7; Tang, Jin1
Journal: IEEE TRANSACTIONS ON INFORMATION FORENSICS AND SECURITY
ISSN: 1556-6013
Year: 2024
Volume: 19; Pages: 4986-4998
Corresponding author: Zheng, Aihua (ahzheng214@foxmail.com)
Abstract: Audio-visual matching is an essential task that measures the correlation between audio clips and visual images. However, current methods rely solely on the joint embedding of global features from audio clips and face image pairs to learn semantic correlations. This approach overlooks the importance of high-confidence correlations and discrepancies of local subtle features, which are crucial for cross-modal matching. To address this issue, we propose a novel Attribute-guided Cross-modal Interaction and Enhancement Network (ACIENet), which employs multiple attributes to explore the associations of different key local subtle features. The ACIENet contains two novel modules: the Attribute-guided Interaction (AGI) module and the Attribute-guided Enhancement (AGE) module. The AGI module employs global feature alignment similarity to guide cross-modal local feature interactions, which enhances cross-modal association features for the same identity and expands cross-modal distinctive features for different identities. Additionally, the interactive features and original features are fused to ensure intra-class discriminability and inter-class correspondence. The AGE module captures subtle attribute-related features by using an attribute-driven network, thereby enhancing discrimination at the attribute level. Specifically, it strengthens the combined attribute-related features of gender and nationality. To prevent interference between multiple attribute features, we design a multi-attribute learning network as a parallel framework. Experiments conducted on a public benchmark dataset demonstrate the efficacy of the ACIENet method in different scenarios. Code and models are available at https://github.com/w1018979952/ACIENet.
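The abstract's description of the AGI module (global alignment similarity guiding local cross-modal interaction, followed by fusion with the original features) can be illustrated with a small sketch. This is not the authors' implementation, which is available at the repository linked above. The function names (`guided_interaction`, `cosine`, `softmax`), the use of softmax attention over local features, and the additive residual fusion are all assumptions made for illustration only.

```python
import math


def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def cosine(u, v):
    """Cosine similarity between two non-zero, equal-length vectors."""
    return dot(u, v) / (math.sqrt(dot(u, u)) * math.sqrt(dot(v, v)))


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def guided_interaction(audio_global, face_global, audio_local, face_local):
    """Hypothetical sketch: gate cross-modal attention by global similarity.

    audio_local and face_local are lists of local feature vectors that
    share one dimensionality. Returns the audio local features fused
    with face features attended from the other modality.
    """
    # Assumed gate: cosine similarity of the two global embeddings.
    gate = cosine(audio_global, face_global)
    fused = []
    for a in audio_local:
        # Attention weights of this audio feature over all face features.
        weights = softmax([dot(a, f) for f in face_local])
        attended = [
            sum(w * f[i] for w, f in zip(weights, face_local))
            for i in range(len(a))
        ]
        # Assumed fusion: original feature plus gated interactive feature.
        fused.append([x + gate * y for x, y in zip(a, attended)])
    return fused
```

In this sketch, a pair with well-aligned global embeddings receives a strong contribution from the other modality, while a pair with orthogonal global embeddings keeps its original local features unchanged. The symmetric face-to-audio direction would follow the same pattern with the arguments swapped.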
Keywords: Audio-visual cross-modal matching; attribute-guided cross-modal interaction; attribute-guided cross-modal enhancement
DOI: 10.1109/TIFS.2024.3388949
Keywords [WOS]: FACE; VOICE; RECOGNITION; IDENTITY
Indexed by: SCI
Language: English
Funding project: National Natural Science Foundation of China
Funding organization: National Natural Science Foundation of China
WOS research area: Computer Science; Engineering
WOS subject category: Computer Science, Theory & Methods; Engineering, Electrical & Electronic
WOS accession number: WOS:001216477200006
Publisher: IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC
Document type: Journal article
Identifier: http://ir.ia.ac.cn/handle/173211/58361
Collection: State Key Laboratory of Multimodal Artificial Intelligence Systems
Corresponding author: Zheng, Aihua
Affiliations:
1. Anhui Univ, Sch Comp Sci & Technol, Anhui Prov Key Lab Multimodal Cognit Computat, Hefei 230601, Peoples R China
2. Anhui Univ, Sch Artificial Intelligence, Informat Mat & Intelligent Sensing Lab Anhui Prov, Hefei 230601, Peoples R China
3. Anhui Univ, Sch Artificial Intelligence, Anhui Prov Key Lab Multimodal Cognit Computat, Hefei 230601, Peoples R China
4. IIT, Dept Comp Sci, Chicago, IL 60616 USA
5. Univ Chinese Acad Sci, Sch Artificial Intelligence, Beijing 101408, Peoples R China
6. Chinese Acad Sci, Inst Automat, Ctr Res Intelligent Percept & Comp, Natl Lab Pattern Recognit, Beijing 100049, Peoples R China
7. CAS Ctr Excellence Brain Sci & Intelligence Techno, Beijing 100190, Peoples R China
Recommended citation
GB/T 7714
Wang, Jiaxiang,Zheng, Aihua,Yan, Yan,et al. Attribute-Guided Cross-Modal Interaction and Enhancement for Audio-Visual Matching[J]. IEEE TRANSACTIONS ON INFORMATION FORENSICS AND SECURITY,2024,19:4986-4998.
APA Wang, Jiaxiang,Zheng, Aihua,Yan, Yan,He, Ran,&Tang, Jin.(2024).Attribute-Guided Cross-Modal Interaction and Enhancement for Audio-Visual Matching.IEEE TRANSACTIONS ON INFORMATION FORENSICS AND SECURITY,19,4986-4998.
MLA Wang, Jiaxiang,et al."Attribute-Guided Cross-Modal Interaction and Enhancement for Audio-Visual Matching".IEEE TRANSACTIONS ON INFORMATION FORENSICS AND SECURITY 19(2024):4986-4998.