Adversarial-Metric Learning for Audio-Visual Cross-Modal Matching
Zheng, Aihua [1,2]; Hu, Menglan [1,2]; Jiang, Bo [1,2,3]; Huang, Yan [4]; Yan, Yan [5]; Luo, Bin [1,2]
Journal: IEEE TRANSACTIONS ON MULTIMEDIA
ISSN: 1520-9210
Publication year: 2022
Volume: 24; Pages: 338-351
Corresponding author: Jiang, Bo (jiangbo@ahu.edu.cn)
Abstract: Audio-visual matching aims to learn the intrinsic correspondence between an image and an audio clip. Existing works mainly concentrate on learning discriminative features while ignoring the cross-modal heterogeneity between the audio and visual modalities. To deal with this issue, we propose a novel Adversarial-Metric Learning (AML) model for audio-visual matching. AML aims to generate a modality-independent representation for each person in each modality via adversarial learning, while simultaneously learning a robust similarity measure for cross-modality matching via metric learning. By integrating the discriminative modality-independent representation and robust cross-modality metric learning into an end-to-end trainable deep network, AML can overcome the heterogeneity issue with promising performance for audio-visual matching. Experiments on various audio-visual learning tasks, including audio-visual matching, audio-visual verification and audio-visual retrieval, on a benchmark dataset demonstrate the effectiveness of the proposed AML model. The implementation code is available at https://github.com/MLanHu/AML.
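To make the idea in the abstract concrete, here is a minimal NumPy sketch of how an adversarial term and a metric term might be combined into one training objective. It is not the authors' implementation (see the repository linked in the abstract for that). The triplet loss, the linear modality discriminator, the weight `lam`, and all function names are assumptions chosen only for illustration.

```python
import numpy as np

def l2_normalize(x, eps=1e-12):
    # Scale each row to unit length so distances are comparable across modalities.
    return x / np.maximum(np.linalg.norm(x, axis=1, keepdims=True), eps)

def triplet_loss(anchor, positive, negative, margin=0.2):
    # Metric term (assumed form): matching pairs should be closer than
    # mismatched pairs by at least `margin` in squared Euclidean distance.
    d_pos = np.sum((anchor - positive) ** 2, axis=1)
    d_neg = np.sum((anchor - negative) ** 2, axis=1)
    return float(np.mean(np.maximum(0.0, d_pos - d_neg + margin)))

def modality_bce(logits, labels):
    # Binary cross-entropy with logits for a modality discriminator
    # (label 1 = audio, 0 = visual), in a numerically stable form.
    return float(np.mean(np.logaddexp(0.0, logits) - labels * logits))

def combined_losses(audio_emb, face_pos, face_neg, w, b, lam=0.1):
    # audio_emb: audio embeddings; face_pos / face_neg: face embeddings of the
    # same / a different identity. w, b: hypothetical linear discriminator.
    a, p, n = (l2_normalize(x) for x in (audio_emb, face_pos, face_neg))
    metric = triplet_loss(a, p, n)
    feats = np.vstack([a, p])
    labels = np.concatenate([np.ones(len(a)), np.zeros(len(p))])
    disc = modality_bce(feats @ w + b, labels)
    # The discriminator minimizes `disc`; the encoders minimize the metric
    # loss while trying to increase `disc`, i.e. to hide the modality.
    encoder_objective = metric - lam * disc
    return metric, disc, encoder_objective
```

With an uninformative discriminator (`w` and `b` all zero) the discriminator loss equals log 2, which is the value the encoders push toward when the two modalities become indistinguishable in the shared space.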
Keywords: Visualization; Task analysis; Measurement; Speech recognition; Videos; Location awareness; Image recognition; Adversarial learning; audio-visual matching; cross-modal learning; metric learning
DOI: 10.1109/TMM.2021.3050089
Keywords [WOS]: FACE; IDENTITY; SPEECH; VOICE
Indexed by: SCI
Language: English
Funding projects: Major Project for New Generation of AI [2018AAA0100400]; National Natural Science Foundation of China [61976002]; National Natural Science Foundation of China [62076004]; Natural Science Foundation of Anhui Higher Education Institutions of China [KJ2019A0033]; Open Project Program of the National Laboratory of Pattern Recognition (NLPR) [201900046]; Cooperative Research Project Program of Nanjing Artificial Intelligence Chip Research, Institute of Automation, Chinese Academy of Sciences
Funding organizations: Major Project for New Generation of AI; National Natural Science Foundation of China; Natural Science Foundation of Anhui Higher Education Institutions of China; Open Project Program of the National Laboratory of Pattern Recognition (NLPR); Cooperative Research Project Program of Nanjing Artificial Intelligence Chip Research, Institute of Automation, Chinese Academy of Sciences
WOS research areas: Computer Science; Telecommunications
WOS categories: Computer Science, Information Systems; Computer Science, Software Engineering; Telecommunications
WOS accession number: WOS:000745524300026
Publisher: IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC
Citation statistics
Times cited: 28 [WOS]
Document type: Journal article
Item identifier: http://ir.ia.ac.cn/handle/173211/47344
Collection: Pattern Recognition Laboratory
Corresponding author: Jiang, Bo
Author affiliations:
1. Minist Educ, Key Lab Intelligent Comp & Signal Proc, Anhui Prov Key Lab Multimodal Cognit Computat, Hefei, Peoples R China
2. Anhui Univ, Sch Comp Sci & Technol, Hefei 230601, Peoples R China
3. Anhui Univ, Inst Phys Sci & Informat Technol, Hefei 230601, Peoples R China
4. Chinese Acad Sci, Inst Automat, Natl Lab Pattern Recognit, Beijing 100190, Peoples R China
5. IIT, Dept Comp Sci, Chicago, IL 60616 USA
Recommended citation
GB/T 7714: Zheng, Aihua, Hu, Menglan, Jiang, Bo, et al. Adversarial-Metric Learning for Audio-Visual Cross-Modal Matching[J]. IEEE TRANSACTIONS ON MULTIMEDIA, 2022, 24: 338-351.
APA: Zheng, Aihua, Hu, Menglan, Jiang, Bo, Huang, Yan, Yan, Yan, & Luo, Bin. (2022). Adversarial-Metric Learning for Audio-Visual Cross-Modal Matching. IEEE TRANSACTIONS ON MULTIMEDIA, 24, 338-351.
MLA: Zheng, Aihua, et al. "Adversarial-Metric Learning for Audio-Visual Cross-Modal Matching." IEEE TRANSACTIONS ON MULTIMEDIA 24 (2022): 338-351.