CASIA OpenIR
(This search is based on the user's claimed works)

Browse/Search results: 27 items in total, showing items 1-10

BEVBert: Multimodal Map Pre-training for Language-guided Navigation (Conference paper)
Proceedings of the IEEE International Conference on Computer Vision, Paris, France, 2023-10-2
Authors:  Dong An;  Yuankai Qi;  Yangguang Li;  Yan Huang;  Liang Wang;  Tieniu Tan;  Jing Shao
Adobe PDF(1722Kb)  |  Views/Downloads: 27/7  |  Submitted: 2024/05/28
Neighbor-view Enhanced Model for Vision and Language Navigation (Conference paper)
Proceedings of the ACM International Conference on Multimedia, Chengdu, China, 2021-10-20
Authors:  Dong An;  Yuankai Qi;  Yan Huang;  Qi Wu;  Liang Wang;  Tieniu Tan
Adobe PDF(2412Kb)  |  Views/Downloads: 16/5  |  Submitted: 2024/05/28
Text-to-Image Vehicle Re-Identification: Multi-Scale Multi-View Cross-Modal Alignment Network and a Unified Benchmark (Journal article)
IEEE TRANSACTIONS ON INTELLIGENT TRANSPORTATION SYSTEMS, 2024, Pages: 14
Authors:  Ding, Leqi;  Liu, Lei;  Huang, Yan;  Li, Chenglong;  Zhang, Cheng;  Wang, Wei;  Wang, Liang
Views/Downloads: 53/0  |  Submitted: 2024/03/27
Keywords: Task analysis; Feature extraction; Visualization; Training; Electronic mail; Benchmark testing; Trajectory; Text-to-image vehicle re-identification; cross-modal alignment; multi-scale multi-view analysis; benchmark dataset
Latent Structure Mining With Contrastive Modality Fusion for Multimedia Recommendation (Journal article)
IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, 2023, Volume: 35, Issue: 9, Pages: 9154-9167
Authors:  Zhang, Jinghao;  Zhu, Yanqiao;  Liu, Qiang;  Zhang, Mengqi;  Wu, Shu;  Wang, Liang
Adobe PDF(1134Kb)  |  Views/Downloads: 128/2  |  Submitted: 2023/11/17
Keywords: Multimedia recommendation; graph structure learning; contrastive learning
Identifying Sinus Invasion in Meningioma Patients before Surgery with Deep Learning (Conference paper)
Online, 2022-4
Authors:  Qi Qiu;  Kai Sun;  Jing Zhang;  Panpan Liu;  Liang Wang;  Junting Zhang;  Junlin Zhou;  Zhenyu Liu;  Jie Tian
Adobe PDF(277Kb)  |  Views/Downloads: 183/48  |  Submitted: 2023/06/28
Keywords: Deep learning; Meningioma; Sinus invasion; Multimodal fusion
Joint Token and Feature Alignment Framework for Text-Based Person Search (Journal article)
IEEE SIGNAL PROCESSING LETTERS, 2022, Volume: 29, Pages: 2238-2242
Authors:  Li, Shangze;  Lu, Andong;  Huang, Yan;  Li, Chenglong;  Wang, Liang
Views/Downloads: 221/0  |  Submitted: 2022/12/27
Keywords: Feature extraction; Visualization; Representation learning; Logic gates; Image reconstruction; Transformers; Training; Cross-modal generation; feature alignment; text-based person search; token alignment; transformer
Towards Unconstrained Pointing Problem of Visual Question Answering: A Retrieval-based Method (Conference paper)
Beijing International Convention Center, 2018-08
Authors:  Cheng, Wenlong;  Huang, Yan;  Wang, Liang
Adobe PDF(351Kb)  |  Views/Downloads: 177/34  |  Submitted: 2022/06/14
A Reconstruction-based Visual-Acoustic-Semantic Embedding Method for Speech-Image Retrieval (Journal article)
IEEE Transactions on Multimedia, 2022, Pages: 14
Authors:  Cheng, Wenlong;  Tang, Wei;  Huang, Yan;  Luo, Yiwen;  Wang, Liang
Adobe PDF(1628Kb)  |  Views/Downloads: 261/93  |  Submitted: 2022/06/14
Mining Latent Structures for Multimedia Recommendation (Conference paper)
Chengdu, China, 2021.10.20-2021.10.24
Authors:  Zhang, Jinghao;  Zhu, Yanqiao;  Liu, Qiang;  Wu, Shu;  Wang, Shuhui;  Wang, Liang
Adobe PDF(3070Kb)  |  Views/Downloads: 194/52  |  Submitted: 2022/04/07
Multi-Modality Self-Distillation for Weakly Supervised Temporal Action Localization (Journal article)
IEEE TRANSACTIONS ON IMAGE PROCESSING, 2022, Volume: 31, Pages: 1504-1519
Authors:  Huang, Linjiang;  Wang, Liang;  Li, Hongsheng
Views/Downloads: 217/0  |  Submitted: 2022/03/17
Keywords: Location awareness; Reliability; Noise measurement; Annotations; Training; Head; Task analysis; Weakly supervised temporal action localization; multi-modality; pseudo label; self-distillation