CASIA OpenIR

Browse/Search Results:  1-10 of 48

SgVA-CLIP: Semantic-Guided Visual Adapting of Vision-Language Models for Few-Shot Image Classification (Journal article)
IEEE TRANSACTIONS ON MULTIMEDIA, 2024, Volume: 26, Pages: 3469-3480
Authors:  Peng, Fang;  Yang, Xiaoshan;  Xiao, Linhui;  Wang, Yaowei;  Xu, Changsheng
View/Download:6/0  |  Submit date:2024/07/03
Few-shot  image classification  vision-language models  
Part-aware Prompt Tuning For Weakly Supervised Referring Expression Grounding (Conference paper)
, Amsterdam, 2024-1-29
Authors:  Chenlin, Zhao;  Jiabo, Ye;  Yaguang, Song;  Ming, Yan;  Xiaoshan, Yang;  Changsheng, Xu
Adobe PDF(6114Kb)  |  View/Download:26/9  |  Submit date:2024/06/21
CLIP-VG: Self-Paced Curriculum Adapting of CLIP for Visual Grounding (Journal article)
IEEE TRANSACTIONS ON MULTIMEDIA, 2024, Volume: 26, Pages: 4334-4347
Authors:  Xiao, Linhui;  Yang, Xiaoshan;  Peng, Fang;  Yan, Ming;  Wang, Yaowei;  Xu, Changsheng
View/Download:28/0  |  Submit date:2024/05/30
Grounding  Reliability  Adaptation models  Task analysis  Visualization  Data models  Annotations  Visual grounding  curriculum learning  pseudo-language label  vision-language models  
Zero-Shot Predicate Prediction for Scene Graph Parsing (Journal article)
IEEE TRANSACTIONS ON MULTIMEDIA, 2023, Volume: 25, Pages: 3140-3153
Authors:  Li, Yiming;  Yang, Xiaoshan;  Huang, Xuhui;  Ma, Zhe;  Xu, Changsheng
View/Download:164/0  |  Submit date:2023/11/17
Deep learning  zero-shot  scene graph  
Multi-Source Knowledge Reasoning Graph Network for Multi-Modal Commonsense Inference (Journal article)
ACM TRANSACTIONS ON MULTIMEDIA COMPUTING COMMUNICATIONS AND APPLICATIONS, 2023, Volume: 19, Issue: 4, Pages: 17
Authors:  Ma, Xuan;  Yang, Xiaoshan;  Xu, Changsheng
View/Download:102/0  |  Submit date:2023/11/17
Knowledge reasoning  multi-modal commonsense inference  graph neural network  
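Cross-Modal Multi-View Self-Supervised Heterogeneous Graph Network for Personalized Recipe Recommendation (跨模态多视角自监督的个性化食谱推荐异构图网络) (Journal article)
Journal of Computer-Aided Design & Computer Graphics (计算机辅助设计与图形学学报), 2023, Volume: 35, Issue: 3, Pages: 413-422
Authors:  Song, Yaguang;  Yang, Xiaoshan;  Xu, Changsheng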
Adobe PDF(854Kb)  |  View/Download:239/90  |  Submit date:2023/06/26
Food recommendation  heterogeneous graph  self-supervised learning  multi-view  cross-modal  
Relative Alignment Network for Source-Free Multimodal Video Domain Adaptation (Conference paper)
MM '22: Proceedings of the 30th ACM International Conference on Multimedia, Lisboa, Portugal, 2022.10.10—2022.10.14
Authors:  Huang Yi;  Yang Xiaoshan;  Zhang Ji;  Xu Changsheng
Adobe PDF(1264Kb)  |  View/Download:218/87  |  Submit date:2023/06/21
Multimodal Global Relation Knowledge Distillation for Egocentric Action Anticipation (Conference paper)
MM '21: Proceedings of the 29th ACM International Conference on Multimedia, Chengdu, China, 2021.10.20—2021.10.24
Authors:  Huang Yi;  Yang Xiaoshan;  Xu Changsheng
Adobe PDF(1162Kb)  |  View/Download:189/74  |  Submit date:2023/06/21
Self-supervised Calorie-aware Heterogeneous Graph Networks for Food Recommendation (Journal article)
ACM Transactions on Multimedia Computing, Communications, and Applications, 2023, Volume: 19, Issue: 1s, Pages: 1-23
Authors:  Song, Yaguang;  Yang, Xiaoshan;  Xu, Changsheng
Adobe PDF(1381Kb)  |  View/Download:215/66  |  Submit date:2023/06/12
Food recommendation  recipe calories  heterogeneous graph  self-supervised learning  
Recovering Generalization via Pre-training-like Knowledge Distillation for Out-of-Distribution Visual Question Answering (Journal article)
IEEE Transactions on Multimedia, 2023, Volume: 26, Pages: 1-15
Authors:  Song, Yaguang;  Yang, Xiaoshan;  Wang, Yaowei;  Xu, Changsheng
Adobe PDF(2397Kb)  |  View/Download:199/50  |  Submit date:2023/06/12
Multi-modal Foundation Model  Out-of-Distribution Generalization  Visual Question Answering  Knowledge Distillation