CASIA OpenIR
(This search is based on the user's claimed works)

Browse/Search results: 18 items in total, showing items 1-10

SgVA-CLIP: Semantic-Guided Visual Adapting of Vision-Language Models for Few-Shot Image Classification | Journal article
IEEE TRANSACTIONS ON MULTIMEDIA, 2024, Volume: 26, Pages: 3469-3480
Authors: Peng, Fang; Yang, Xiaoshan; Xiao, Linhui; Wang, Yaowei; Xu, Changsheng
Views/Downloads: 7/0 | Submitted: 2024/07/03
Few-shot  image classification  vision-language models  
Reducing Vision-Answer Biases for Multiple-Choice VQA | Journal article
IEEE TRANSACTIONS ON IMAGE PROCESSING, 2023, Volume: 32, Pages: 4621-4634
Authors: Zhang, Xi; Zhang, Feifei; Xu, Changsheng
Adobe PDF(2684Kb) | Views/Downloads: 91/6 | Submitted: 2023/11/17
Multiple-choice VQA  vision-answer bias  causal intervention  counterfactual interaction learning  
Recovering Generalization via Pre-training-like Knowledge Distillation for Out-of-Distribution Visual Question Answering | Journal article
IEEE Transactions on Multimedia, 2023, Volume: 26, Pages: 1-15
Authors: Song, Yaguang; Yang, Xiaoshan; Wang, Yaowei; Xu, Changsheng
Adobe PDF(2397Kb) | Views/Downloads: 200/50 | Submitted: 2023/06/12
Multi-modal Foundation Model  Out-of-Distribution Generalization  Visual Question Answering  Knowledge Distillation  
Weakly-Supervised Video Object Grounding Via Learning Uni-Modal Associations | Journal article
IEEE Transactions on Multimedia, 2022, Volume: 25, Pages: 1-12
Authors: Wang, Wei; Gao, Junyu; Xu, Changsheng
Adobe PDF(5406Kb) | Views/Downloads: 140/41 | Submitted: 2023/04/25
Visualization  Grounding  Task analysis  Prototypes  Annotations  Uncertainty  Proposals  Cross-modal retrieval  weakly-supervised learning  video object grounding  uni-modal association  
Explicit Cross-Modal Representation Learning for Visual Commonsense Reasoning | Journal article
IEEE TRANSACTIONS ON MULTIMEDIA, 2022, Volume: 24, Pages: 2986-2997
Authors: Zhang, Xi; Zhang, Feifei; Xu, Changsheng
Adobe PDF(5681Kb) | Views/Downloads: 416/4 | Submitted: 2022/07/25
Cognition  Video recording  Syntactics  Visualization  Task analysis  Semantics  Linguistics  Visual Commonsense Reasoning  explicit reasoning  syntactic structure  interpretability  
Holographic Feature Learning of Egocentric-Exocentric Videos for Multi-Domain Action Recognition | Journal article
IEEE TRANSACTIONS ON MULTIMEDIA, 2022, Volume: 24, Pages: 2273-2286
Authors: Huang, Yi; Yang, Xiaoshan; Gao, Junyu; Xu, Changsheng
Adobe PDF(2409Kb) | Views/Downloads: 378/75 | Submitted: 2022/07/25
Videos  Feature extraction  Visualization  Task analysis  Computational modeling  Target recognition  Prototypes  Egocentric videos  exocentric videos  holographic feature  multi-domain  action recognition  
Towards Corruption-Agnostic Robust Domain Adaptation | Journal article
ACM TRANSACTIONS ON MULTIMEDIA COMPUTING COMMUNICATIONS AND APPLICATIONS, 2022, Volume: 18, Issue: 4, Pages: 16
Authors: Xu, Yifan; Sheng, Kekai; Dong, Weiming; Wu, Baoyuan; Xu, Changsheng; Hu, Bao-Gang
Adobe PDF(2116Kb) | Views/Downloads: 471/101 | Submitted: 2022/06/10
Domain adaptation  corruption robustness  transfer learning  
Joint Expression Synthesis and Representation Learning for Facial Expression Recognition | Journal article
IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, 2022, Volume: 32, Issue: 3, Pages: 1681-1695
Authors: Zhang, Xi; Zhang, Feifei; Xu, Changsheng
Adobe PDF(4827Kb) | Views/Downloads: 269/3 | Submitted: 2022/06/06
Face recognition  Task analysis  Generative adversarial networks  Image synthesis  Image recognition  Faces  Training  Facial expression recognition  facial image synthesis  generative adversarial network  representation learning  
Emotion Knowledge Driven Video Highlight Detection | Journal article
IEEE TRANSACTIONS ON MULTIMEDIA, 2021, Volume: 23, Pages: 3999-4013
Authors: Qi, Fan; Yang, Xiaoshan; Xu, Changsheng
Views/Downloads: 228/0 | Submitted: 2021/12/28
Visualization  Training data  Predictive models  Training  Semantics  Emotion recognition  Computational modeling  Deep ranking  knowledge graph  video highlight detection  
Transformers in computational visual media: A survey | Journal article
Computational Visual Media, 2021, Volume: 8, Issue: 1, Pages: 33-62
Authors: Xu, Yifan; Wei, Huapeng; Lin, Minxuan; Deng, Yingying; Sheng, Kekai; Zhang, Mengdan; Tang, Fan; Dong, Weiming; Huang, Feiyue; Xu, Changsheng
Adobe PDF(5366Kb) | Views/Downloads: 329/47 | Submitted: 2021/12/28
visual transformer  computational visual media (CVM)  high-level vision  low-level vision  image generation  multi-modal learning