CASIA OpenIR

Browse/Search Results: 1-7 of 7

SgVA-CLIP: Semantic-Guided Visual Adapting of Vision-Language Models for Few-Shot Image Classification (Journal article)
IEEE TRANSACTIONS ON MULTIMEDIA, 2024, Volume: 26, Pages: 3469-3480
Authors:  Peng, Fang;  Yang, Xiaoshan;  Xiao, Linhui;  Wang, Yaowei;  Xu, Changsheng
Submit date: 2024/07/03
Few-shot  image classification  vision-language models  
CLIP-VG: Self-Paced Curriculum Adapting of CLIP for Visual Grounding (Journal article)
IEEE TRANSACTIONS ON MULTIMEDIA, 2024, Volume: 26, Pages: 4334-4347
Authors:  Xiao, Linhui;  Yang, Xiaoshan;  Peng, Fang;  Yan, Ming;  Wang, Yaowei;  Xu, Changsheng
Submit date: 2024/05/30
Grounding  Reliability  Adaptation models  Task analysis  Visualization  Data models  Annotations  Visual grounding  curriculum learning  pseudo-language label  vision-language models
Large-scale Multi-modal Pre-trained Models: A Comprehensive Survey (Journal article)
Machine Intelligence Research, 2023, Volume: 20, Issue: 4, Pages: 447-482
Authors:  Xiao Wang;  Guangyao Chen;  Guangwu Qian;  Pengcheng Gao;  Xiao-Yong Wei;  Yaowei Wang;  Yonghong Tian;  Wen Gao
Adobe PDF (3540 KB)  |  Submit date: 2024/04/23
Multi-modal (MM), pre-trained model (PTM), information fusion, representation learning, deep learning  
AAformer: Auto-Aligned Transformer for Person Re-Identification (Journal article)
IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2023, Pages: 11
Authors:  Zhu, Kuan;  Guo, Haiyun;  Zhang, Shiliang;  Wang, Yaowei;  Liu, Jing;  Wang, Jinqiao;  Tang, Ming
Submit date: 2023/11/16
Auto-alignment  part-level representation  person re-identification (re-ID)  transformer  
Recovering Generalization via Pre-training-like Knowledge Distillation for Out-of-Distribution Visual Question Answering (Journal article)
IEEE Transactions on Multimedia, 2023, Volume: 26, Pages: 1-15
Authors:  Song, Yaguang;  Yang, Xiaoshan;  Wang, Yaowei;  Xu, Changsheng
Adobe PDF (2397 KB)  |  Submit date: 2023/06/12
Multi-modal Foundation Model  Out-of-Distribution Generalization  Visual Question Answering  Knowledge Distillation  
Towards More Flexible and Accurate Object Tracking with Natural Language: Algorithms and Benchmark (Conference paper)
Virtual, 2021-7
Authors:  Wang, Xiao;  Shu, Xiujun;  Zhang, Zhipeng;  Jiang, Bo;  Wang, Yaowei;  Tian, Yonghong;  Wu, Feng
Adobe PDF (5464 KB)  |  Submit date: 2022/06/14
Large Batch Optimization for Object Detection: Training COCO in 12 minutes (Conference paper)
Online, 2020-8-24
Authors:  Wang, Tong;  Zhu, Yousong;  Zhao, Chaoyang;  Zeng, Wei;  Wang, Yaowei;  Wang, Jinqiao;  Tang, Ming
Adobe PDF (3706 KB)  |  Submit date: 2022/04/01
Object detection  Large batch optimization  Periodical moments decay