CASIA OpenIR

Browse/Search Results:  1-5 of 5

Large-scale Multi-modal Pre-trained Models: A Comprehensive Survey  Journal article
Machine Intelligence Research, 2023, Volume: 20, Issue: 4, Pages: 447-482
Authors:  Xiao Wang;  Guangyao Chen;  Guangwu Qian;  Pengcheng Gao;  Xiao-Yong Wei;  Yaowei Wang;  Yonghong Tian;  Wen Gao
Adobe PDF(3540Kb)  |  View/Download:57/12  |  Submit date:2024/04/23
Multi-modal (MM), pre-trained model (PTM), information fusion, representation learning, deep learning  
Masked Vision-language Transformer in Fashion  Journal article
Machine Intelligence Research, 2023, Volume: 20, Issue: 3, Pages: 421-434
Authors:  Ge-Peng Ji;  Mingchen Zhuge;  Dehong Gao;  Deng-Ping Fan;  Christos Sakaridis;  Luc Van Gool
Adobe PDF(2779Kb)  |  View/Download:23/7  |  Submit date:2024/04/23
Vision-language, masked image reconstruction, transformer, fashion, e-commercial  
Pre-training in Medical Data: A Survey  Journal article
Machine Intelligence Research, 2023, Volume: 20, Issue: 2, Pages: 147-149
Authors:  Yixuan Qiu;  Feng Lin;  Weitong Chen;  Miao Xu
Adobe PDF(2262Kb)  |  View/Download:49/14  |  Submit date:2024/04/23
Medical data, pre-training, transfer learning, self-supervised learning, medical image data, electrocardiograms (ECG) data
Second-Order Global Attention Networks for Graph Classification and Regression  Conference paper
Beijing, China, August 27-28, 2022
Authors:  Hu Fenyu;  Cui Zeyu;  Wu Shu;  Liu Qiang;  Wu Jinlin;  Wang Liang;  Tan Tieniu
Adobe PDF(69424Kb)  |  View/Download:221/71  |  Submit date:2023/07/06
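From Video to Language: Research on Video Captioning and Title Generation Methods  Thesis
Institute of Automation, Chinese Academy of Sciences: Institute of Automation, Chinese Academy of Sciences, 2022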
Authors:  Zhang Ziqi (张子琦)
Adobe PDF(19170Kb)  |  View/Download:1163/15  |  Submit date:2022/06/16
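Vision and language, video captioning, video title generation, external language model, open-book video captioning, Chinese short video-text benchmark, large-scale multi-modal pre-training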