CASIA OpenIR

浏览/检索结果: 共412条,第1-10条 帮助

限定条件    
已选(0)清除 条数/页:   排序方式:
Scene text recognition via dual character counting-aware visual and semantic modeling network 期刊论文
SCIENCE CHINA-INFORMATION SCIENCES, 2024, 卷号: 67, 期号: 3, 页码: 2
作者:  Xiao, Ke;  Zhu, Anna;  Iwana, Brian Kenji;  Liu, Cheng-Lin
收藏  |  浏览/下载:47/0  |  提交时间:2024/03/13
VLP2MSA: Expanding vision-language pre-training to multimodal sentiment analysis 期刊论文
KNOWLEDGE-BASED SYSTEMS, 2024, 卷号: 283, 页码: 9
作者:  Yi, Guofeng;  Fan, Cunhang;  Zhu, Kang;  Lv, Zhao;  Liang, Shan;  Wen, Zhengqi;  Pei, Guanxiong;  Li, Taihao;  Tao, Jianhua
收藏  |  浏览/下载:53/0  |  提交时间:2024/02/22
Multimodal sentiment analysis  Vision-language  Multimodal fusion  
Biphasic Face Photo-Sketch Synthesis via Semantic-Driven Generative Adversarial Network With Graph Representation Learning 期刊论文
IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2023, 页码: 14
作者:  Qi, Xingqun;  Sun, Muyi;  Wang, Zijian;  Liu, Jiaming;  Li, Qi;  Zhao, Fang;  Zhang, Shanghang;  Shan, Caifeng
Adobe PDF(6718Kb)  |  收藏  |  浏览/下载:74/29  |  提交时间:2024/02/22
Face photo-sketch synthesis  generative adversarial network  graph representation learning  intraclass and interclass  iterative cycle training (ICT)  
Large sequence models for sequential decision-making: a survey 期刊论文
FRONTIERS OF COMPUTER SCIENCE, 2023, 卷号: 17, 期号: 6, 页码: 18
作者:  Wen, Muning;  Lin, Runji;  Wang, Hanjing;  Yang, Yaodong;  Wen, Ying;  Mai, Luo;  Wang, Jun;  Zhang, Haifeng;  Zhang, Weinan
收藏  |  浏览/下载:84/0  |  提交时间:2023/11/17
sequential decision-making  sequence modeling  the Transformer  training system  
Multi-modal spatial relational attention networks for visual question answering 期刊论文
IMAGE AND VISION COMPUTING, 2023, 卷号: 140, 页码: 13
作者:  Yao, Haibo;  Wang, Lipeng;  Cai, Chengtao;  Sun, Yuxin;  Zhang, Zhi;  Luo, Yongkang
收藏  |  浏览/下载:46/0  |  提交时间:2024/02/22
Visual question answering  Spatial relation  Attention mechanism  Pre -training strategy  
GAN-Based Facial Attribute Manipulation 期刊论文
IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2023, 卷号: 45, 期号: 12, 页码: 14590-14610
作者:  Liu, Yunfan;  Li, Qi;  Deng, Qiyao;  Sun, Zhenan;  Yang, Ming-Hsuan
Adobe PDF(15297Kb)  |  收藏  |  浏览/下载:33/10  |  提交时间:2024/02/22
Generative adversarial networks  image translation  facial attribute manipulation  
So Many Heads, So Many Wits: Multimodal Graph Reasoning for Text-Based Visual Question Answering 期刊论文
IEEE TRANSACTIONS ON SYSTEMS MAN CYBERNETICS-SYSTEMS, 2023, 页码: 12
作者:  Zheng, Wenbo;  Yan, Lan;  Wang, Fei-Yue
收藏  |  浏览/下载:70/0  |  提交时间:2023/12/21
Graph attention  graph reasoning  multimodal graph  self-attention  text-based visual question answering  
GAIA-Universe: Everything is Super-Netify 期刊论文
IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2023, 卷号: 45, 期号: 10, 页码: 11856-11868
作者:  Peng, Junran;  Chang, Qing;  Yin, Haoran;  Bu, Xingyuan;  Sun, Jiajun;  Xie, Lingxi;  Zhang, Xiaopeng;  Tian, Qi;  Zhang, Zhaoxiang
收藏  |  浏览/下载:79/0  |  提交时间:2023/11/16
Computer vision  object detection  semantic segmentation  AutoML  
Knowledge-Embedded Mutual Guidance for Visual Reasoning 期刊论文
IEEE TRANSACTIONS ON CYBERNETICS, 2023, 页码: 13
作者:  Zheng, Wenbo;  Yan, Lan;  Chen, Long;  Li, Qiang;  Wang, Fei-Yue
收藏  |  浏览/下载:91/0  |  提交时间:2023/11/16
Attention model  joint learning  knowledge embedding  visual reasoning  
Understanding and Mitigating Overfitting in Prompt Tuning for Vision-Language Models 期刊论文
IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, 2023, 卷号: 33, 期号: 9, 页码: 4616-4629
作者:  Ma, Chengcheng;  Liu, Yang;  Deng, Jiankang;  Xie, Lingxi;  Dong, Weiming;  Xu, Changsheng
Adobe PDF(1644Kb)  |  收藏  |  浏览/下载:91/12  |  提交时间:2023/11/16
Vision-language model  prompt tuning  over-fitting  subspace learning  gradient projection