CASIA OpenIR

Browse/Search Results: 648 records in total, showing records 1-10

Scene text recognition via dual character counting-aware visual and semantic modeling network [Journal article]
SCIENCE CHINA-INFORMATION SCIENCES, 2024, Volume: 67, Issue: 3, Pages: 2
Authors:  Xiao, Ke;  Zhu, Anna;  Iwana, Brian Kenji;  Liu, Cheng-Lin
Views/Downloads: 42/0  |  Submitted: 2024/03/13
Reparameterizing and dynamically quantizing image features for image generation [Journal article]
PATTERN RECOGNITION, 2024, Volume: 146, Pages: 11
Authors:  Sun, Mingzhen;  Wang, Weining;  Zhu, Xinxin;  Liu, Jing
Adobe PDF(3612Kb)  |  Views/Downloads: 79/7  |  Submitted: 2023/12/21
Vector quantization  Variational auto-encoder  Unconditional image generation  Text-to-image generation  Autoregressive generation  
VLP2MSA: Expanding vision-language pre-training to multimodal sentiment analysis [Journal article]
KNOWLEDGE-BASED SYSTEMS, 2024, Volume: 283, Pages: 9
Authors:  Yi, Guofeng;  Fan, Cunhang;  Zhu, Kang;  Lv, Zhao;  Liang, Shan;  Wen, Zhengqi;  Pei, Guanxiong;  Li, Taihao;  Tao, Jianhua
Views/Downloads: 42/0  |  Submitted: 2024/02/22
Multimodal sentiment analysis  Vision-language  Multimodal fusion  
GAN-Based Facial Attribute Manipulation [Journal article]
IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2023, Volume: 45, Issue: 12, Pages: 14590-14610
Authors:  Liu, Yunfan;  Li, Qi;  Deng, Qiyao;  Sun, Zhenan;  Yang, Ming-Hsuan
Adobe PDF(15297Kb)  |  Views/Downloads: 26/8  |  Submitted: 2024/02/22
Generative adversarial networks  image translation  facial attribute manipulation  
A New Lightweight Script Independent Scene Text Style Transfer Network [Journal article]
INTERNATIONAL JOURNAL OF PATTERN RECOGNITION AND ARTIFICIAL INTELLIGENCE, 2023, Pages: 29
Authors:  Shivakumara, Palaiahnakote;  Roy, Ayush;  Nandanwar, Lokesh;  Pal, Umapada;  Lu, Yue;  Liu, Cheng-Lin
Views/Downloads: 15/0  |  Submitted: 2024/02/22
Text detection  style transfer  CNN models  multi-lingual transfer  
VQAPT: A New visual question answering model for personality traits in social media images [Journal article]
PATTERN RECOGNITION LETTERS, 2023, Volume: 175, Pages: 66-73
Authors:  Biswas, Kunal;  Shivakumara, Palaiahnakote;  Pal, Umapada;  Liu, Cheng-Lin;  Lu, Yue
Views/Downloads: 26/0  |  Submitted: 2024/02/22
Personality trait images  Multimodal concept  Text recognition  Social media images  Natural language processing  Visual question answering  
SignParser: An End-to-End Framework for Traffic Sign Understanding [Journal article]
INTERNATIONAL JOURNAL OF COMPUTER VISION, 2023, Pages: 17
Authors:  Guo, Yunfei;  Feng, Wei;  Yin, Fei;  Liu, Cheng-Lin
Views/Downloads: 53/0  |  Submitted: 2023/12/21
Traffic sign understanding  Content reasoning  Semantic description generation  
Two Birds With One Stone: Knowledge-Embedded Temporal Convolutional Transformer for Depression Detection and Emotion Recognition [Journal article]
IEEE TRANSACTIONS ON AFFECTIVE COMPUTING, 2023, Volume: 14, Issue: 4, Pages: 2595-2613
Authors:  Zheng, Wenbo;  Yan, Lan;  Wang, Fei-Yue
Views/Downloads: 15/0  |  Submitted: 2024/03/27
Multimodal depression detection  multimodal emotion recognition  transformer  knowledge embedding  joint learning  
Understanding and Mitigating Overfitting in Prompt Tuning for Vision-Language Models [Journal article]
IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, 2023, Volume: 33, Issue: 9, Pages: 4616-4629
Authors:  Ma, Chengcheng;  Liu, Yang;  Deng, Jiankang;  Xie, Lingxi;  Dong, Weiming;  Xu, Changsheng
Adobe PDF(1644Kb)  |  Views/Downloads: 77/11  |  Submitted: 2023/11/16
Vision-language model  prompt tuning  over-fitting  subspace learning  gradient projection  
Unsupervised Dialogue State Tracking for End-to-End Task-Oriented Dialogue with a Multi-Span Prediction Network [Journal article]
JOURNAL OF COMPUTER SCIENCE AND TECHNOLOGY, 2023, Volume: 38, Issue: 4, Pages: 834-852
Authors:  Liu, Qing-Bin;  He, Shi-Zhu;  Liu, Cao;  Liu, Kang;  Zhao, Jun
Views/Downloads: 15/0  |  Submitted: 2024/02/22
end-to-end task-oriented dialogue  dialogue state tracking (DST)  unsupervised learning  reinforcement learning