CASIA OpenIR

Browse/Search results: 49 items in total, showing items 1-10

Health and Senior Care Video Moment Localization With Procedure Knowledge Distillation (Conference paper)
Istanbul, Turkiye, Dec 5-8
Authors: Chaochen Wu; Meiyun Zuo; Guan Luo; Yuna Jiang
Adobe PDF (3140Kb) | Views/Downloads: 12/6 | Submitted: 2024/06/05
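An Efficient Unsupervised Video Object Segmentation Network Based on Motion Guidance (Journal article)
Acta Automatica Sinica, 2023, Volume: 49, Issue: 4, Pages: 872-880
Authors: 赵子成; 张开华; 樊佳庆; 刘青山
Adobe PDF (8449Kb) | Views/Downloads: 28/11 | Submitted: 2024/05/09
Unsupervised video object segmentation, motion guidance, local attention, mutual attention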
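A Synonym Mining Algorithm Based on Pair-wise Character Embedding and Noise-Robust Learning (Journal article)
Acta Automatica Sinica, 2023, Volume: 49, Issue: 6, Pages: 1181-1194
Authors: 张浩宇; 王戟
Adobe PDF (1420Kb) | Views/Downloads: 4/1 | Submitted: 2024/05/09
Synonym mining, noisy label learning, natural language processing, pair-wise character embedding, information extraction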
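An Image Captioning Algorithm Based on a Context-Assisted Transformer (Journal article)
Acta Automatica Sinica, 2023, Volume: 49, Issue: 9, Pages: 1889-1903
Authors: 连政; 王瑞; 李海昌; 姚辉; 胡晓惠
Adobe PDF (3362Kb) | Views/Downloads: 32/8 | Submitted: 2024/04/24
Image captioning, attention mechanism, Transformer, visual coherence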
How Good is Google Bard's Visual Understanding? An Empirical Study on Open Challenges (Journal article)
Machine Intelligence Research, 2023, Volume: 20, Issue: 5, Pages: 605-613
Authors: Haotong Qin; Ge-Peng Ji; Salman Khan; Deng-Ping Fan; Fahad Shahbaz Khan; Luc Van Gool
Adobe PDF (10373Kb) | Views/Downloads: 21/4 | Submitted: 2024/04/23
Google Bard, multi-modal understanding, visual comprehension, large language models, conversational AI, chatbot
Cross-modal Contrastive Learning for Generalizable and Efficient Image-text Retrieval (Journal article)
Machine Intelligence Research, 2023, Volume: 20, Issue: 4, Pages: 569-582
Authors: Haoyu Lu; Yuqi Huo; Mingyu Ding; Nanyi Fei; Zhiwu Lu
Adobe PDF (2928Kb) | Views/Downloads: 27/7 | Submitted: 2024/04/23
Image-text retrieval, multimodal modeling, contrastive learning, weak correlation, computer vision
Transformer: A General Framework from Machine Translation to Others (Journal article)
Machine Intelligence Research, 2023, Volume: 20, Issue: 4, Pages: 514-538
Authors: Yang Zhao; Jiajun Zhang; Chengqing Zong
Adobe PDF (1415Kb) | Views/Downloads: 29/8 | Submitted: 2024/04/23
Neural machine translation, Transformer, document neural machine translation (NMT), multimodal NMT, low-resource NMT
Large-scale Multi-modal Pre-trained Models: A Comprehensive Survey (Journal article)
Machine Intelligence Research, 2023, Volume: 20, Issue: 4, Pages: 447-482
Authors: Xiao Wang; Guangyao Chen; Guangwu Qian; Pengcheng Gao; Xiao-Yong Wei; Yaowei Wang; Yonghong Tian; Wen Gao
Adobe PDF (3540Kb) | Views/Downloads: 34/6 | Submitted: 2024/04/23
Multi-modal (MM), pre-trained model (PTM), information fusion, representation learning, deep learning
Masked Vision-language Transformer in Fashion (Journal article)
Machine Intelligence Research, 2023, Volume: 20, Issue: 3, Pages: 421-434
Authors: Ge-Peng Ji; Mingchen Zhuge; Dehong Gao; Deng-Ping Fan; Christos Sakaridis; Luc Van Gool
Adobe PDF (2779Kb) | Views/Downloads: 15/3 | Submitted: 2024/04/23
Vision-language, masked image reconstruction, transformer, fashion, e-commercial
Vision Enhanced Generative Pre-trained Language Model for Multimodal Sentence Summarization (Journal article)
Machine Intelligence Research, 2023, Volume: 20, Issue: 2, Pages: 289-298
Authors: Liqiang Jing; Yiren Li; Junhao Xu; Yongcan Yu; Pei Shen; Xuemeng Song
Adobe PDF (2389Kb) | Views/Downloads: 22/10 | Submitted: 2024/04/23
Multimodal sentence summarization (MMSS), generative pre-trained language model (GPLM), natural language generation, deep learning, artificial intelligence