CASIA OpenIR

浏览/检索结果: 共52条,第1-10条 帮助

限定条件    
已选(0)清除 条数/页:   排序方式:
Modal Contrastive Learning Based End-to-End Text Image Machine Translation 期刊论文
IEEE/ACM Transactions on Audio, Speech, and Language Processing (IEEE/ACM TASLP), 2023, 卷号: 32, 期号: 32, 页码: 2153-2165
作者:  Ma, Cong;  Han, Xu;  Wu, Linghui;  Zhang, Yaping;  Zhao, Yang;  Zhou, Yu;  Zong, Chengqing
Adobe PDF(6551Kb)  |  收藏  |  浏览/下载:20/9  |  提交时间:2024/06/26
Transformers  Machine translation  Decoding  Semantics  Pipelines  Text recognition  Task analysis  Text image machine translation  contrastive learning  text image recognition  machine translation  
DARTScore: DuAl-Reconstruction Transformer for Video Captioning Evaluation 期刊论文
IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, 2024, 卷号: 34, 期号: 4, 页码: 2041-2055
作者:  Chen, Yuxin;  Zhang, Ziqi;  Qi, Zhongang;  Yuan, Chunfeng;  Wang, Jie;  Shan, Ying;  Li, Bing;  Hu, Weiming;  Qie, Xiaohu;  Wu, Jianping
Adobe PDF(13765Kb)  |  收藏  |  浏览/下载:37/1  |  提交时间:2024/05/30
Chinese video captioning evaluation  dual-reconstruction transformer  
从视频到语言:视频标题生成与描述研究综述 期刊论文
自动化学报, 2022, 卷号: 48, 期号: 2, 页码: 375-397
作者:  汤鹏杰;  王瀚漓
Adobe PDF(8546Kb)  |  收藏  |  浏览/下载:45/7  |  提交时间:2024/05/20
视频描述  卷积神经网络  循环神经网络  语段生成  情感表达  逻辑语义  
基于语境辅助转换器的图像标题生成算法 期刊论文
自动化学报, 2023, 卷号: 49, 期号: 9, 页码: 1889-1903
作者:  连政;  王瑞;  李海昌;  姚辉;  胡晓惠
Adobe PDF(3362Kb)  |  收藏  |  浏览/下载:47/11  |  提交时间:2024/04/24
图像标题生成  注意力机制  转换器  视觉连贯性  
Cogeneration of Innovative Audio-visual Content: A New Challenge for Computing Art 期刊论文
Machine Intelligence Research, 2024, 卷号: 21, 期号: 1, 页码: 4-28
作者:  Mengting Liu;  Ying Zhou;  Yuwei Wu;  Feng Gao
Adobe PDF(14438Kb)  |  收藏  |  浏览/下载:48/5  |  提交时间:2024/04/23
Artificial intelligence (AI) art, audio-visual, artificial intelligence generated content (AIGC), multimodal, artistic evaluation  
State of the Art on Deep Learning-enhanced Rendering Methods 期刊论文
Machine Intelligence Research, 2023, 卷号: 20, 期号: 6, 页码: 799-821
作者:  Qi Wang;  Zhihua Zhong;  Yuchi Huo;  Hujun Bao;  Rui Wang
Adobe PDF(6540Kb)  |  收藏  |  浏览/下载:60/22  |  提交时间:2024/04/23
Neural rendering, computer graphics, scene representation, rendering, post-processing  
Large-scale Multi-modal Pre-trained Models: A Comprehensive Survey 期刊论文
Machine Intelligence Research, 2023, 卷号: 20, 期号: 4, 页码: 447-482
作者:  Xiao Wang;  Guangyao Chen;  Guangwu Qian;  Pengcheng Gao;  Xiao-Yong Wei;  Yaowei Wang;  Yonghong Tian;  Wen Gao
Adobe PDF(3540Kb)  |  收藏  |  浏览/下载:52/10  |  提交时间:2024/04/23
Multi-modal (MM), pre-trained model (PTM), information fusion, representation learning, deep learning  
Masked Vision-language Transformer in Fashion 期刊论文
Machine Intelligence Research, 2023, 卷号: 20, 期号: 3, 页码: 421-434
作者:  Ge-Peng Ji;  Mingchen Zhuge;  Dehong Gao;  Deng-Ping Fan;  Christos Sakaridis;  Luc Van Gool
Adobe PDF(2779Kb)  |  收藏  |  浏览/下载:21/6  |  提交时间:2024/04/23
Vision-language, masked image reconstruction, transformer, fashion, e-commercial  
VLP: A Survey on Vision-language Pre-training 期刊论文
Machine Intelligence Research, 2023, 卷号: 20, 期号: 1, 页码: 38-56
作者:  Fei-Long Chen;  Du-Zhen Zhang;  Ming-Lun Han;  Xiu-Yi Chen;  Jing Shi;  Shuang Xu;  Bo Xu
Adobe PDF(1427Kb)  |  收藏  |  浏览/下载:47/14  |  提交时间:2024/04/23
Vision and language  pre-training  transformers  multimodal learning  representation learning  
Federated Learning with Privacy-preserving and Model IP-right-protection 期刊论文
Machine Intelligence Research, 2023, 卷号: 20, 期号: 1, 页码: 19-37
作者:  Qiang Yang;  Anbu Huang;  Lixin Fan;  Chee Seng Chan;  Jian Han Lim;  Kam Woh Ng;  Ding Sheng Ong;  Bowen Li
Adobe PDF(2634Kb)  |  收藏  |  浏览/下载:25/7  |  提交时间:2024/04/23
Federated learning  privacy-preserving machine learning  security  decentralized learning  intellectual property protection