CASIA OpenIR

Browse/Search results: 14 items in total, showing items 1-10

What Does Sora Show: The Beginning of TAO to Imaginative Intelligence and Scenarios Engineering (Journal article)
IEEE/CAA Journal of Automatica Sinica, 2024, Volume: 11, Issue: 4, Pages: 809-815
Authors:  Fei-Yue Wang;  Qinghai Miao;  Lingxi Li;  Qinghua Ni;  Xuan Li;  Juanjuan Li;  Lili Fan;  Yonglin Tian;  Qing-Long Han
Adobe PDF(571Kb)  |  Views/Downloads: 16/4  |  Submitted: 2024/03/18
Visual Semantic Segmentation Based on Few/Zero-Shot Learning: An Overview (Journal article)
IEEE/CAA Journal of Automatica Sinica, 2024, Volume: 11, Issue: 5, Pages: 1106-1126
Authors:  Wenqi Ren;  Yang Tang;  Qiyu Sun;  Chaoqiang Zhao;  Qing-Long Han
Adobe PDF(12695Kb)  |  Views/Downloads: 11/2  |  Submitted: 2024/04/10
Computer vision  deep learning  few-shot learning  low-shot learning  semantic segmentation  zero-shot learning  
Cogeneration of Innovative Audio-visual Content: A New Challenge for Computing Art (Journal article)
Machine Intelligence Research, 2024, Volume: 21, Issue: 1, Pages: 4-28
Authors:  Mengting Liu;  Ying Zhou;  Yuwei Wu;  Feng Gao
Adobe PDF(14438Kb)  |  Views/Downloads: 2/1  |  Submitted: 2024/04/23
Artificial intelligence (AI) art, audio-visual, artificial intelligence generated content (AIGC), multimodal, artistic evaluation  
State of the Art on Deep Learning-enhanced Rendering Methods (Journal article)
Machine Intelligence Research, 2023, Volume: 20, Issue: 6, Pages: 799-821
Authors:  Qi Wang;  Zhihua Zhong;  Yuchi Huo;  Hujun Bao;  Rui Wang
Adobe PDF(6540Kb)  |  Views/Downloads: 7/2  |  Submitted: 2024/04/23
Neural rendering, computer graphics, scene representation, rendering, post-processing  
VLP: A Survey on Vision-language Pre-training (Journal article)
Machine Intelligence Research, 2023, Volume: 20, Issue: 1, Pages: 38-56
Authors:  Fei-Long Chen;  Du-Zhen Zhang;  Ming-Lun Han;  Xiu-Yi Chen;  Jing Shi;  Shuang Xu;  Bo Xu
Adobe PDF(1427Kb)  |  Views/Downloads: 2/1  |  Submitted: 2024/04/23
Vision and language  pre-training  transformers  multimodal learning  representation learning  
Federated Learning with Privacy-preserving and Model IP-right-protection (Journal article)
Machine Intelligence Research, 2023, Volume: 20, Issue: 1, Pages: 19-37
Authors:  Qiang Yang;  Anbu Huang;  Lixin Fan;  Chee Seng Chan;  Jian Han Lim;  Kam Woh Ng;  Ding Sheng Ong;  Bowen Li
Adobe PDF(2634Kb)  |  Views/Downloads: 6/2  |  Submitted: 2024/04/23
Federated learning  privacy-preserving machine learning  security  decentralized learning  intellectual property protection  
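基于语境辅助转换器的图像标题生成算法 (Image Captioning Algorithm Based on a Context-Assisted Transformer) (Journal article)
自动化学报 (Acta Automatica Sinica), 2023, Volume: 49, Issue: 9, Pages: 1889-1903
Authors:  连政;  王瑞;  李海昌;  姚辉;  胡晓惠
Adobe PDF(3362Kb)  |  Views/Downloads: 3/1  |  Submitted: 2024/04/24
Image captioning  attention mechanism  transformer  visual coherence  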
Large-scale Multi-modal Pre-trained Models: A Comprehensive Survey (Journal article)
Machine Intelligence Research, 2023, Volume: 20, Issue: 4, Pages: 447-482
Authors:  Xiao Wang;  Guangyao Chen;  Guangwu Qian;  Pengcheng Gao;  Xiao-Yong Wei;  Yaowei Wang;  Yonghong Tian;  Wen Gao
Adobe PDF(3540Kb)  |  Views/Downloads: 5/0  |  Submitted: 2024/04/23
Multi-modal (MM), pre-trained model (PTM), information fusion, representation learning, deep learning  
Masked Vision-language Transformer in Fashion (Journal article)
Machine Intelligence Research, 2023, Volume: 20, Issue: 3, Pages: 421-434
Authors:  Ge-Peng Ji;  Mingchen Zhuge;  Dehong Gao;  Deng-Ping Fan;  Christos Sakaridis;  Luc Van Gool
Adobe PDF(2779Kb)  |  Views/Downloads: 4/2  |  Submitted: 2024/04/23
Vision-language, masked image reconstruction, transformer, fashion, e-commercial  
Visuals to Text: A Comprehensive Review on Automatic Image Captioning (Journal article)
IEEE/CAA Journal of Automatica Sinica, 2022, Volume: 9, Issue: 8, Pages: 1339-1365
Authors:  Yue Ming;  Nannan Hu;  Chunxiao Fan;  Fan Feng;  Jiangwan Zhou;  Hui Yu
Adobe PDF(56128Kb)  |  Views/Downloads: 150/21  |  Submitted: 2022/08/01
Artificial intelligence  attention mechanism  encoder-decoder framework  image captioning  multi-modal understanding  training strategies