CASIA OpenIR

Browse/Search results: 22 items in total, showing items 1-10

Modal Contrastive Learning Based End-to-End Text Image Machine Translation (Journal article)
IEEE/ACM Transactions on Audio, Speech, and Language Processing (IEEE/ACM TASLP), 2023, Volume: 32, Issue: 32, Pages: 2153-2165
Authors: Ma, Cong; Han, Xu; Wu, Linghui; Zhang, Yaping; Zhao, Yang; Zhou, Yu; Zong, Chengqing
Adobe PDF(6551Kb) | Views/Downloads: 32/17 | Submitted: 2024/06/26
Transformers  Machine translation  Decoding  Semantics  Pipelines  Text recognition  Task analysis  Text image machine translation  contrastive learning  text image recognition  machine translation  
Efficient multimodal transformer with dual-level feature restoration for robust multimodal sentiment analysis (Journal article)
IEEE Transactions on Affective Computing, 2023, Volume: 15, Issue: 1, Pages: 1-17
Authors: Licai Sun; Zheng Lian; Bin Liu; Jianhua Tao
Adobe PDF(2371Kb) | Views/Downloads: 66/18 | Submitted: 2024/05/31
Transformers  Robustness  Semantics  Data models  Computational modeling  Videos  Training  Multimodal sentiment analysis  unaligned and incomplete data  efficient multimodal Transformer  dual-level feature restoration  robustness  
Hierarchical Attention Network for Open-Set Fine-Grained Recognition (Journal article)
IEEE Transactions on Circuits and Systems for Video Technology, 2023, Pages: 1-14
Authors: Jiayin, Sun; Hong, Wang; Qiulei, Dong
Adobe PDF(2596Kb) | Views/Downloads: 59/18 | Submitted: 2024/05/28
Towards Prior Gap and Representation Gap for Long-tailed Recognition, Pattern Recognition (Journal article)
Pattern Recognition, 2023, Volume: 133, Issue: 109012, Pages: 109012
Authors: Zhang Ming-Liang; Zhang Xu-Yao; Wang Chang; Liu Cheng-Lin
Adobe PDF(2258Kb) | Views/Downloads: 126/31 | Submitted: 2024/04/03
Long-tailed learning  Prior gap  Representation gap  Image recognition  
AnANet: Association and Alignment Network for Modeling Implicit Relevance in Cross-Modal Correlation Classification (Journal article)
IEEE TRANSACTIONS ON MULTIMEDIA, 2023, Volume: 25, Pages: 7867-7880
Authors: Xu, Nan; Wang, Junyan; Tian, Yuan; Zhang, Ruike; Mao, Wenji
Views/Downloads: 53/0 | Submitted: 2024/03/26
Association and alignment network  classification scheme  cross-modal correlation  implicit relevance  
Multi-Correlation Siamese Transformer Network With Dense Connection for 3D Single Object Tracking (Journal article)
IEEE ROBOTICS AND AUTOMATION LETTERS, 2023, Volume: 8, Issue: 12, Pages: 8066-8073
Authors: Feng, Shihao; Liang, Pengpeng; Gao, Jin; Cheng, Erkang
Adobe PDF(2745Kb) | Views/Downloads: 130/9 | Submitted: 2023/12/21
3D object tracking  Point cloud  Transformer  
SignParser: An End-to-End Framework for Traffic Sign Understanding (Journal article)
INTERNATIONAL JOURNAL OF COMPUTER VISION, 2023, Volume: 132, Issue: 2, Pages: 805-821
Authors: Guo, Yunfei; Feng, Wei; Yin, Fei; Liu, Cheng-Lin
Adobe PDF(7011Kb) | Views/Downloads: 133/7 | Submitted: 2023/12/21
Traffic sign understanding  Content reasoning  Semantic description generation  
Reducing Vision-Answer Biases for Multiple-Choice VQA (Journal article)
IEEE TRANSACTIONS ON IMAGE PROCESSING, 2023, Volume: 32, Pages: 4621-4634
Authors: Zhang, Xi; Zhang, Feifei; Xu, Changsheng
Adobe PDF(2684Kb) | Views/Downloads: 94/7 | Submitted: 2023/11/17
Multiple-choice VQA  vision-answer bias  causal intervention  counterfactual interaction learning  
Covariance Estimation for Pose Graph Optimization in Visual-Inertial Navigation Systems (Journal article)
IEEE TRANSACTIONS ON INTELLIGENT VEHICLES, 2023, Volume: 8, Issue: 6, Pages: 3657-3667
Authors: Shi, Pengcheng; Zhu, Zhikai; Sun, Shiying; Rong, Zheng; Zhao, Xiaoguang; Tan, Min
Adobe PDF(2522Kb) | Views/Downloads: 136/24 | Submitted: 2023/11/17
covariance estimation  loop closing  pose graph optimization  visual-inertial odometry  
GCNet: Graph Completion Network for Incomplete Multimodal Learning in Conversation (Journal article)
IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2023, Volume: 45, Issue: 7, Pages: 8419-8432
Authors: Lian, Zheng; Chen, Lan; Sun, Licai; Liu, Bin; Tao, Jianhua
Adobe PDF(3959Kb) | Views/Downloads: 184/9 | Submitted: 2023/11/17
Oral communication  Correlation  Data models  Task analysis  Feature extraction  Tensors  Benchmark testing  Conversational data  graph complete network (GCNet)  incomplete multimodal learning  speaker-sensitive modeling  temporal-sensitive modeling