Multimodal channel-wise attention transformer inspired by multisensory integration mechanisms of the brain
Shi, Qianqian [1,3]; Fan, Junsong [2]; Wang, Zuoren [1,3]; Zhang, Zhaoxiang [2,3]
Source Publication: PATTERN RECOGNITION
ISSN: 0031-3203
2022-10-01
Volume: 130; Pages: 14
Corresponding Author: Wang, Zuoren (zuorenwang@ion.ac.cn); Zhang, Zhaoxiang (zhaoxiang.zhang@ia.ac.cn)
Abstract: Multisensory integration has been studied intensively for decades. How to combine visual and auditory information to optimize perception and decision-making is a key question in both neuroscience and machine learning. Inspired by the mechanisms of multisensory integration in the brain, we propose a multimodal channel-wise attention transformer (MCAT) that performs reliability-weighted integration and revises the weight allocation according to a top-down attention-like mechanism. We apply MCAT to EFLSTM neural networks for a fine-grained video bird recognition task, and to MulT neural networks for an emotion recognition task. The performance of both models improves remarkably. An ablation study shows that the attention mechanism is indispensable for effective multisensory integration. Moreover, we find that cross-modal integration models accord with the law of inverse effectiveness of multisensory integration in the brain, which suggests that our model may share mechanisms with the brain. Taken together, the results demonstrate that the brain-inspired MCAT block is effective for improving multisensory integration, providing useful clues for designing new algorithms and understanding multisensory integration in the brain. © 2022 Elsevier Ltd. All rights reserved.
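Note on terminology: in the classical cue-combination literature, reliability-weighted integration usually means weighting each modality's estimate by its inverse variance. The sketch below illustrates only that textbook rule, assuming independent Gaussian noise per modality. It is not the MCAT block, whose architecture is not described in this record, and the function name and example numbers are hypothetical.

```python
def reliability_weighted_estimate(estimates, variances):
    """Fuse per-modality estimates using inverse-variance (reliability) weights.

    Hypothetical helper for illustration; not taken from the paper.
    """
    if len(estimates) != len(variances) or not estimates:
        raise ValueError("need one positive variance per estimate")
    if any(v <= 0 for v in variances):
        raise ValueError("variances must be positive")

    # Reliability of a cue is the inverse of its variance.
    reliabilities = [1.0 / v for v in variances]
    total = sum(reliabilities)

    # Normalised weights sum to 1; the more reliable cue gets the larger weight.
    weights = [r / total for r in reliabilities]
    fused = sum(w * e for w, e in zip(weights, estimates))

    # Under the independence assumption, the fused variance is 1 / total.
    fused_variance = 1.0 / total
    return fused, weights, fused_variance


# Example with made-up numbers: a precise visual cue and a noisier auditory cue.
fused, weights, fused_variance = reliability_weighted_estimate(
    estimates=[10.0, 20.0], variances=[1.0, 4.0]
)
print(fused, weights, fused_variance)
```

In this toy example the fused estimate lies closer to the lower-variance cue, and the fused variance is smaller than either input variance, which is the usual motivation for weighting by reliability.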
Keywords: Multisensory integration; Top-down attention; Multimodal transformer; Fine-grained bird recognition; Emotion recognition
DOI: 10.1016/j.patcog.2022.108837
WOS Keywords: NEURONAL OSCILLATIONS; MODEL
Indexed By: SCI
Language: English
WOS Research Area: Computer Science; Engineering
WOS Subject: Computer Science, Artificial Intelligence; Engineering, Electrical & Electronic
WOS ID: WOS:000833526200004
Publisher: ELSEVIER SCI LTD
Document Type: Journal article
Identifier: http://ir.ia.ac.cn/handle/173211/49825
Collection: Intelligent Perception and Computing (智能感知与计算)
Affiliation:
1. Chinese Acad Sci, Ctr Excellence Brain Sci & Intelligence Technol, Inst Neurosci, State Key Lab Neurosci, Shanghai 200031, Peoples R China
2. Chinese Acad Sci, Ctr Res Intelligent Percept & Comp CRIPAC, Inst Automat, Natl Lab Pattern Recognit NLPR, Beijing 100190, Peoples R China
3. Univ Chinese Acad Sci, Sch Future Technol, Beijing 100049, Peoples R China
Corresponding Author Affiliation: Chinese Acad Sci, Inst Automat, Natl Lab Pattern Recognit, Beijing 100190, Peoples R China
Recommended Citation
GB/T 7714: Shi, Qianqian, Fan, Junsong, Wang, Zuoren, et al. Multimodal channel-wise attention transformer inspired by multisensory integration mechanisms of the brain[J]. PATTERN RECOGNITION, 2022, 130: 14.
APA: Shi, Qianqian, Fan, Junsong, Wang, Zuoren, & Zhang, Zhaoxiang. (2022). Multimodal channel-wise attention transformer inspired by multisensory integration mechanisms of the brain. PATTERN RECOGNITION, 130, 14.
MLA: Shi, Qianqian, et al. "Multimodal channel-wise attention transformer inspired by multisensory integration mechanisms of the brain". PATTERN RECOGNITION 130 (2022): 14.