Visual information is one of the most important sources through which humans perceive and understand the external world. In recent years, the field of visual neural information decoding has attracted extensive attention. Previous studies have typically adopted a system-identification approach, decoding the brain's visual information by modeling the relationship between brain activity and the visual stimuli the brain receives, and thereby inferring the brain's functional mechanisms. Thanks to advances in neural signal acquisition technologies, such as functional magnetic resonance imaging (fMRI) and electroencephalography (EEG), and the rapid development of artificial intelligence, decoding visual neural information by non-invasive means has become feasible, and some progress has been made. However, existing methods struggle to decode image information from brain signals because of the heterogeneity between the brain-signal and stimulus-image modalities, missing brain-signal modalities, the limited kinds of information decoded, large inter-subject differences, and the small sample sizes of neural data. Studying more effective visual neural information decoding methods therefore not only advances our understanding of visual processing mechanisms but also offers a new perspective for the development of brain-inspired machine perception. To this end, this dissertation investigates visual information decoding methods based on multi-modal deep learning. While improving the accuracy of visual information decoding, the proposed methods effectively address the strong heterogeneity, limited decoded information, missing modalities, and small sample sizes that characterize the field. The main contributions and innovations of this dissertation are as follows:
(1) To address the strong heterogeneity between the brain-signal and image modalities, the strong dependence on paired data, and the insufficient use of available information, a semi-supervised generative adversarial method for the visual reconstruction task is proposed. The method unifies the semantic decoding and image reconstruction tasks of brain activity and uses semantic information as a bridge between the brain-signal and image modalities, thereby overcoming their strong heterogeneity. In addition, the method makes full use of unpaired image data to mine deep semantic information that assists the cross-modal image generation task, greatly reducing the dependence on paired data. The results show that the proposed semi-supervised generative adversarial method can generate images with high accuracy and clear semantic content.
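The dissertation's actual network architecture and loss weights are not given in the abstract; the following is a minimal numpy sketch of the "semantic bridge" idea only, with linear maps standing in for the brain encoder, image generator, and image encoder, and with all dimensions and the loss weighting chosen purely for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical dimensions: fMRI voxels -> semantic embedding -> image pixels.
N_VOXELS, N_SEM, N_PIX = 200, 16, 64

# Linear stand-ins for the learnable modules.
W_brain = rng.normal(0, 0.1, (N_VOXELS, N_SEM))   # brain signal -> semantics
W_gen   = rng.normal(0, 0.1, (N_SEM, N_PIX))      # semantics -> image
W_img   = rng.normal(0, 0.1, (N_PIX, N_SEM))      # image -> semantics

def semantic_from_brain(x):   return x @ W_brain
def generate_image(s):        return np.tanh(s @ W_gen)
def semantic_from_image(img): return img @ W_img

# Paired branch: brain -> semantics -> reconstructed image, compared
# against the true stimulus image (semantics bridge the two modalities).
brain = rng.normal(size=(8, N_VOXELS))
image = rng.normal(size=(8, N_PIX))
recon = generate_image(semantic_from_brain(brain))
paired_loss = np.mean((recon - image) ** 2)

# Unpaired branch: unlabeled images supply extra semantic supervision for
# the generator, reducing dependence on scarce paired brain-image data.
unpaired = rng.normal(size=(32, N_PIX))
sem = semantic_from_image(unpaired)
cycle_loss = np.mean((generate_image(sem) - unpaired) ** 2)

total_loss = paired_loss + 0.5 * cycle_loss   # weighting is illustrative
```

In a full implementation these linear maps would be deep networks trained jointly with an adversarial discriminator on the generated images; the sketch only shows how the semantic space lets paired and unpaired data contribute to the same generator.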
(2) To address the problems that the decoded image information is limited to a single attribute and that the stimuli are far from natural scenes, a multi-label semantic prediction method based on semi-supervised co-training in natural scenes is proposed. The method combines a co-training network with symmetric semantic feature translators to decode brain signals evoked by natural scenes, and uses the image modality to assist the brain-signal modality in multi-label learning. At the same time, the method overcomes the shortage of labeled samples and the missing brain-signal modality, improving the decoding accuracy of brain signals. Results on multiple independent datasets show that the proposed semi-supervised co-training multi-label semantic prediction method greatly improves the accuracy of decoding multi-label semantic information of images from brain signals.
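The co-training network and translators in the dissertation are not specified in the abstract; the sketch below illustrates one co-training round and a feature translator in the simplest possible form, using least-squares linear models as stand-ins for both views' classifiers and a single binary label instead of a multi-label target. The confidence threshold and all dimensions are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)

# Two "views" of the same stimuli: image features and brain-signal features.
N_LAB, N_UNLAB, D = 20, 50, 10
x_img   = rng.normal(size=(N_LAB, D))
x_brain = rng.normal(size=(N_LAB, D))
y = (x_img[:, 0] + x_brain[:, 0] > 0).astype(float)   # toy binary label
u_img   = rng.normal(size=(N_UNLAB, D))
u_brain = rng.normal(size=(N_UNLAB, D))

def fit_linear(x, y):
    # Least-squares stand-in for each view's classifier.
    w, *_ = np.linalg.lstsq(x, y, rcond=None)
    return w

# One co-training round: the image-view model pseudo-labels the unlabeled
# pool, and confident pseudo-labels augment the brain-view training set.
w_img = fit_linear(x_img, y)
p_img = u_img @ w_img
confident = np.abs(p_img - 0.5) > 0.3                 # illustrative threshold
pseudo_y = (p_img[confident] > 0.5).astype(float)

w_brain = fit_linear(np.vstack([x_brain, u_brain[confident]]),
                     np.concatenate([y, pseudo_y]))

# Symmetric translator (one direction shown): maps image features to
# brain features, so missing brain-signal samples can be imputed.
W_t, *_ = np.linalg.lstsq(x_img, x_brain, rcond=None)
imputed_brain = u_img @ W_t
```

The symmetric counterpart (a brain-to-image translator) would be fit the same way in the other direction; together they let either modality substitute for the other when one is missing.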
(3) To address the small sample sizes of existing fMRI datasets, this dissertation proposes a multi-subject fMRI data augmentation method based on multi-modal adversarial learning. Built on subspace learning and multi-modal adversarial learning, the method mitigates inter-subject differences, so that combining a small amount of data from the target subject with data from other subjects greatly improves single-subject decoding accuracy. This provides a new perspective on the brain decoding problem. Analyses on several independent datasets show that the proposed multi-subject data augmentation method with multi-modal adversarial learning improves the semantic decoding accuracy of single subjects.
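The abstract does not detail how the shared subspace is learned; as a rough illustration of the subspace side of the idea only, the sketch below aligns two subjects with differing voxel counts into a common space via least squares and then pools their aligned responses as augmented training data. The shared responses, dimensions, and alignment method are assumptions; in the dissertation the mapping is additionally trained adversarially so that aligned samples from different subjects become indistinguishable.

```python
import numpy as np

rng = np.random.default_rng(2)

# Hypothetical setup: two subjects view the same stimuli; voxel counts differ.
N_STIM, V_A, V_B, D_SHARED = 40, 120, 150, 20
fmri_a = rng.normal(size=(N_STIM, V_A))
fmri_b = rng.normal(size=(N_STIM, V_B))

# Stand-in for learned stimulus representations in the shared subspace.
shared = rng.normal(size=(N_STIM, D_SHARED))

# Subject-specific least-squares maps into the shared space.
M_a, *_ = np.linalg.lstsq(fmri_a, shared, rcond=None)
M_b, *_ = np.linalg.lstsq(fmri_b, shared, rcond=None)

# After alignment, other subjects' data augments the target subject's
# training set, easing the small-sample problem for single-subject decoding.
aligned_a = fmri_a @ M_a
aligned_b = fmri_b @ M_b
pooled = np.vstack([aligned_a, aligned_b])   # augmented set in shared space
```

A downstream semantic decoder would then be trained on `pooled` rather than on one subject's data alone, which is where the reported single-subject accuracy gain comes from.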