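Research on Neural Information Encoding and Decoding Based on Multi-View Deep Learning (基于多视图深度学习的神经信息编解码研究)
Du Changde (杜长德)
2019-05-30
Pages: 118
Degree type: Doctoral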
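Chinese abstract (English translation)

Understanding how the human brain works is one of the greatest challenges of the 21st century. The brain can be studied from many angles, but a central question is how patterns of brain activity relate to brain functions such as perception, cognition, emotion, and memory. Research that uses neuroimaging data to study how the brain works and what it does is called neural encoding and decoding. Neuroimaging data can be acquired with techniques such as functional magnetic resonance imaging (fMRI) and electroencephalography (EEG). Earlier researchers analyzed such data with multivariate statistics and machine learning and made some progress in neural encoding and decoding. However, because brain function is highly complex and neuroimaging data have small sample sizes, high dimensionality, diverse modalities, and low signal-to-noise ratios, traditional neural encoding and decoding models still perform poorly. This dissertation aims to improve neural encoding and decoding with advanced machine learning techniques, and thereby to advance brain-inspired intelligence and brain-computer interfaces. Given the characteristics of the data in this field, we build several neural decoding models on the basis of multi-view learning and deep learning to overcome these problems. The main research contents and contributions are summarized as follows:

(1) A Bayesian multi-view deep learning model for reconstructing visual images from brain responses. In this model, the visual images and the corresponding fMRI activity patterns are treated as different views that share the same latent variable. The image reconstruction problem then becomes a problem of Bayesian inference of a missing view in a multi-view latent variable model. Experiments on multiple fMRI datasets show that the method can accurately reconstruct visual content such as binary images and handwritten characters from brain responses.

(2) A two-stage neural decoding model based on structured multi-output regression and a conditional generative adversarial network. To further improve the quality of reconstructed images, we first map the fMRI activity patterns to the hierarchical intermediate features of a deep neural network, and then generate the corresponding visual image from the predicted features. The structured multi-output regression model jointly exploits the correlations among fMRI voxels, among deep neural network features, and among the regression tasks. Experiments on multiple fMRI datasets show that the method reconstructs face and natural image stimuli more clearly.

(3) A semi-supervised incomplete multi-view classification model for decoding emotional states of the brain. Emotion decoding studies often face scarce labeled data and missing modalities. Our semi-supervised incomplete multi-view learning model automatically learns a high-level joint representation from several different input modalities, which bears some resemblance to the integrated human perception system. Experimental results show a clear improvement over previous methods in decoding emotional states.

(4) A doubly semi-supervised multi-view adversarial learning model for multiple neural decoding tasks. By building a new multi-view adversarial learning model, we unify semantic decoding of brain activity patterns and image reconstruction in a single framework. The proposed algorithm trains semantic prediction and image reconstruction jointly, so that the decoding results are semantically clear and visually sharp. The new algorithm also allows fast retrieval, in the shared latent space, of the visual stimulus that corresponds to a given brain activity pattern.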

English abstract

Understanding how the human brain works is one of the biggest challenges of the 21st century. Although the human brain can be studied in different ways, a key question is how brain activity patterns relate to brain functions such as perception, cognition, emotion, and memory. The study of how the brain works and functions using neuroimaging data is called neural encoding and decoding. Neuroimaging data can be collected with functional magnetic resonance imaging (fMRI), electroencephalography (EEG), and other techniques. Previous researchers have made some progress in this field by analyzing neuroimaging data with multivariate statistics and machine learning methods. However, because of the high complexity of the human brain and the small sample size, high dimensionality, diverse modalities, and low signal-to-noise ratio of neuroimaging data, traditional methods remain less effective. This dissertation aims to improve the performance of neural encoding and decoding by using advanced machine learning techniques, so as to promote the development of brain-inspired intelligence and brain-computer interfaces (BCI). Considering the data characteristics in this field, we build a variety of decoding models under the multi-view learning and deep learning framework to overcome the above problems. The main research contents and contributions of this dissertation are summarized as follows:

(1) We propose a Bayesian multi-view deep learning model for reconstructing perceived images from brain activity. In this model, we treat the visual images and their corresponding fMRI activity patterns as different views that share the same latent variables. The visual image reconstruction problem then becomes a problem of Bayesian inference of a missing view in the multi-view latent variable model. Experiments on multiple fMRI datasets show that this method can accurately reconstruct binary images, handwritten characters, and other visual content from brain responses.
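
The idea in (1) of treating reconstruction as inference of a missing view can be illustrated with a much simpler stand-in. The sketch below is not the dissertation's deep model: it assumes a linear-Gaussian model in which both views are linear functions of a shared latent variable z with a standard normal prior, and it assumes the loading matrices (here the hypothetical names W_x and W_y) and the noise variance are already known. Given only the fMRI view, it computes the posterior of z in closed form and uses the posterior mean to predict the image view.

```python
import numpy as np

def posterior_latent(y, W_y, noise_var):
    """Posterior of z given y for y = W_y z + eps, z ~ N(0, I),
    eps ~ N(0, noise_var * I). Returns (mean, covariance)."""
    k = W_y.shape[1]
    precision = np.eye(k) + W_y.T @ W_y / noise_var
    cov = np.linalg.inv(precision)
    mean = cov @ W_y.T @ y / noise_var
    return mean, cov

def reconstruct_missing_view(y, W_x, W_y, noise_var):
    """Predict the unobserved image view from the observed fMRI view
    by passing the posterior mean of z through the image loadings."""
    z_mean, _ = posterior_latent(y, W_y, noise_var)
    return W_x @ z_mean

# Synthetic data only; all sizes and parameters are illustrative assumptions.
rng = np.random.default_rng(0)
k, d_img, d_fmri = 5, 100, 300
W_x = rng.normal(size=(d_img, k))    # latent -> image view
W_y = rng.normal(size=(d_fmri, k))   # latent -> fMRI view
noise_var = 0.1

z_true = rng.normal(size=k)
x_true = W_x @ z_true
y_obs = W_y @ z_true + rng.normal(scale=np.sqrt(noise_var), size=d_fmri)

x_hat = reconstruct_missing_view(y_obs, W_x, W_y, noise_var)
corr = np.corrcoef(x_true, x_hat)[0, 1]
print(f"correlation between true and reconstructed view: {corr:.3f}")
```

In a deep variant the linear maps would be replaced by neural networks and the posterior would generally need approximate inference; the closed form here holds only under the linear-Gaussian assumption.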

(2) We propose a two-stage neural decoding model based on structured multi-output regression and a conditional generative adversarial network. First, a structured multi-output regression model maps the fMRI activity patterns to the hierarchical intermediate features of a pretrained deep neural network (DNN). Then a conditional generative adversarial network generates the corresponding visual images conditioned on the predicted DNN features. In designing the regression model, we simultaneously exploit the correlations among fMRI voxels, DNN features, and regression tasks. Experiments on multiple fMRI datasets show that this method reconstructs face and natural images more clearly than previous methods.
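
The first stage in (2), mapping voxels to DNN features, is a multi-output regression problem. As a rough sketch, the code below fits a plain multi-output ridge regression in closed form on synthetic data. This is a swapped-in baseline, not the structured model described above: it ignores the correlations among voxels, features, and tasks that the dissertation exploits, and the second stage (the conditional GAN) is omitted. The function name, array sizes, and regularization strength are assumptions for illustration.

```python
import numpy as np

def fit_multi_output_ridge(X, Y, alpha=1.0):
    """Closed-form ridge solution B = (X^T X + alpha I)^{-1} X^T Y.
    X: (n_samples, n_voxels), Y: (n_samples, n_features)."""
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + alpha * np.eye(d), X.T @ Y)

# Synthetic stand-ins for fMRI patterns and DNN features.
rng = np.random.default_rng(0)
n_train, n_test, n_voxels, n_features = 200, 20, 50, 8
B_true = rng.normal(size=(n_voxels, n_features))

X_train = rng.normal(size=(n_train, n_voxels))
Y_train = X_train @ B_true + 0.1 * rng.normal(size=(n_train, n_features))
X_test = rng.normal(size=(n_test, n_voxels))
Y_test = X_test @ B_true

B_hat = fit_multi_output_ridge(X_train, Y_train, alpha=1.0)
Y_pred = X_test @ B_hat   # predicted features, which stage two would condition on

rel_err = np.linalg.norm(Y_pred - Y_test) / np.linalg.norm(Y_test)
print(f"relative prediction error: {rel_err:.4f}")
```

With real fMRI data the number of voxels usually exceeds the number of samples, which is one reason a structured or more strongly regularized model is attractive there.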

(3) We propose a semi-supervised incomplete multi-view classification model for decoding emotional states of the brain. Research on emotion decoding often faces scarce labeled data and missing modalities. Our semi-supervised incomplete multi-view learning framework automatically learns a high-level joint representation from several different input modalities, which is similar to the integrated human perception system. Experimental results show that the proposed method clearly outperforms previous methods in decoding emotional states of the brain.

(4) We propose a doubly semi-supervised multi-view adversarial learning model that unifies the semantic decoding and image reconstruction tasks in one framework. The proposed multi-view adversarial learning algorithm jointly trains the semantic prediction of brain activity patterns and the reconstruction of visual images, so that the decoding results have clear semantics and high visual quality. The new algorithm also allows us to quickly retrieve, in the shared latent space, the visual stimuli that correspond to given brain activity patterns.
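
The retrieval step mentioned in (4) can be sketched as a nearest-neighbor search once brain activity and candidate images are embedded in the same latent space. The example below assumes the embeddings are already available (here they are random synthetic vectors) and ranks candidates by cosine similarity; the similarity measure and the function name `retrieve` are illustrative assumptions, not details taken from the dissertation.

```python
import numpy as np

def retrieve(query_latent, candidate_latents, top_k=1):
    """Rank candidates by cosine similarity to the query embedding.
    Returns (indices, scores) of the top_k most similar candidates."""
    q = query_latent / np.linalg.norm(query_latent)
    C = candidate_latents / np.linalg.norm(candidate_latents, axis=1, keepdims=True)
    scores = C @ q
    order = np.argsort(-scores)[:top_k]
    return order, scores[order]

# Synthetic embeddings: 10 candidate images in a 16-dimensional latent space.
rng = np.random.default_rng(0)
image_latents = rng.normal(size=(10, 16))

# Pretend the brain-side encoder produced a noisy copy of image 3's embedding.
brain_latent = image_latents[3] + 0.05 * rng.normal(size=16)

top_idx, top_scores = retrieve(brain_latent, image_latents, top_k=3)
print("best match:", top_idx[0])
```

Because the search needs only one matrix-vector product over precomputed image embeddings, it is fast even for large candidate sets.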

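Keywords: deep learning; neural encoding and decoding; generative models; multi-view learning; Bayesian methods
Language: Chinese
Seven major directions, sub-direction classification: Medical image processing and analysis
Document type: Dissertation
Item identifier: http://ir.ia.ac.cn/handle/173211/23869
Collection: Laboratory of Brain Atlas and Brain-Inspired Intelligence (脑图谱与类脑智能实验室)_Neural Computation and Brain-Computer Interaction (神经计算与脑机交互)
Corresponding author: Du Changde (杜长德)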
Recommended citation
GB/T 7714
杜长德. 基于多视图深度学习的神经信息编解码研究[D]. 中国科学院自动化研究所. 中国科学院自动化研究所,2019.
Files in this item
File name/size: 基于多视图深度学习的神经信息编解码研究. (12885KB)
Document type: Dissertation
Version type: not given
Access type: Open access
License: CC BY-NC-SA

Unless otherwise stated, all content in this system is protected by copyright, with all rights reserved.