Visual Neural Information Encoding Based on fMRI and Deep Learning (基于fMRI和深度学习的视觉神经信息编码研究)
王海保
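Subtype: Master's
Thesis Advisor: 何晖光
2020-05-29
Degree Grantor: University of Chinese Academy of Sciences
Place of Conferral: Remote defense
Degree Name: Master of Engineering
Degree Discipline: Pattern Recognition and Intelligent Systems
Keyword: deep learning; fMRI; neural information encoding and decoding; receptive field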
Abstract

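For centuries, philosophers and scientists have tried to speculate about, observe, understand, and decipher how the brain works to let people perceive and explore the complex natural world. As the most important channel through which humans perceive the world, the visual system and its processing mechanisms have received the widest attention from researchers. Many studies use neural encoding and decoding methods, that is, building a quantitative relationship between external stimuli and neural activity, to reveal how the brain processes visual information. Because brain function is highly complex, building accurate and effective neural encoding and decoding models remains a major challenge. Studying more effective neural encoding and decoding models matters both for exploring the processing mechanisms of the human visual system and for developing brain-inspired machine perception models. Based on functional Magnetic Resonance Imaging (fMRI) and deep learning, this thesis builds two neural encoding models that effectively improve neural encoding performance and reveal the encoding properties of visual functional areas. The main contents and contributions of this thesis are as follows:

(1) This thesis proposes a deep neural encoding model based on weak-prior receptive fields. Previous fMRI-based neural encoding models rely either on strong prior assumptions about the spatial properties of receptive fields or on parameter estimation methods that require manual settings, which greatly limits their encoding capacity. To address these two problems, this thesis proposes a new “what” and “where” neural encoding framework that splits learning in the deep neural network into a feature dimension (“what”) and a spatial dimension (“where”). In the spatial dimension, sparse and smooth receptive fields are used for encoding. In the feature dimension, the encoded neural network feature maps are regressed onto voxel responses simultaneously. The learning of the two dimensions is unified in an end-to-end deep framework, and experiments on a public fMRI dataset show that the method achieves better encoding performance than several existing methods.

(2) This thesis proposes a deep neural encoding model based on multi-stage feature refinement. To further improve encoding performance, this thesis designs a feature refinement module comprising a spatial receptive field and channel attention. The model maps natural images to voxel responses in three stages; this multi-stage encoding can mine the correlations between voxels in different visual cortical areas and different feature units of the neural network, thereby revealing the feature selectivity of the visual cortex. In addition, we attempted to extend the model to a multi-voxel joint encoding model, which offers a possible way to improve encoding for voxels with a low signal-to-noise ratio. Experiments on a public fMRI dataset show that the method provides an effective strategy for neural encoding of the visual cortex and achieves better encoding performance.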
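The “what” and “where” factorization described in contribution (1) can be illustrated with a small NumPy sketch. This is not the thesis implementation: the function names, the L1 form of the sparsity penalty, the squared-difference form of the smoothness penalty, and the weights `lam_sparse` and `lam_smooth` are all assumptions chosen for illustration. The sketch only shows how one voxel's response can be predicted by pooling feature maps through a spatial receptive field and then regressing the pooled channels.

```python
import numpy as np

def predict_voxel(features, rf, w, b=0.0):
    """Predict one voxel response (hypothetical helper).

    features: (C, H, W) feature maps of one image
    rf:       (H, W) spatial receptive field ("where")
    w:        (C,) channel weights ("what")
    """
    # Pool every channel through the same spatial map -> shape (C,)
    pooled = np.tensordot(features, rf, axes=([1, 2], [0, 1]))
    return float(pooled @ w + b)

def sparsity_penalty(rf):
    # Assumed L1 penalty: favors receptive fields with few active locations
    return float(np.abs(rf).sum())

def smoothness_penalty(rf):
    # Assumed penalty on squared differences between neighboring locations
    dy = np.diff(rf, axis=0)
    dx = np.diff(rf, axis=1)
    return float((dy ** 2).sum() + (dx ** 2).sum())

def encoding_loss(features_batch, responses, rf, w, b=0.0,
                  lam_sparse=1e-3, lam_smooth=1e-3):
    # Mean squared error plus the two receptive-field regularizers
    preds = np.array([predict_voxel(f, rf, w, b) for f in features_batch])
    mse = float(np.mean((preds - np.asarray(responses)) ** 2))
    return (mse
            + lam_sparse * sparsity_penalty(rf)
            + lam_smooth * smoothness_penalty(rf))
```

In an end-to-end setting, `rf`, `w`, and `b` would be learned jointly by gradient descent on such a loss; the sketch leaves the optimizer out.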

Other Abstract

For centuries, philosophers and scientists have tried to speculate about, observe, understand, and decipher how the brain works to enable people to perceive and explore the complex natural world. Because vision is the most important channel through which human beings perceive and understand the world, the processing mechanism of the visual system has long been an important open question in neuroscience and artificial intelligence. Many studies have used neural encoding and decoding methods, that is, constructing the quantitative relationship between external stimuli and neural activity, to reveal the mechanism of visual information processing in the brain. Owing to the high complexity of brain function, constructing accurate neural encoding models for perceived natural images remains challenging. Exploring more effective neural encoding and decoding methods is of great significance: it supports a more systematic study of the processing mechanism of the human visual system and is conducive to the development of brain-inspired machine intelligence. Based on fMRI and deep learning, this dissertation proposes two neural encoding models that improve the performance of neural encoding and reveal the encoding characteristics of visual areas. The main research contents and contributions of this dissertation are summarized as follows:

(1) Neural encoding with weak-prior receptive fields and deep neural networks. Previous neural encoding models rely on either inflexible prior assumptions about the receptive field or clumsy parameter estimation methods, which severely limits their expressiveness. To solve these two problems, this dissertation proposes a “what” and “where” neural encoding framework, which divides the deep neural network into a feature dimension (“what”) and a spatial dimension (“where”). In the spatial dimension, sparsity and smoothness regularizations guide the receptive field estimation. In the feature dimension, all encoded feature maps of the neural network are regressed onto voxel activities simultaneously. The method learns “what” and “where” simultaneously in an end-to-end deep neural network, and experiments on a publicly available fMRI dataset demonstrate that it outperforms other neural encoding models.

(2) Neural encoding with multi-stage feature refinement and deep neural networks. To further improve the encoding performance, this dissertation designs a feature refinement module consisting of a spatial receptive field and channel attention. The model maps the natural image to the voxel response in three stages, which makes it possible to mine the relationship between voxels of different visual areas and feature units of the neural network, revealing the feature selectivity of the visual cortex. In addition, we attempted to extend the voxel-wise modeling approach to multi-voxel joint encoding models, which provides a possible way to rescue voxels with poor signal-to-noise characteristics. Extensive results on a publicly available fMRI dataset demonstrate that the method provides an effective strategy for establishing neural encoding of the human visual cortex, with better encoding performance.
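As a rough illustration of the feature refinement module in contribution (2), the sketch below combines a spatial receptive field with a channel gate. The abstract does not specify how the channel attention is computed, so a squeeze-and-excitation style gate is swapped in here as an assumption; `refine`, `w1`, and `w2` are hypothetical names, and a three-stage model would apply such a module once per stage.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def refine(features, rf, w1, w2):
    """One refinement step (assumed form, not the thesis code).

    features: (C, H, W) feature maps
    rf:       (H, W) spatial receptive field
    w1:       (R, C) bottleneck weights, R <= C
    w2:       (C, R) expansion weights
    Returns the refined maps and the per-channel gate.
    """
    # Spatial step: weight every location by the receptive field
    spatial = features * rf[None, :, :]
    # Channel step: summarize each channel, then compute a gate in (0, 1)
    squeezed = spatial.sum(axis=(1, 2))        # shape (C,)
    hidden = np.maximum(w1 @ squeezed, 0.0)    # ReLU bottleneck, shape (R,)
    gate = sigmoid(w2 @ hidden)                # shape (C,)
    return spatial * gate[:, None, None], gate
```

Inspecting `gate` for a given voxel model is one simple way to see which feature channels the model relies on, which is the kind of feature selectivity the abstract refers to.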

Pages: 64
Language: Chinese
Document Type: Thesis
Identifier: http://ir.ia.ac.cn/handle/173211/39156
Collection: Research Center for Brain-inspired Intelligence_Neural Computation and Brain-Computer Interaction
Recommended Citation
GB/T 7714
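王海保. Visual Neural Information Encoding Based on fMRI and Deep Learning (基于fMRI和深度学习的视觉神经信息编码研究)[D]. Remote defense. University of Chinese Academy of Sciences, 2020.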
Files in This Item:
File Name/Size: 基于fMRI和深度学习的视觉神经信息编码 (5905KB)
DocType: Thesis
Access: Open access
License: CC BY-NC-SA
Items in the repository are protected by copyright, with all rights reserved, unless otherwise indicated.