Knowledge Commons of Institute of Automation, CAS
Research on Neural Encoding and Decoding Methods Based on Distributed Text Representations (基于分布式文本表示的神经编解码方法研究)
孙静远 (Sun Jingyuan)
2021-05-28
Pages | 122
Degree type | Doctoral
Abstract (Chinese, translated) | The process by which the human brain converts language signals received by the sense organs into neuronal activity is the establishment of the neural representation of language; this is the most fundamental and important step in human language understanding. This thesis studies the neural encoding and decoding of the brain's language representations, which is key to analyzing the brain's language-cognition functions, realizing language brain-computer interfaces, and inspiring brain-inspired language computing models, and therefore has significant theoretical and practical value. The main work and contributions of the thesis are as follows: 1. A continual text representation learning model based on knowledge distillation and generative replay. Mainstream distributed representation models tend to forget knowledge acquired from an old corpus when learning a new corpus whose data distribution differs substantially from the old one. As a result, when a representation model learns different tasks sequentially, its performance on old tasks drops sharply once it fits a new task. 2. A systematic study of distributed text representations in neural encoding and decoding. In traditional neural encoding and decoding research, hand-crafted features remain the mainstream way to represent text. Such features are costly to construct and cannot fully cover the vast compositional semantic space of natural language. Distributed text representations, learned automatically from corpora, alleviate these limitations. However, their application in neural encoding and decoding is still quite limited, and it is unclear which representations best predict and analyze the brain's language representations. To address this, the thesis builds neural encoding and decoding models on thirteen distributed text representations; several supervised representation models, including the continual text representation model proposed in this thesis, are applied to neural encoding and decoding for the first time. Through experiments, we systematically compare the encoding and decoding performance of these models across brain regions and brain functional networks, and identify representation models that achieve accurate neural encoding and decoding at different granularities. Analysis of the encoding and decoding results further shows that the neural representations of topic concepts are distributed across multiple regions of the cerebral cortex. 3. A neural encoding interpretation method based on probing tasks and ablation tests. Prior studies and our experiments confirm that neural encoders based on distributed text representations can accurately predict the neural activity evoked by language stimuli, but we cannot explain which of the linguistic features captured by the text representation contribute most to encoding accuracy. 4. An interpretable neural decoding model based on sparse representations and a gated network. Prior studies and our experiments confirm that neural decoders based on distributed text representations can, to some extent, analyze the neural activity evoked by language stimuli, but it remains unclear what semantic information the decoder extracts from brain activity. To address this, the thesis proposes an interpretable neural decoding model that sparsifies distributed text representations so that the value of each dimension of the processed vector is interpretable, and that uses our proposed gated network as its basic architecture, making it possible to observe which network units matter most for analyzing brain activity. Experiments show that the method clearly outperforms other methods based on sparse text representations in decoding performance and can recover the semantic information shared by distributed representations and neural activity, deepening our understanding of the brain's language representation mechanisms. |
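The first contribution listed above mitigates forgetting by combining knowledge distillation with generative replay: a generator replays pseudo-samples from the old data distribution, and the old model's outputs on those samples serve as distillation targets during training on the new corpus. The toy sketch below illustrates only the general idea; the linear "representation model", the Gaussian pseudo-generator, and all dimensions are my simplifying assumptions, not the architecture used in the thesis:

```python
import numpy as np

rng = np.random.default_rng(1)

# Old task: fit a linear "representation model" on corpus A
X_old = rng.standard_normal((300, 8))
W_old_true = rng.standard_normal((8, 4))
Y_old = X_old @ W_old_true
W = np.linalg.lstsq(X_old, Y_old, rcond=None)[0]   # model after task A

# Generative replay stand-in: sample pseudo-inputs from a Gaussian fit to corpus A
mu, sigma = X_old.mean(axis=0), X_old.std(axis=0)
X_replay = rng.standard_normal((300, 8)) * sigma + mu
Y_replay = X_replay @ W        # distillation targets: the OLD model's own outputs

# New task: corpus B with a shifted input distribution and different targets
X_new = rng.standard_normal((300, 8)) + 2.0
Y_new = X_new @ rng.standard_normal((8, 4))

# Continual fit: new-task loss plus distillation loss on replayed samples
W_cl = np.linalg.lstsq(np.vstack([X_new, X_replay]),
                       np.vstack([Y_new, Y_replay]), rcond=None)[0]

# Naive fine-tuning on corpus B only, for comparison
W_ft = np.linalg.lstsq(X_new, Y_new, rcond=None)[0]

err_cl = np.mean((X_old @ W_cl - Y_old) ** 2)   # forgetting with replay + distillation
err_ft = np.mean((X_old @ W_ft - Y_old) ** 2)   # forgetting without
print(err_cl, err_ft)
```

Because the replayed samples are labeled with the old model's own outputs, fitting them acts as a distillation term that pulls the new solution back toward the old one, so the old-task error stays lower than under naive fine-tuning.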
Abstract (English) | The process by which the human brain converts the language signals received by the sense organs into neuronal activity is the establishment of the neural representation of language, the most basic and important step in human language understanding. This thesis uses neural encoding and decoding methods to study the neural representation of language signals. This is key to analyzing the brain's language-cognition functions, realizing language brain-computer interfaces, and inspiring brain-inspired language computing models, and therefore has significant theoretical and practical value.
The main work and innovations of the thesis can be summarized as follows: 1. We propose a continual text representation learning model based on knowledge distillation and generative replay. Mainstream distributed representation models tend to forget the knowledge acquired from an old corpus when learning a new corpus whose data distribution differs substantially from the old one. As a result, when a representation model learns different tasks sequentially, its performance on old tasks drops significantly once it fits a new task. 2. We systematically study the application of a variety of distributed text representations in neural encoding and decoding. In traditional neural encoding and decoding research, hand-crafted features are still the mainstream way to represent text. They are costly to construct and cannot fully cover the huge compositional semantic space of natural language. Distributed text representations, learned automatically from corpora, alleviate these limitations. However, their application in neural encoding and decoding research is still quite limited, and it is hard to determine which representation is best suited to predicting and analyzing the language representations of the human brain. To address this problem, this thesis builds neural encoding and decoding models on thirteen distributed text representations and systematically compares their encoding and decoding performance across brain regions and brain functional networks. Several representation models, including the continual text representation model proposed in this thesis, are applied to neural encoding and decoding for the first time. Through these experiments, we identify representation models with which accurate neural encoding and decoding can be achieved at different granularities, and we find that the neural representations of topic concepts are distributed across multiple regions of the cerebral cortex. 3. We propose a neural encoding interpretation method based on probing tasks and ablation tests. Existing studies and the experiments in this thesis confirm that neural encoders based on distributed text representations can accurately predict the neural activity evoked by language stimuli. However, they cannot explain which of the linguistic features captured by the representation contribute most to encoding accuracy. 4. We propose an interpretable neural decoding model based on sparse representations and a gated network. Existing studies and the experiments in this thesis confirm that neural decoders based on distributed text representations can, to some extent, analyze the neural activity evoked by language stimuli. However, it is unclear what semantic information the decoder infers from brain activity. To address this problem, this thesis proposes an interpretable neural decoder that sparsifies distributed text representations so that the value of each dimension of the processed vector is interpretable, and that uses our proposed gated network as its basic architecture, making it possible to observe which network units are most important for analyzing brain activity. Experiments show that our method significantly outperforms other methods based on sparse text representations in decoding performance and can recover the semantic information shared by distributed representations and neural activity, helping us further understand the content and structure of the brain's language representations. |
Keywords | distributed text representation, neural language representation, neural encoding, neural decoding
Language | Chinese
Sub-direction classification | Natural Language Processing
Document type | Dissertation
Identifier | http://ir.ia.ac.cn/handle/173211/45006
Collection | 多模态人工智能系统全国重点实验室_自然语言处理
Recommended citation (GB/T 7714) | 孙静远. 基于分布式文本表示的神经编解码方法研究[D]. 中国科学院自动化研究所, 2021.
Files in this item
File name/size | Document type | Version type | Access type | License
孙静远 博士学位论文终版.pdf (10254 KB) | Dissertation | Open Access | CC BY-NC-SA