Research and Applications of Fine-Grained Emotional Neural Encoding and Decoding (细粒度情绪神经编解码研究及应用)
付铠城
2024-05-13
Pages: 150
Subtype: Doctoral
Abstract

情绪神经编解码能够建模情绪特征和人脑神经响应之间的关系,是研究人类情绪的重要方式,对于理解大脑的情感处理机制具有重要作用。随着心理学的发展,在基本情绪类别的基础上,研究人员发现了人类可以感知到更加细粒度的情绪种类。先前的研究在细粒度情绪的大脑表征和解码方面仍然存在不足。为此,本文开展了细粒度情绪神经编解码及其应用于脑指导情绪识别模型构建的研究,主要研究内容及创新点如下:

(1)针对细粒度情绪大脑表征及其可解释性的问题,本文利用视觉诱发情绪的fMRI数据,基于神经编码模型开发了细粒度情绪大脑表征的分析框架。通过对编码模型权重进行主成分分析,获得了本征情感空间,并将该情感空间的维度和数据集中标注的情感维度进行了相关性分析。研究发现,人类感知到的细粒度情绪通过本征情感空间在大脑皮层上形成分布式的表征模式,该情感空间的每个维度与唤起度等已知的情感维度存在映射关系。研究进一步通过皮层可视化方法为人脑构建了细粒度情绪地图,发现细粒度的情绪类别在大脑皮层上呈现出平滑的梯度组织形式,且上述本征情感空间和情绪梯度均具有较强的跨被试一致性。该研究为细粒度情绪大脑表征的研究提供了新思路。

(2)针对情绪解码模型仅考虑了粗粒度且单标签的情绪类别问题,本文提出了一种多视图多标签混合模型。为了在解码中结合人类情绪感知时大脑左右半球响应差异的先验知识,模型的生成部分将左脑信号、右脑信号和左右脑信号的差异看作多个视图,利用基于专家相乘机制的多视图变分自编码器提取多视图神经表征。为了建模情绪标签间的相关性用于多标签学习,模型的判别部分利用标签感知模块和掩码自注意力机制提取表达能力更强的情绪特定神经表征。在两个视觉诱发情绪的fMRI数据集上的实验验证了本方法在细粒度多标签情绪解码问题上的优越性。

(3)针对情绪解码模型难以适应细粒度情绪类别逐渐增多的动态场景问题,本文开发了一种可增广情绪语义学习算法。首先,针对过去部分标签缺失造成的灾难性遗忘,本方法设计了带有标签消歧的可增广情绪关系图模块,用于整合重要的历史情绪标签相关性信息;接着,针对未来部分标签缺失造成的灾难性遗忘,本方法利用情感维度空间中的领域知识,与情绪类别模型形成互补,通过基于样本关系的知识蒸馏算法对齐情感维度空间和模型特征空间。此外,为了获得多标签学习重要的语义特定特征,本方法利用由图自编码器构成的情绪语义学习模块,获取情绪语义标签嵌入,用于指导后续语义特定的特征解耦。在fMRI数据集和多媒体数据集以及多种增量协议上的实验说明了本方法在多标签类增量情绪解码问题上的有效性。

(4)针对深度学习模型与人类的情感理解能力存在差距的问题,本文提出利用脑信号指导模型学习,将人类的情绪感知能力迁移到现实场景的情绪识别模型中,提升机器的情感智能。为此,本文利用视觉诱发情绪的fMRI数据为卷积神经网络开发了基于表征相似性分析的联合训练框架,对齐人脑和神经网络的表征。此外,本方法通过上述神经编码模型进行了情绪相关体素选择和数据降噪,用于后续的联合训练。在两个用户生成视频数据集和多个卷积神经网络结构上的实验证明了人类脑活动可以为CNNs提供额外的归纳偏置,增强模型的类脑特性和情绪识别性能。

Other Abstract

Emotional neural encoding and decoding models the relationship between emotion features and human neural responses. It is an important approach to studying human emotion and plays a crucial role in understanding the brain's emotional processing mechanisms. With the development of psychology, researchers have found that humans can perceive emotion categories that are more fine-grained than the basic emotion categories. Previous studies, however, remain limited in both the brain representation and the decoding of fine-grained emotions. This thesis therefore investigates fine-grained emotional neural encoding and decoding and its application to building brain-guided emotion recognition models. The main research contents and contributions are as follows:

(1) To address the brain representation of fine-grained emotions and its interpretability, this thesis develops an analysis framework based on a neural encoding model, using fMRI data recorded under visual emotional stimuli. Principal component analysis of the encoding model weights yields a fundamental affective space, whose dimensions are then correlated with the affective dimensions annotated in the dataset. The study finds that the fine-grained emotions perceived by humans form distributed representation patterns across the cerebral cortex through this fundamental affective space, and that each dimension of the space maps onto known affective dimensions (e.g., arousal). Using cortical visualization methods, the study further constructs a fine-grained emotion map of the human brain, revealing that fine-grained emotion categories are organized as smooth gradients on the cerebral cortex. Both the fundamental affective space and the emotion gradients show strong inter-subject consistency. This study provides new insights into the brain representation of fine-grained emotions.
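The encoding-then-PCA step described in (1) can be sketched as follows. This is only an illustrative sketch on synthetic data, not the thesis implementation: the closed-form ridge regression, the function names `fit_ridge_encoding` and `pca_on_weights`, the array sizes, and the hypothetical per-category arousal ratings are all assumptions made for the example.

```python
import numpy as np

def fit_ridge_encoding(X, Y, alpha=1.0):
    """Closed-form ridge regression from emotion features to voxel responses.

    X: (n_stimuli, n_emotions) emotion scores per stimulus.
    Y: (n_stimuli, n_voxels) fMRI responses per stimulus.
    Returns W with shape (n_emotions, n_voxels).
    """
    n_features = X.shape[1]
    return np.linalg.solve(X.T @ X + alpha * np.eye(n_features), X.T @ Y)

def pca_on_weights(W, n_components=3):
    """PCA over voxels: each voxel is a sample, each emotion category a feature."""
    Wv = W.T - W.T.mean(axis=0, keepdims=True)      # (n_voxels, n_emotions), centered
    _, S, Vt = np.linalg.svd(Wv, full_matrices=False)
    components = Vt[:n_components]                  # axes of the affective space
    voxel_scores = Wv @ components.T                # projection of every voxel
    explained = (S ** 2) / np.sum(S ** 2)           # variance ratio per component
    return components, voxel_scores, explained[:n_components]

# Synthetic stand-in data (assumed sizes, no real fMRI involved).
rng = np.random.default_rng(0)
n_stimuli, n_emotions, n_voxels = 200, 10, 500
X = rng.random((n_stimuli, n_emotions))
W_true = rng.normal(size=(n_emotions, n_voxels))
Y = X @ W_true + 0.1 * rng.normal(size=(n_stimuli, n_voxels))

W = fit_ridge_encoding(X, Y, alpha=1.0)
components, voxel_scores, explained = pca_on_weights(W, n_components=3)

# Hypothetical per-category arousal ratings, correlated with the first axis.
category_arousal = rng.random(n_emotions)
r_arousal = np.corrcoef(components[0], category_arousal)[0, 1]
```

In this sketch, `voxel_scores` is the quantity one would paint onto a cortical surface to visualize gradients, and `r_arousal` stands in for the correlation analysis between a learned axis and an annotated affective dimension.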

(2) To address the limitation that existing emotion decoding models consider only coarse-grained, single-label emotion categories, a multi-view multi-label hybrid model is proposed. To incorporate prior knowledge about the differing responses of the left and right hemispheres during emotion perception, the generative component of the model treats left-hemisphere activity, right-hemisphere activity, and the difference between them as multiple views, and uses a multi-view variational autoencoder based on a product-of-experts mechanism to extract multi-view neural representations. To model the correlations between emotion labels for multi-label learning, the discriminative component uses a label-aware module and a masked self-attention mechanism to extract more expressive emotion-specific neural representations. Experiments on two fMRI datasets recorded under visual emotional stimuli validate the superiority of this method in fine-grained multi-label emotion decoding.
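As a rough illustration of the product-of-experts fusion mentioned in (2), the sketch below combines per-view Gaussian posteriors by precision weighting, which is the standard closed form for a product of diagonal Gaussians. The function name `product_of_experts`, the optional standard-normal prior expert, and the toy numbers are assumptions for the example; the abstract does not specify these details.

```python
import numpy as np

def product_of_experts(mus, logvars, include_prior=True):
    """Fuse diagonal-Gaussian experts into one Gaussian.

    mus, logvars: arrays of shape (n_views, latent_dim), one row per view
    (for example left hemisphere, right hemisphere, and their difference).
    The product of Gaussians has precision equal to the sum of the expert
    precisions and a precision-weighted mean.
    """
    mus = np.asarray(mus, dtype=float)
    logvars = np.asarray(logvars, dtype=float)
    if include_prior:
        # Assumed extra expert: a standard normal prior N(0, I).
        mus = np.concatenate([np.zeros_like(mus[:1]), mus], axis=0)
        logvars = np.concatenate([np.zeros_like(logvars[:1]), logvars], axis=0)
    precisions = np.exp(-logvars)                   # 1 / variance per expert
    joint_var = 1.0 / precisions.sum(axis=0)
    joint_mu = joint_var * (mus * precisions).sum(axis=0)
    return joint_mu, np.log(joint_var)

# Toy example: two unit-variance experts with means 0 and 2, latent_dim = 2.
toy_mus = np.array([[0.0, 0.0], [2.0, 2.0]])
toy_logvars = np.zeros((2, 2))
mu_no_prior, logvar_no_prior = product_of_experts(toy_mus, toy_logvars, include_prior=False)
mu_prior, logvar_prior = product_of_experts(toy_mus, toy_logvars, include_prior=True)
```

Without the prior expert the fused mean is the midpoint of the two experts and the variance halves; adding the prior pulls the mean toward zero and shrinks the variance further.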

(3) To address the difficulty of adapting emotion decoding models to dynamic scenarios in which the number of fine-grained emotion categories gradually increases, an augmented emotional semantics learning algorithm is developed. First, to address the catastrophic forgetting caused by the past-missing partial label problem, an augmented emotional relation graph module with label disambiguation is designed to integrate important historical correlation information among emotion labels. Second, to tackle the catastrophic forgetting caused by the future-missing partial label problem, the method uses domain knowledge in the affective dimension space to complement the emotion category model, aligning the affective dimension space with the model feature space through knowledge distillation based on sample relations. In addition, to obtain the semantic-specific features that are important for multi-label learning, an emotional semantics learning module built from a graph autoencoder produces emotional semantic label embeddings, which guide the subsequent semantic-specific feature decoupling. Experiments on fMRI and multimedia datasets under various incremental protocols demonstrate the effectiveness of this method in multi-label class-incremental emotion decoding.
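One common way to realize a sample-relation-based distillation objective like the one in (3) is to match the pairwise similarity structure of a batch in two spaces. The sketch below does this with cosine similarity and a mean squared error; both choices, along with the names `pairwise_cosine` and `relation_distillation_loss` and the synthetic batch, are assumptions for illustration and may differ from the thesis.

```python
import numpy as np

def pairwise_cosine(Z):
    """Cosine similarity between every pair of rows of Z (samples x dims)."""
    Zn = Z / (np.linalg.norm(Z, axis=1, keepdims=True) + 1e-12)
    return Zn @ Zn.T

def relation_distillation_loss(features, affect_dims):
    """Mean squared gap between the two sample-by-sample similarity matrices.

    features:    (batch, feature_dim) model features for a batch of samples.
    affect_dims: (batch, n_dims) affective ratings (e.g. valence, arousal)
                 for the same samples, acting as the teacher space.
    """
    gap = pairwise_cosine(features) - pairwise_cosine(affect_dims)
    return float(np.mean(gap ** 2))

# Synthetic batch (assumed sizes).
rng = np.random.default_rng(0)
features = rng.normal(size=(16, 64))
affect_dims = rng.normal(size=(16, 3))

loss = relation_distillation_loss(features, affect_dims)
loss_identical = relation_distillation_loss(affect_dims, affect_dims)
```

Because only relations between samples are compared, the two spaces can have different dimensionality, which is what lets a low-dimensional affective space supervise a high-dimensional feature space.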

(4) To bridge the gap in emotional understanding between deep learning models and humans, this thesis proposes using brain signals to guide model learning, transferring human emotion perception capabilities to emotion recognition models for real-world scenarios and thereby enhancing the emotional intelligence of machines. To this end, a joint training framework based on representational similarity analysis is developed for convolutional neural networks (CNNs), using fMRI data recorded under visual emotional stimuli to align the representations of the human brain and the neural network. In addition, the neural encoding model described above is used for emotion-related voxel selection and data denoising before joint training. Experiments on two user-generated video datasets and multiple CNN architectures demonstrate that human brain activity can provide an additional inductive bias to CNNs, enhancing both their brain-like properties and their emotion recognition performance.
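The representational similarity term behind the joint training in (4) can be sketched in a few lines. This is a minimal sketch under stated assumptions: correlation-distance dissimilarity matrices, a Pearson comparison of their upper triangles, and a weighted sum with the task loss via a hypothetical weight `lam`. The function names are illustrative, and NumPy is used only to show the quantity being computed; actual training would need an automatic differentiation framework.

```python
import numpy as np

def rdm(responses):
    """Representational dissimilarity matrix: 1 - Pearson r between stimuli.

    responses: (n_stimuli, n_units) activation or voxel pattern per stimulus.
    """
    return 1.0 - np.corrcoef(responses)

def rsa_similarity(model_feats, brain_resp):
    """Pearson correlation between the upper triangles of the two RDMs."""
    iu = np.triu_indices(model_feats.shape[0], k=1)
    return float(np.corrcoef(rdm(model_feats)[iu], rdm(brain_resp)[iu])[0, 1])

def joint_loss(task_loss, model_feats, brain_resp, lam=0.5):
    """Task loss plus a penalty that shrinks as model and brain RDMs align."""
    return task_loss + lam * (1.0 - rsa_similarity(model_feats, brain_resp))

# Synthetic stand-ins: 20 stimuli, 128 CNN features, 300 selected voxels.
rng = np.random.default_rng(0)
model_feats = rng.normal(size=(20, 128))
brain_resp = rng.normal(size=(20, 300))

similarity = rsa_similarity(model_feats, brain_resp)
total = joint_loss(task_loss=0.3, model_feats=model_feats, brain_resp=brain_resp)
```

The RDMs are compared rather than the raw activations, so the CNN layer and the voxel set do not need matching dimensionality; only the set of stimuli must be shared.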

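Keywords: fine-grained emotional neural encoding and decoding; brain-inspired emotion recognition; multi-view learning; multi-label learning; class-incremental learning; representational similarity analysis
Language: Chinese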
Sub direction classification: Artificial Intelligence + Science
Planning direction of the national key laboratory: AI for Science
Document Type: Thesis
Identifier: http://ir.ia.ac.cn/handle/173211/57176
Collection: Graduates_Doctoral Dissertations
Recommended Citation
GB/T 7714
付铠城. 细粒度情绪神经编解码研究及应用[D],2024.
Files in This Item:
File Name/Size | DocType | Version | Access | License
博士学位论文_明版.pdf (20567KB) | Thesis | | Restricted access | CC BY-NC-SA

Items in the repository are protected by copyright, with all rights reserved, unless otherwise indicated.