Research on Emotion and Stress State Assessment Methods for Scenarios with Missing Samples (面向样本缺失场景的情绪与压力状态评估方法研究)
武金婷
2022-08
Pages: 130
Subtype: Doctoral
Abstract

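As society develops, mental health problems have attracted growing public attention. Emotional and stress states are important factors closely tied to mental health. Negative emotions and excessive stress impair cognition and decision-making; for example, they can degrade the working state of people in specific occupations such as doctors and drivers, creating accident risks. Remaining in a negative state for a long time can even seriously harm physical health or lead to mental illness. Automatic methods and systems for assessing emotional and stress states therefore have significant research and application value. In recent years, with advances in deep learning theory and technology, many studies have introduced deep networks into emotion or stress assessment; these methods require large amounts of data collected in laboratory environments for model training. Because the data collection environment constrains the application scenarios, methods that assess emotional and stress states from body gesture data and peripheral physiological signals, which can be collected simply and under near-natural conditions, have received wide attention. In addition, different states are hard to induce during emotion- and stress-related data collection, so sufficient training samples are hard to obtain. This can give rise to several kinds of missing-sample scenarios, such as missing categories, insufficient data for a specific group, and insufficient data for the individual under test. This thesis analyzes body gesture data and peripheral physiological signal data and, to address the missing-sample problems of these two kinds of data, proposes three algorithms for assessing emotion or stress, aiming to improve accuracy and practicality on limited data resources. Building on these algorithms, it further exploits the complementarity of the two kinds of data to construct a dual-modal early-warning system for abnormal mental states that performs comprehensive assessment. The main work and contributions of this thesis are as follows: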

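(1) Models trained in existing studies on small-scale datasets with few categories are difficult to apply to more diverse emotion-related body gestures. To address this, a body-gesture emotion recognition method based on Generalized Zero-Shot Learning (GZSL) is proposed, which uses manually designed semantic representations to predict categories that were never seen in training. To make full use of body gesture information, the method treats each emotion category as a set of multiple body gestures and builds a network structure that satisfies the dual constraints of gesture labels and emotion labels. Specifically, this thesis proposes a generalized zero-shot learning network with two branches, a Hierarchical Prototype Network (HPN) and a Semantic Auto-Encoder (SAE), which predict samples of trained and untrained categories respectively. The hierarchical prototype network branch uses the prior relationship between emotions and gestures to learn prototype centers at the gesture level and then at the emotion level, which strengthens the separability of gesture categories and the intra-class similarity of emotion categories. The semantic auto-encoder branch learns a mapping from the feature space to the semantic space and predicts samples of untrained categories by means of semantic representations that contain both emotion and gesture information. Experimental results on the public MASR dataset show that the proposed method outperforms existing GZSL algorithms and baseline methods that use only a single type of label.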

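(2) A data distribution gap exists between a target group that has only a few labeled samples and the general population, for which large amounts of training data are available. To address this, an Adversarial Transfer Learning with Domain Mixup algorithm is proposed for stress detection from physiological signals. By learning features that are independent of group differences, the algorithm transfers stress detection knowledge from the general population to the target group. Specifically, on top of an adversarial transfer network composed of a feature extractor, a domain discriminator, and a stress detector, domain-mixup samples are constructed at the feature level to improve the generalization of the domain discriminator. The feature extractor and the domain discriminator have opposing optimization objectives, which allows domain-invariant features to be learned; the three modules are trained jointly so that the features produced by adversarial learning do not harm the performance of the stress detection task. On this basis, to handle the imbalanced label distribution in stress detection, a loss correction method based on class prior probability estimation is proposed, which helps improve recognition of the high-stress categories that have fewer training samples. Because no public dataset for a specific target group is available yet, this thesis builds a physiological signal dataset covering the general population and police academy students for experimental validation. The results show that the proposed algorithm outperforms non-transfer-learning baselines and representative transfer learning algorithms.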

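(3) Training existing personalized models usually requires a large number of samples from the subject under test. To address this, an individual emotion and stress detection method based on a Siamese network is proposed. The method builds a relative intensity regression model that captures the relative difference between the two samples of a pair, so individual calibration needs only a single labeled baseline sample. On this basis, to make better use of the segment samples obtained by data segmentation, a new intensity ranking sub-task is constructed to assist the regression task, with ranking rules and a method for building ranking sample pairs designed around the characteristics of physiological signal data. The intensity ranking sub-task performs pairwise ranking using the relative-intensity supervision available for segment pairs, which strengthens the ability of the extracted features to represent the relative intensity of emotion or stress. The two sub-tasks share the Siamese network for deep feature extraction and are trained in cyclic alternation. Experimental results on a self-collected stress dataset and the public DEAP emotion dataset show that the method outperforms single-sample-calibration baselines and performs comparably to personalized models from existing studies that are trained on multiple samples from a single subject.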

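Building on the stress and emotion detection algorithms above, a dual-modal early-warning system for abnormal mental states is constructed for experimental validation. Based on the complementarity of body gestures and physiological signals, the system provides comprehensive assessment and early warning of abnormal states such as negative emotions and excessive stress. To validate the system, video material is used to induce different psychological responses in subjects, and a dual-modal dataset containing body gestures and physiological signals is built. Experimental results on this dataset show that the proposed early-warning system runs effectively and performs markedly better than algorithms that use single-modal data.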

Other Abstract

With the development of society, mental health issues have received widespread attention. Emotional and stress states are important factors closely related to mental health. Negative emotions and excessive stress impair cognition and decision-making. For example, they can affect the working state of specific professional groups such as doctors and drivers, which creates accident risks. Persistent negative states may even seriously damage physical health or lead to mental illness. Therefore, automatic methods and systems for monitoring emotional or stress states have important research significance and application value. Recently, with the development of deep learning theory and technology, many studies have applied deep networks to emotion or stress detection. These methods require a large amount of data collected in a laboratory environment for model training. Because the data collection environment limits the application scenarios, and because body gestures and peripheral physiological signals can be collected simply and under near-natural conditions, methods that assess emotional and stress states from these two kinds of data have received extensive attention. In addition, different states are difficult to induce during data collection, so sufficient training samples are difficult to collect. This may lead to various missing-sample scenarios, such as missing categories, insufficient data for specific groups, and insufficient data for test individuals. This thesis focuses on body gestures and peripheral physiological signals. To address the missing-sample problems of these data, three emotion or stress detection algorithms are proposed to improve accuracy and practicality on limited data resources. Building on these algorithms and on the complementary relationship between the two kinds of data, a dual-modal warning system for abnormal mental states is further constructed for comprehensive evaluation. The main contributions of this thesis are as follows:

(1) Models trained on small-scale datasets with few categories in existing research are difficult to apply to more diverse emotion-related body gestures. To address this, a Generalized Zero-Shot Learning (GZSL) method is proposed for body-gesture-based emotion recognition, which predicts unseen categories with the help of manually designed semantic representations. To make full use of body gesture information, an emotion category is regarded as a collection of multiple body gestures, and a framework that satisfies the dual constraints of gesture labels and emotion labels is proposed accordingly. Specifically, a generalized zero-shot learning network with two branches is proposed: a Hierarchical Prototype Network (HPN) and a Semantic Auto-Encoder (SAE), which predict samples of the seen and unseen classes respectively. The hierarchical prototype network learns two levels of prototypes, first for body gestures and then for emotions, using prior knowledge of the relationship between emotions and body gestures, so as to enhance the separability of gesture categories and the intra-class similarity of emotion categories. The semantic auto-encoder learns the mapping from the feature space to the semantic space and predicts samples from unseen categories using the designed semantic representations, which contain both emotion and gesture information. Experimental results on the public MASR dataset demonstrate that the proposed method is superior to existing GZSL algorithms and to baseline methods that use only one type of label.
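The two-level prototype idea in (1) can be illustrated with a small sketch. This is not the thesis's HPN, which is a trained network; it is a plain nearest-centroid analogue that assumes fixed feature vectors. The function names, the toy features, and the gesture and emotion names are all hypothetical.

```python
import math
from collections import defaultdict


def mean_vector(vectors):
    """Element-wise mean of a list of equal-length vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def build_prototypes(features, gesture_labels, gesture_to_emotion):
    """Build gesture-level prototypes, then emotion-level prototypes.

    An emotion prototype is the mean of the prototypes of the gestures
    that belong to that emotion (the prior emotion-gesture relationship).
    """
    by_gesture = defaultdict(list)
    for x, gesture in zip(features, gesture_labels):
        by_gesture[gesture].append(x)
    gesture_protos = {g: mean_vector(xs) for g, xs in by_gesture.items()}

    by_emotion = defaultdict(list)
    for gesture, proto in gesture_protos.items():
        by_emotion[gesture_to_emotion[gesture]].append(proto)
    emotion_protos = {e: mean_vector(ps) for e, ps in by_emotion.items()}
    return gesture_protos, emotion_protos


def nearest(prototypes, x):
    """Return the label whose prototype is closest to x (Euclidean)."""
    return min(prototypes, key=lambda label: math.dist(prototypes[label], x))


# Toy data (hypothetical): 2-D features, three gestures, two emotions.
features = [[0.0, 0.0], [0.2, 0.0], [1.0, 1.0], [1.2, 1.0], [5.0, 5.0], [5.0, 5.2]]
gesture_labels = ["clap", "clap", "jump", "jump", "crouch", "crouch"]
gesture_to_emotion = {"clap": "joy", "jump": "joy", "crouch": "fear"}

gesture_protos, emotion_protos = build_prototypes(
    features, gesture_labels, gesture_to_emotion
)
predicted_gesture = nearest(gesture_protos, [0.9, 0.9])
predicted_emotion = nearest(emotion_protos, [0.9, 0.9])
```

In this toy setup a query near the "jump" samples is assigned to the "jump" gesture and, one level up, to the "joy" emotion. Handling unseen classes would additionally need the semantic mapping learned by the SAE branch, which this sketch does not cover.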

(2) A data distribution gap exists between the target group, which has only a few labeled samples, and the general group, which has a large amount of training data. To address this, an adversarial transfer learning algorithm with domain mixup is proposed for physiological-signal-based stress detection. The model transfers knowledge from the general group to the target group by learning domain-invariant features. Specifically, on top of an adversarial transfer network that includes a feature extractor, a domain discriminator, and a stress detector, domain-mixup samples are constructed at the feature level to enhance the generalization of the domain discriminator. The feature extractor and the domain discriminator are optimized adversarially, which helps learn domain-invariant features. The three modules are trained jointly so that the features produced by adversarial learning do not degrade the performance of the stress detector. On this basis, to handle the imbalanced label distribution in the stress detection task, a loss correction method based on the class prior probability is proposed to improve recognition of the high-stress categories, which have fewer training samples. Since no public dataset for a specific target group exists yet, a physiological signal dataset covering the general group and police school students is constructed for experimental evaluation. The experimental results demonstrate that the proposed algorithm is superior to non-transfer-learning baseline algorithms and to representative transfer learning algorithms.
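The two ingredients named in (2), feature-level domain mixup and a prior-based loss correction, can be sketched as follows. The abstract gives no formulas, so both forms here are assumptions: the mixup is the standard convex combination with a soft domain label, and the loss correction is written as a log-prior shift of the logits (often called logit adjustment), which may differ from the thesis's actual correction. All function names are hypothetical.

```python
import math


def domain_mixup(source_feat, target_feat, lam):
    """Mix one source-domain and one target-domain feature vector.

    Returns the mixed feature and a soft domain label, where
    1.0 means "source" and 0.0 means "target".
    """
    mixed = [lam * s + (1.0 - lam) * t for s, t in zip(source_feat, target_feat)]
    return mixed, lam


def domain_bce(pred_source_prob, soft_label, eps=1e-12):
    """Binary cross-entropy of the domain discriminator on a soft label."""
    p = min(max(pred_source_prob, eps), 1.0 - eps)
    return -(soft_label * math.log(p) + (1.0 - soft_label) * math.log(1.0 - p))


def prior_corrected_cross_entropy(logits, label, class_priors):
    """Cross-entropy computed on logits shifted by the log class priors.

    With this shift, a sample from a rare class incurs a larger loss than
    a sample from a frequent class at equal logits (assumed form).
    """
    adjusted = [z + math.log(p) for z, p in zip(logits, class_priors)]
    m = max(adjusted)
    log_norm = m + math.log(sum(math.exp(a - m) for a in adjusted))
    return log_norm - adjusted[label]


# Toy usage (hypothetical values).
mixed_feat, soft_domain = domain_mixup([1.0, 0.0], [0.0, 1.0], lam=0.25)
disc_loss = domain_bce(0.5, soft_domain)

priors = [0.9, 0.1]  # class 0 = low stress (frequent), class 1 = high stress (rare)
loss_common = prior_corrected_cross_entropy([0.0, 0.0], 0, priors)
loss_rare = prior_corrected_cross_entropy([0.0, 0.0], 1, priors)
```

In a full adversarial setup the discriminator would minimize `domain_bce` while the feature extractor is trained to increase it, for example through a gradient reversal layer; that training loop is omitted here.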

(3) Training existing personalized models usually requires a large number of samples from the test subject. To address this, an emotion and stress detection method based on a Siamese network is proposed. The method learns the differences between pairs of samples by constructing a relative intensity regression model, so the personalized model can be calibrated with only one labeled baseline sample. On this basis, to make better use of the segment samples obtained by data segmentation, a new intensity ranking sub-task is constructed to assist the regression task. Ranking rules and a method for constructing sample pairs are designed according to the characteristics of the physiological data. The intensity ranking sub-task performs pairwise ranking using the relative-intensity supervision information of the segment pairs, which enhances the ability of the extracted features to represent relative intensity. The two sub-tasks share the Siamese network for feature extraction and are trained alternately. The experimental results on the newly collected stress dataset and the public DEAP emotion dataset demonstrate that the proposed method outperforms baseline methods based on single-sample calibration and performs similarly to personalized models trained with multiple samples.
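The single-baseline calibration and the pairwise ranking sub-task in (3) can be sketched as below. This is a minimal illustration, not the thesis's model: the shared Siamese branch is reduced to one linear unit with fixed weights, the relative intensity is taken as a plain difference of embeddings, and the ranking loss is a standard margin hinge. All names and numbers are hypothetical.

```python
def embed(x, weights):
    """Shared (Siamese) feature extractor, here a single linear unit."""
    return sum(w * xi for w, xi in zip(weights, x))


def relative_intensity(x_a, x_b, weights):
    """Predicted intensity of x_a relative to x_b, using shared weights."""
    return embed(x_a, weights) - embed(x_b, weights)


def calibrated_estimate(x, baseline_x, baseline_label, weights):
    """Absolute estimate from one labeled baseline sample of the subject."""
    return baseline_label + relative_intensity(x, baseline_x, weights)


def pairwise_ranking_loss(x_strong, x_weak, weights, margin=1.0):
    """Hinge loss that is zero once the stronger segment of a pair
    scores at least `margin` above the weaker one."""
    return max(0.0, margin - relative_intensity(x_strong, x_weak, weights))


# Toy usage (hypothetical values).
weights = [0.5, 1.0]
baseline_x, baseline_label = [2.0, 1.0], 3.0  # the single labeled baseline
query_x = [4.0, 2.0]

estimate = calibrated_estimate(query_x, baseline_x, baseline_label, weights)
loss_correct_order = pairwise_ranking_loss(query_x, baseline_x, weights)
loss_wrong_order = pairwise_ranking_loss(baseline_x, query_x, weights)
```

Because both samples of a pair pass through the same weights, subject-specific offsets that shift both embeddings equally cancel in the difference, which is one reason a single baseline sample can suffice for calibration in this kind of design.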

Building on the stress and emotion detection algorithms above, a dual-modal warning system for abnormal mental states is further constructed for experimental verification. Based on the complementarity of body gestures and physiological signals, the system provides comprehensive assessment and warning of abnormal states such as negative emotions and excessive stress. To evaluate the system, different psychological responses are induced in subjects with video materials, and a dual-modal dataset containing body gestures and physiological signals is built. Experimental results on this dataset demonstrate that the proposed system runs effectively and performs significantly better than algorithms for single-modal data.

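Keywords: emotion recognition; psychological stress detection; zero-shot learning; transfer learning; Siamese network
Language: Chinese
IS Representative Paper
Sub direction classification: Artificial Intelligence + Healthcare
Planning direction of the national key laboratory: Other
Document Type: Thesis
Identifier: http://ir.ia.ac.cn/handle/173211/49701
Collection: Graduates_Doctoral Theses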
Recommended Citation
GB/T 7714
武金婷. Research on Emotion and Stress State Assessment Methods for Scenarios with Missing Samples [D]. Institute of Automation, Chinese Academy of Sciences, 2022.
Files in This Item:
File Name/Size: 武金婷-面向样本缺失场景的情绪与压力状态 (7138KB)
DocType: Thesis
Access: Restricted
License: CC BY-NC-SA

Items in the repository are protected by copyright, with all rights reserved, unless otherwise indicated.