Research on Quantum Probability-Inspired Graph Neural Networks for Natural Language and Their Applications


	Research on Quantum Probability-Inspired Graph Neural Networks for Natural Language and Their Applications
	闫鹏
	2022-05-22
Pages	138
Degree type	Doctoral
Abstract	In recent years, interdisciplinary research on and applications of quantum theory in fields such as theoretical economics, psychology, and cognitive science have produced many illuminating results. Inspired by quantum-like phenomena in human cognition and language understanding, a number of studies have introduced quantum theory into artificial intelligence fields such as information retrieval and natural language understanding. In natural language understanding, most of this work combines quantum theory with deep neural network models to model natural language and improve model interpretability. However, natural language processing faces an inherent data sparsity challenge, and effectively exploiting the rich structural information in natural language, such as syntactic, content, and dialogue structure, to model text semantics is an important way to alleviate it. Existing quantum probability-inspired neural network models pay little attention to the structural information of text. This thesis studies how to realize quantum probability-based modeling of natural language through graph neural networks, retaining the interpretability advantage while strengthening the modeling of text structure. It proposes a quantum probability-inspired graph neural network for natural language and a variant model, and explores their application to typical natural language processing and semantic analysis tasks from three perspectives: modeling the association structure between documents, the interaction structure between clauses, and the interaction structure between utterances. The main contents and contributions are as follows.
1) A Quantum Probability-inspired Graph Neural Network (QPGNN) is proposed. QPGNN uses quantum superposition states to model text node representations, quantum mixed states to model text interaction structure, and quantum measurement to perform text classification or high-level semantic feature extraction. To model the association structure between documents, QPGNN is applied to text classification. Specifically, documents are treated as nodes and the corpus as a graph, and a document interaction graph is constructed to capture the semantic association structure between documents. QPGNN is then trained on the document interaction graph to model global structural information across documents and enhance their semantic representations. Finally, classification probabilities are computed by performing quantum measurement on the mixed states. Experimental results show that QPGNN achieves higher classification accuracy than classical neural network models and existing quantum probability-inspired models. In addition, QPGNN is more robust with limited training data, is parameter-efficient, and learns class-discriminative document representations.
2) A Quantum Probability-inspired Graph Attention Network (QPGAT) is proposed to handle the case where the text interaction graph is unweighted. QPGAT learns edge weights through an attention mechanism and mines the implicit interaction structure of text at different granularities. To model the interaction structure between clauses, QPGAT is applied to emotion-cause pair extraction. Specifically, clauses are treated as nodes and each document as a graph, and a position interaction graph and a content interaction graph are constructed to model the positional and content interactions between clauses within a document. QPGAT then models these interaction graphs to enhance the semantic representations of the clauses. Finally, a multi-task joint learning extractor extracts the emotion-cause pairs. Experiments show that QPGAT outperforms previous methods on emotion-cause pair extraction and on the related subtasks of emotion clause extraction and cause clause extraction. In addition, QPGAT can provide a visual explanation of the extraction results.
3) To model the interaction structure between utterances, QPGAT is applied to joint dialogue act recognition and sentiment classification. Specifically, utterances are treated as nodes and each dialogue as a graph, and an utterance interaction graph is constructed to capture same-speaker interactions and contextual interactions between utterances in a dialogue. QPGAT then models the utterance interaction graph to enhance the semantic representations of the utterances. Finally, a jointly trained multi-task decoder performs dialogue act recognition and sentiment classification. Experimental results show that QPGAT outperforms the baseline models. In addition, an ablation analysis verifies the contribution of each QPGAT module, and a sensitivity analysis shows that model performance is insensitive to the number of network layers, whereas an utterance interaction graph that is either too dense or too sparse degrades performance.
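Keywords	quantum probability; graph neural network; natural language processing; text classification; sentiment analysis
Language	Chinese
Document type	Thesis
Item identifier	http://ir.ia.ac.cn/handle/173211/48698
Collection	Graduates_Doctoral Dissertations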
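Recommended citation (GB/T 7714)	闫鹏. Research on Quantum Probability-Inspired Graph Neural Networks for Natural Language and Their Applications [D]. Institute of Automation, Chinese Academy of Sciences, 2022.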
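
The abstract describes a three-step quantum-probability pipeline: superposition states for text nodes, a mixed state for a node's interaction structure, and quantum measurement for classification. The NumPy sketch below illustrates only the standard textbook definitions behind those three terms (unit vectors, density matrices, the Born rule). It is not the thesis's implementation: the function names, the toy features, the edge weights, and the fixed identity measurement basis are all assumptions made for illustration, and a trained model would presumably learn its representations and measurement operators.

```python
import numpy as np

def superposition_state(features):
    """Normalize a real feature vector to unit length (a pure state |psi>)."""
    v = np.asarray(features, dtype=float)
    return v / np.linalg.norm(v)

def mixed_state(states, weights):
    """Density matrix rho = sum_i p_i |psi_i><psi_i|, with weights rescaled to sum to 1."""
    p = np.asarray(weights, dtype=float)
    p = p / p.sum()
    return sum(p_i * np.outer(s, s) for p_i, s in zip(p, states))

def measure(rho, basis):
    """Born rule: P(k) = <b_k| rho |b_k> for each row b_k of an orthonormal basis."""
    return np.einsum("ki,ij,kj->k", basis, rho, basis)

# Toy graph (hypothetical numbers): node 0 and its two neighbours.
features = np.array([[1.0, 0.0, 0.0],
                     [1.0, 1.0, 0.0],
                     [0.0, 1.0, 1.0]])
edge_weights = np.array([0.5, 0.3, 0.2])  # assumed weights of node 0 over nodes {0, 1, 2}

states = [superposition_state(f) for f in features]
rho = mixed_state(states, edge_weights)   # node 0's neighbourhood as a mixed state

basis = np.eye(3)                         # assumed measurement basis, one vector per class
probs = measure(rho, basis)               # approximately [0.65, 0.25, 0.10]
print(probs)
```

Because the mixed state has unit trace and the basis is complete and orthonormal, the measurement outcomes form a valid probability distribution over the classes without a softmax. In an attention-based variant such as QPGAT, the fixed `edge_weights` would be replaced by learned attention scores.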

Files in this item
File name/size	Document type	Version type	Access type	License
论文_闫鹏.pdf (7043 KB)	Thesis		Restricted access	CC BY-NC-SA