Research on Document Representation Based on Graph Neural Networks


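Title	Research on Document Representation Based on Graph Neural Networks (基于图神经网络的篇章表征研究)
Author	Wu Haoran (吴浩然)
Date	2023-05
Pages	128
Degree type	Doctoral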
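Chinese abstract (English translation)	With the development of information technology, every industry continuously produces massive numbers of documents. Efficiently automating the processing of the complex information in these documents, and thereby improving the efficiency of downstream applications, is a pressing need. Document representation based on graph neural networks aims to model the topological information in documents with graph structures, yielding encoded document representations that carry structured prior constraints and better support downstream tasks. The key to representing documents with graph neural networks is discovering the associations between semantic nodes of different granularities. Mainstream methods build these associations from the organizational or semantic structure of the document and use them to enhance the representation of the document's own semantic information. However, representations enhanced by the document's own structure are not optimized for the downstream task, so they deviate from the downstream objective, which causes confusion in complex reasoning that involves multiple documents, multiple tasks, or multiple modalities. To address this, building on graph neural networks, this thesis first studies how to use domain knowledge to enhance document representations for a specific domain. Then, on a similar graph structure, it studies how to use cascaded counterfactual reasoning to estimate the causal relations between semantic nodes and downstream task objectives. Finally, it extends counterfactual reasoning on graphs to an end-to-end form and uses the resulting causal relations to enhance document representations for downstream task objectives in the complex scenarios of multi-document joint reasoning and cross-modal information alignment. The main research results are summarized as follows:
1. Clause interaction hypergraph for joint document/label representation. Enhancing document representations with additional domain expert knowledge is an effective way to apply natural language processing to vertical domains. Most existing methods inject external knowledge into the document representation process in one direction only, without constraining the representation of the external knowledge itself, so the two lack high-level semantic interaction with bidirectional feedback. To address this, this thesis proposes a clause interaction hypergraph model for joint document/label representation. The model first introduces a clause interaction hypergraph that jointly models the document and a structured label set with descriptions: interaction connections inject the structured description information of the label set into the document representation, while the hypergraph uses the semantic structure of the document to constrain the representation of the label set. The thesis then proposes a two-stage hybrid hypergraph convolution algorithm that uses graph neural networks to iteratively represent the hypergraph structure and the simple (pairwise) graph structure within the clause interaction hypergraph. Experiments on a large-scale medical code assignment task show that the proposed method effectively injects domain knowledge during document representation and successfully uses the document's semantic structure to impose semantic constraints on the representation of structured knowledge, significantly improving the prediction accuracy of the downstream task.
2. Counterfactual reasoning for document interpretability. Representing documents in high-risk settings such as healthcare and finance requires models that can provide effective interpretability support for their results. Most existing methods need additional expert knowledge during model interpretation, yet in many specialized scenarios limited human resources make it impossible to obtain sufficiently precise knowledge annotations. To address this, this thesis proposes a cascaded counterfactual reasoning algorithm, based on graph neural networks, for interpreting document representations. The algorithm uses a graph structure to discretize an unstructured document and builds a hierarchical graph containing semantic units of different granularities, whose representation is used for downstream prediction. On top of the trained document representation model, the thesis proposes two counterfactual reasoning methods, one based on node treatment and one based on edge treatment, to estimate the association between each semantic unit and the downstream task objective. This extracts a causal graph structure oriented to the downstream objective, and the strongly associated semantic units provide a per-sample explanation of the model. Analysis experiments on real electronic medical records show that the proposed algorithm effectively performs automatic diagnosis from electronic medical records and supports its diagnoses with explanations that are consistent with medical logic.
3. Local causal association graph for filtering confusing information in documents. Jointly reasoning over a large number of documents retrieved from the open domain to obtain the desired target information is a long-standing challenge. The difficulty is that the retrieved documents contain a large amount of confusing information irrelevant to the downstream task, which becomes severely disruptive noise in global reasoning. To address this, this thesis proposes a local causal association graph model for filtering confusing information in documents. The model first improves the cascaded counterfactual reasoning algorithm of work 2 into an end-to-end form, so that the causal effect of a specific semantic unit can be estimated during joint optimization with the downstream task. Then, based on document structure information, the thesis proposes a local causal association graph that models the multi-document joint reasoning process. It uses the causal effects of semantic unit nodes to filter out confusing information nodes, thereby constructing the optimal causal graph structure for the downstream task and strengthening the semantic representation of valid information nodes. Experiments on a cross-document relation extraction task show that the proposed method accurately distinguishes valid from confusing semantic units in complex information and filters the confusing information out of global reasoning, significantly improving downstream accuracy. The thesis also verifies that the accuracy of causal reasoning and the document representation ability of graph neural networks reinforce each other.
4. Feature attribution graph for causal alignment of cross-modal document information. Fusing cross-modal information during document representation is a trend that pushes document representation technology toward general domains; the difficulty lies in how to construct the alignment pattern between cross-modal information. The current mainstream approach builds associations between cross-modal features through semantic similarity matching, but such associations are not directly related to the downstream task, so cross-modal representations built on semantic similarity matching deviate from the downstream task. To address this, this thesis proposes a feature attribution graph model for causal alignment of cross-modal document information. The model first proposes a feature attribution algorithm, based on causal reasoning and designed for multi-task scenarios, that evaluates the importance of fine-grained features with respect to a specific downstream task. The thesis then constructs a feature attribution graph for cross-modal information alignment, which uses the feature attribution values to compute alignment weights between cross-modal node features and thus achieves cross-modal document representation oriented to a specific downstream task. Experiments on a multimodal desire understanding task show that the proposed method effectively reduces the deviation between cross-modal document representations and downstream tasks and achieves significant performance gains on multiple subtasks.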
English abstract	With the development of information technology, industries of all kinds are constantly producing massive numbers of documents. There is an urgent need to process the complex information in these documents efficiently and automatically, and to improve the efficiency of downstream applications. Document representation based on graph neural networks aims to model the topological information in documents using graph structures, which yields encoded document representations with structured prior constraints and provides more effective support for downstream tasks. The key to using graph neural networks to represent documents is identifying the correlations between semantic nodes of different granularities. Mainstream methods use the organizational or semantic structure of the document to construct these correlations, which are then used to enhance the representation of the document's own semantic information. However, a representation enhanced by the structure of the document itself is not optimized for the downstream task, so it deviates from the downstream targets, which can cause confusion in complex reasoning such as multi-document, multi-task, and multi-modal reasoning. To address this, building on graph neural networks, this thesis first studies the use of domain knowledge to achieve domain-specific enhancement of document representations. On a similar graph structure, it then studies the estimation of causal associations between semantic nodes and downstream targets using cascaded counterfactual reasoning. Subsequently, it extends counterfactual reasoning on the graph to an end-to-end form and uses the resulting causal associations to enhance document representations for downstream targets in the complex scenarios of cross-document joint reasoning and cross-modality information alignment. The main research results of this thesis are summarized as follows:
1. Joint modeling of the document and label with clause interaction hypergraph. Enhancing document representations with additional domain expert knowledge is an effective means of applying natural language processing techniques to vertical domains. Most existing methods inject external knowledge into the document representation process unidirectionally, without constraining the representation of the external knowledge, which leaves the two without high-level semantic interaction based on bidirectional feedback. To this end, this thesis proposes a clause interaction hypergraph model for the joint representation of documents and labels. In this model, the thesis first proposes a clause interaction hypergraph to jointly model the document and the structured label set with descriptions. It injects the structured description information of the label set into the document representation through interaction connections, and uses the semantic structure of the document to constrain the representations of the label set through the hypergraph. Subsequently, the thesis proposes a two-stage hybrid hypergraph convolution algorithm that uses graph neural networks to iteratively represent the hypergraph structure and the simple graph structure in the clause interaction hypergraph. Experiments on a large-scale medical code assignment task show that the proposed method effectively injects domain knowledge into the document representation process and successfully uses the document's semantic structure to impose semantic constraints on the representation of structured knowledge, which significantly improves the prediction accuracy of the downstream task.
2. Counterfactual reasoning for document representation interpretability. Representing documents in high-risk scenarios such as healthcare and finance requires models that can provide effective interpretability support for their results. Most existing methods require additional expert knowledge in the model interpretation process. However, in many specialized scenarios, it is difficult to obtain sufficiently accurate knowledge annotations because human resources are limited. To this end, this thesis proposes a cascaded counterfactual reasoning algorithm, based on graph neural networks, for interpreting document representations. The algorithm discretizes unstructured documents through a graph structure and constructs a hierarchical graph containing semantic units of different granularities, whose representations are used for downstream prediction. On top of the trained document representation model, the thesis proposes two counterfactual reasoning methods, based on node treatment and edge treatment respectively, to estimate the association between different semantic units and downstream task targets and to extract the causal graph structure oriented to those targets. Strongly associated semantic units are then used to explain the model's output for each instance. Experiments on real electronic medical records show that the proposed algorithm effectively performs automatic diagnosis from electronic medical records and supports its diagnoses with explanations that accord with medical logic.
3. Local causal association graph for document confusing information filtering. Performing joint reasoning over a large number of documents retrieved from the open domain to obtain the desired target information is a long-standing challenge. The difficulty is that the retrieved documents contain a large amount of confusing information irrelevant to the downstream task, which becomes noise that seriously interferes with global reasoning. To this end, this thesis proposes a local causal association graph model for filtering confusing information in documents. In this model, the thesis first improves the cascaded counterfactual reasoning algorithm of work 2 to an end-to-end form, which can estimate the causal effects of specific semantic units during joint optimization with the downstream task. Subsequently, based on document structure information, the thesis proposes a local causal association graph to model the multi-document joint reasoning process. It uses the causal effects of semantic unit nodes to filter out confusing information nodes, so as to construct the optimal causal graph structure for the downstream task and enhance the semantic representation of valid information nodes. Experiments on a cross-document relation extraction task show that the proposed method accurately distinguishes valid from confusing semantic units in complex information and filters the confusing information out of global reasoning, which significantly improves the accuracy of the downstream task. In addition, the thesis verifies the mutual enhancement between the accuracy of causal reasoning and the document representation ability of graph neural networks.
4. Feature attribution graph for causal alignment of document cross-modality information. Incorporating cross-modality information in the process of document representation is a trend that pushes natural language processing technology into the general domain. The difficulty is how to construct the alignment pattern between cross-modality information. Constructing associations between cross-modality features through semantic similarity matching is the current mainstream approach. However, such associations are not directly related to the downstream task, which leads to bias between the downstream task and cross-modality representations built on semantic similarity matching. To this end, this thesis proposes a feature attribution graph model for the causal alignment of cross-modality information. In this model, the thesis first proposes a feature attribution algorithm, based on causal reasoning and designed for multi-task scenarios, which evaluates the importance of fine-grained features relative to a specific downstream task. Subsequently, the thesis constructs a feature attribution graph for cross-modality information alignment, which uses feature attribution values to calculate alignment weights between cross-modality node features and thereby achieves cross-modality document representation for a specific downstream task. Experiments on a multimodal desire understanding task show that the proposed method effectively reduces the bias between cross-modality document representations and downstream tasks and obtains significant performance improvements on multiple subtasks.
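Keywords	document representation; graph neural networks; causal reasoning
Subject area	Natural language processing
Discipline category	Engineering::Control Science and Engineering
Language	Chinese
Seven major directions: sub-direction classification	Natural language processing
State Key Laboratory planned direction classification	Speech and language processing
Associated dataset to be deposited	No
Document type	Thesis
Item identifier	http://ir.ia.ac.cn/handle/173211/52132
Collection	Graduates_Doctoral dissertations
Recommended citation (GB/T 7714)	吴浩然. 基于图神经网络的篇章表征研究[D],2023.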

Files in this item
File name/size	Document type	Version type	Access type	License
201818014628015吴浩然.p (4884KB)	Thesis		Restricted access	CC BY-NC-SA