Research on Knowledge Graph Reasoning Algorithms Based on Information Fusion


	Research on Knowledge Graph Reasoning Algorithms Based on Information Fusion
	Wang Zikang (王子康)
	2021-05-22
Pages	132
Degree type	Doctoral
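Abstract (translated from the Chinese)	Knowledge graphs are large-scale graph databases that store knowledge. They provide external knowledge support for a variety of artificial intelligence tasks and play an important role in applications such as knowledge search, intelligent question answering, and recommender systems. However, most knowledge graphs used in these applications are sparse and incomplete, so inferring missing knowledge to enrich and extend existing knowledge graphs is an important research problem in the field. Most existing knowledge graph reasoning models reason directly with neural networks and ignore a large amount of valuable, easily obtained information; mining and exploiting this information to improve reasoning models is a major challenge for current research. This thesis addresses three scenarios: knowledge graph reasoning based on representation learning, knowledge graph reasoning for fact checking, and knowledge graph reasoning for question answering. For these it proposes reasoning models that fuse, respectively, external information (multimodal and temporal information), structural information (path structure and graph structure information), and prior information, in order to encode richer information and improve reasoning. The main contributions are: (1) For knowledge graph reasoning based on representation learning, to address the problem that external information is not used effectively, the thesis proposes two models that fuse external information, one for multimodal information and one for temporal information. The multimodal model introduces two kinds of external information, textual descriptions and image descriptions of entities, and combines a multimodal autoencoder with the TransE model to learn entity and relation representations that encode both multimodal and triple information. The temporal model attaches temporal information to triples, divides the influence of time on entity and relation representations into three types, and proposes a corresponding model for each. Experiments show that the proposed models effectively capture the temporal information of entities and relations. (2) For knowledge graph reasoning for fact checking, to address the problem that the structural information of knowledge graphs is not used effectively, the thesis proposes two models that fuse structural information, one for path structure and one for graph structure. The path-structure model combines an attention mechanism with path representations; compared with existing models it better combines the semantic information of the individual paths and can filter noise. The graph-structure model treats the graph between the head and tail entities as a sequence of subgraphs and uses a graph neural network to pass subgraph information step by step between the head and tail entities, learning a representation of the relation between them. Experiments show that graph structure information improves both reasoning performance and reasoning efficiency. (3) For knowledge graph reasoning for question answering, to address the problem that the information in the knowledge graph itself is not fully mined and used, the thesis proposes a model that fuses prior information. The model first constructs counterfactual samples from the knowledge already in the graph and extracts prior knowledge from them; it then introduces the extracted prior knowledge into a reinforcement-learning-based reasoning model, so that reasoning draws jointly on prior knowledge and neural networks. Experiments and analyses on several datasets show that prior knowledge not only improves the reasoning model but also, to some extent, eases the limitation that reasoning path length places on existing models. (4) The thesis demonstrates and validates the proposed reasoning methods in a natural language processing application. Taking natural language inference as an example, it applies the fact-checking-oriented reasoning method proposed in Chapter 4 to add a knowledge-graph-based relation learning module to a conventional natural language inference model, which brings external knowledge into the task and improves model performance.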
English abstract	Knowledge graphs are large-scale graph databases that store knowledge. They provide external knowledge support for a variety of artificial intelligence tasks and play an important role in applications such as information retrieval, question answering, and recommendation systems. However, most knowledge graphs used in applications are sparse and incomplete, so reasoning about the missing knowledge to enrich existing knowledge graphs is an important research problem in this field. Most existing knowledge graph reasoning models reason directly with neural networks and ignore a large amount of valuable and easily accessible information. How to mine and use this information to improve the performance of reasoning models is a major challenge for current research. This thesis proposes knowledge graph reasoning models that fuse external information (multimodal and temporal information), structural information (path structure and graph structure information), and prior information, in order to encode richer information and improve reasoning in three scenarios, respectively: representation learning-based knowledge graph reasoning, fact checking-oriented knowledge graph reasoning, and question answering-oriented knowledge graph reasoning. The main contributions of this thesis are as follows: (1) To address the problem that external information is not effectively utilized, two models that fuse external information are proposed for representation learning-based knowledge graph reasoning; they fuse multimodal information and temporal information, respectively.
The model that fuses multimodal information introduces two types of external information, textual and visual descriptions of entities, and combines a multimodal autoencoder with TransE to learn entity and relation representations that encode both multimodal and triple information. The model that fuses temporal information attaches temporal information to the triples and classifies the effect of time on entity and relation representations into three types, with a corresponding model proposed for each. Experiments show that the proposed models can effectively capture the temporal information of entities and relations. (2) To address the problem that structural information is not effectively utilized, two models that fuse structural information are proposed for fact checking-oriented knowledge graph reasoning; they fuse path structure information and graph structure information, respectively. The model that fuses path structure information uses an attention mechanism to combine the path representations, which combines the semantic information of each path better than existing models do and enables noise filtering. The model that fuses graph structure information regards the graph between the head entity and the tail entity as a sequence of subgraphs; a graph neural network passes the subgraph information step by step between the two entities to learn a representation of the relation between them. Experiments show that graph structure information improves not only reasoning performance but also reasoning efficiency. (3) To address the problem that the information in the knowledge graph itself is not fully explored and utilized, a model that incorporates prior information is proposed for question answering-oriented knowledge graph reasoning.
First, counterfactual samples are constructed from the existing knowledge graph, and prior knowledge is extracted from these samples. The extracted prior knowledge is then introduced into a reinforcement learning-based reasoning model, so that the model reasons with both the prior knowledge and neural networks. Experiments and analyses on several datasets show that prior knowledge not only improves the performance of knowledge reasoning models but also mitigates the performance degradation that multi-hop reasoning suffers when the reasoning path is long. (4) The effectiveness of the proposed knowledge graph reasoning approaches in natural language processing applications is illustrated and validated. Taking the natural language inference task as an example, the fact checking-oriented reasoning method proposed in Chapter 4 of this thesis is applied to add a knowledge graph-based relation learning module to a traditional natural language inference model, which introduces external knowledge into the task and improves the inference performance of the model.
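The abstract names TransE as the triple-scoring component of the multimodal model. As a rough illustration only, the sketch below shows standard TransE scoring (a triple is plausible when head + relation lies close to tail) together with the usual margin ranking loss. It is not the thesis's model: the multimodal autoencoder is omitted, and the entity names, the two-dimensional embeddings, and the margin value are all hypothetical.

```python
import math

def transe_distance(head, relation, tail):
    """L2 distance ||h + r - t||; a smaller value means a more plausible triple."""
    return math.sqrt(sum((h + r - t) ** 2 for h, r, t in zip(head, relation, tail)))

def margin_loss(pos_dist, neg_dist, margin=1.0):
    """Margin ranking loss: zero once the corrupted triple is at least
    `margin` farther away than the true triple."""
    return max(0.0, margin + pos_dist - neg_dist)

# Hypothetical toy embeddings, chosen by hand for illustration (not learned).
beijing = [1.0, 0.0]
china = [1.0, 1.0]
france = [3.0, 4.0]
capital_of = [0.0, 1.0]

# True triple (beijing, capital_of, china) vs. corrupted tail (france).
pos = transe_distance(beijing, capital_of, china)
neg = transe_distance(beijing, capital_of, france)
loss = margin_loss(pos, neg)
```

In training, embeddings would be updated by gradient descent on this loss over many true and corrupted triples; here the values are fixed so the scoring rule itself is easy to inspect.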
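Keywords	knowledge graph; knowledge representation; knowledge reasoning; natural language processing
Language	Chinese
Seven major research directions: sub-direction category	Knowledge representation and reasoning
Document type	Thesis
Item identifier	http://ir.ia.ac.cn/handle/173211/44750
Collection	State Key Laboratory of Multimodal Artificial Intelligence Systems_Internet Big Data and Information Security
Recommended citation (GB/T 7714)	Wang Zikang. Research on Knowledge Graph Reasoning Algorithms Based on Information Fusion [D]. Institute of Automation, Chinese Academy of Sciences. University of Chinese Academy of Sciences, 2021.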

Files in this item
File name/size	Document type	Version type	Access type	License
Thesis.pdf (4220 KB)	Thesis		Open access	CC BY-NC-SA