Research on Knowledge Graph Expansion Methods

	Research on Knowledge Graph Expansion Methods
	孙建
	2022-05-19
Pages	121
Degree type	Doctoral
Abstract	A knowledge graph is a large-scale semantic network composed of nodes and edges, in which nodes represent concepts and entities in the physical world and edges represent topological links and semantic relations between nodes. In recent years, knowledge graphs have drawn wide attention from researchers as a key foundational technology for intelligent applications, providing knowledge support for systems such as intelligent search, question answering, and recommendation. However, real-world knowledge changes constantly, and people's descriptions of the world are continually updated and revised. To keep meeting the needs of these applications, a knowledge graph must therefore be expanded continuously. Early knowledge graphs were mostly constructed and expanded manually, which is inefficient and extremely costly, so automatic knowledge graph expansion has high research and application value. Taking the source of the knowledge needed to expand a target knowledge graph as its starting point, this thesis studies how to mine that knowledge from the knowledge graph itself, from additional knowledge graphs, and from text. The main contributions and innovations are summarized as follows:
	1. A one-shot relation learning method based on neighborhood aggregation and path encoding. Knowledge reasoning is a simple, easy-to-deploy way to expand a knowledge graph automatically. Traditional data-driven knowledge reasoning models struggle with relations and entities that are covered by only a small amount of knowledge in a large knowledge graph. To address reasoning over relations and entities with few samples, this thesis proposes a one-shot relation learning method based on neighborhood aggregation and path encoding. The method uses a relation average attention network to aggregate the neighborhood information of each entity, so that the aggregated entity representation contains both latent entity type features and neighborhood entity features. A path encoding module aggregates the path information between entities, which strengthens the relation representation and reduces the impact of unseen entities on it. A training task construction method allows the model to predict both relations and entities within the same framework. Experiments on three one-shot relation learning datasets show that the proposed method significantly outperforms strong representation-learning baselines on relation and entity reasoning.
	2. A cross-lingual entity alignment method based on a dual attention mechanism. When entity alignment is used to expand knowledge graphs automatically, the neighborhood information of equivalent entities at the same level is often inconsistent, and the number of neighboring entities differs considerably from level to level. These two discrepancies make it difficult to learn representations of equivalent entities. This thesis proposes an entity alignment method based on a dual attention mechanism. The method first uses a relation-aware graph attention network to iteratively aggregate multi-layer neighborhood information, which addresses the inconsistency of information at the same level between equivalent entities. It then uses a hierarchical attention network to selectively aggregate low-level and high-level information, which addresses the imbalance of information across levels. Experiments on three cross-lingual entity alignment datasets show that the proposed method effectively reduces the structural differences between the neighborhoods of equivalent entities and significantly improves entity alignment performance.
	3. A bidirectional entity linking method based on the cloze test. Existing studies show that entity linking methods based on sequential decisions can expand knowledge graphs efficiently. However, current entity linking models that work on a fixed sequence ignore the order in which mentions are decided, so the model cannot make reasonable use of the entities that have already been linked. To solve this problem, this thesis first proposes a method for constructing the mention sequence dynamically. The method uses a reinforcement learning algorithm that continuously interacts with the previously linked entities and dynamically selects the next mention to link, choosing one that can make reasonable use of the earlier information. It can serve as a sound basis for any entity linking model based on sequential decisions. In addition, entity linking methods based on unidirectional sequential decisions make insufficient use of global information and cannot correct potentially wrong links. Inspired by how people complete a cloze test, this thesis therefore proposes an entity linking method with checking and correction. A checking module verifies whether the currently linked entity is correct. If it is, the entity serves as evidence for the next mention's decision; if it is not, a correction module makes a new decision for that mention. Repeating these checking and correction steps in a second linking pass effectively addresses the insufficient use of information. Experiments show that the proposed method makes full and reasonable use of global information and significantly improves entity linking performance.
Keywords	knowledge graph expansion; knowledge reasoning; entity alignment; entity linking
Language	Chinese
Document type	Thesis
Item identifier	http://ir.ia.ac.cn/handle/173211/48780
Collection	Graduates_Doctoral Dissertations
Recommended citation (GB/T 7714)	孙建. 知识图谱扩充方法研究[D]. 中国科学院大学人工智能学院,2022.
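
The checking and correction procedure described in the abstract's third contribution can be illustrated with a small sketch. This is not the thesis's model: the thesis does not give its scoring or decision functions here, so the function `link_with_check_and_correct`, the `toy_score` scorer, the prior values, the relatedness table, and the entity names below are all hypothetical and chosen only to show the control flow. The sketch assumes that a link counts as "correct" when no other candidate scores higher given the entities linked so far.

```python
from typing import Callable, Dict, List

# Hypothetical scorer signature: (mention, candidate entity, evidence entities) -> score.
Scorer = Callable[[str, str, List[str]], float]


def link_with_check_and_correct(
    mentions: List[str],
    candidates: Dict[str, List[str]],
    score: Scorer,
    passes: int = 2,
) -> Dict[str, str]:
    """Link mentions in order, then re-check each link on later passes."""
    linked: Dict[str, str] = {}
    for _ in range(passes):
        for mention in mentions:
            # Evidence: entities currently linked to all *other* mentions.
            evidence = [e for m, e in linked.items() if m != mention]
            best = max(candidates[mention], key=lambda c: score(mention, c, evidence))
            current = linked.get(mention)
            # Checking step (assumed criterion): keep the current link
            # if no candidate beats it under the current evidence.
            if current is not None and score(mention, current, evidence) >= score(
                mention, best, evidence
            ):
                continue
            # Initial decision, or correction of a link that failed the check.
            linked[mention] = best
    return linked


# Toy data, invented for illustration only.
PRIOR = {"Jordan_(country)": 0.6, "Michael_Jordan": 0.4, "Chicago_Bulls": 1.0}
RELATED = {frozenset({"Michael_Jordan", "Chicago_Bulls"})}


def toy_score(mention: str, candidate: str, evidence: List[str]) -> float:
    # Prior plus a fixed bonus for every evidence entity related to the candidate.
    bonus = sum(0.5 for e in evidence if frozenset({candidate, e}) in RELATED)
    return PRIOR[candidate] + bonus


mentions = ["Jordan", "Bulls"]
candidates = {
    "Jordan": ["Jordan_(country)", "Michael_Jordan"],
    "Bulls": ["Chicago_Bulls"],
}

one_pass = link_with_check_and_correct(mentions, candidates, toy_score, passes=1)
two_pass = link_with_check_and_correct(mentions, candidates, toy_score, passes=2)
print(one_pass["Jordan"])
print(two_pass["Jordan"])
```

With this toy data, a single left-to-right pass links "Jordan" to `Jordan_(country)` because no evidence is available yet. The second pass re-checks that link against the already linked `Chicago_Bulls` and corrects it to `Michael_Jordan`, which mirrors the role the abstract assigns to the second linking pass.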

Files in this item
File name/size	Document type	Version type	Access type	License
学位论文-知识图谱扩充方法研究.pdf (4514 KB)	Thesis		Restricted access	CC BY-NC-SA