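Research on Multi-Source Heterogeneous Knowledge Collaboration Methods and Applications Based on Parallel Learning (基于平行学习的多源异构知识协同方法与应用研究)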
杨林瑶
Subtype: Doctoral
Thesis Advisor: 王飞跃; 王晓
2022
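Degree Grantor: Institute of Automation, Chinese Academy of Sciences (中国科学院自动化研究所)
Place of Conferral: Institute of Automation, Chinese Academy of Sciences (中国科学院自动化研究所)
Degree Discipline: Social Computing
Keyword: parallel learning; knowledge collaboration; knowledge graph; graph neural network; representation learning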
Abstract

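Knowledge graphs describe facts as structured triples and can effectively capture the semantic relations between real-world entities, thereby providing knowledge support for intelligent decision-making systems. In recent years, with the development of artificial intelligence and growing attention to knowledge-guided machine learning and explainable artificial intelligence, a large number of knowledge graphs have been constructed and successfully applied in areas such as semantic search and recommender systems. Knowledge graphs have become a key generic technology for the new generation of artificial intelligence and an important foundation for achieving machine cognitive intelligence and driving the intelligent upgrading of industry.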

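In practice, multiple knowledge graphs often exist for the same domain, and their knowledge coverage is complementary to some extent. Effective collaboration among multi-source knowledge graphs helps improve the completeness of knowledge and thus better supports intelligent decision-making. However, heterogeneous multi-source knowledge graphs often differ considerably in structure and attribute features, and their knowledge collaboration still faces many problems and challenges in accuracy, flexibility, and efficiency. Moreover, because data on sparse and extreme scenarios is lacking, existing knowledge graphs lack knowledge about such scenarios, so multi-source knowledge collaboration still cannot fully cover the knowledge required for managing and controlling real systems.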

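To address these problems, this thesis proposes to carry out the extraction, fusion and completion, and application of multi-source heterogeneous knowledge through the descriptive learning, predictive learning, and prescriptive learning processes of parallel learning theory, respectively, achieving efficient collaboration of multi-source heterogeneous knowledge. To verify the effectiveness of the proposed methods, simulation experiments are conducted on power grid dispatching, a complex-system management and control problem. The main work of this thesis is summarized as follows: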

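1. A human decision-making behavior modeling method based on serious games and decision trees is proposed. To address the lack of knowledge about sparse and extreme scenarios, this thesis proposes building artificial systems through descriptive learning to generate artificial data from which knowledge of those scenarios is extracted. To address the difficulty of modeling human decision-making behavior when building artificial systems, this thesis proposes a human decision-making behavior modeling method based on serious games and decision trees. The method uses virtual games to collect human decision data in extreme scenarios, extracts key features through correlation analysis to mimic how humans decide based on key factors, then builds a random forest to synthesize data that augments the decision samples, and finally builds an interpretable decision tree on the augmented data as the human decision-making behavior model. To represent system management and control knowledge explicitly, this thesis uses heterogeneous information networks to organize real management and control records from the actual system and virtual data generated by the artificial systems into knowledge graphs; by organizing the transition relations between system states and control actions into triples, it achieves structured representation and storage of system management and control knowledge.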

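2. A collective entity alignment method based on information fusion and an unsupervised entity alignment method based on vector space matching are proposed. To complete missing knowledge, this thesis proposes using predictive learning to run computational experiments on the missing scenarios based on existing knowledge, and merging equivalent entities through entity alignment to fuse multi-source knowledge graphs, forming a unified knowledge base that improves the efficiency of computational experiments. To effectively fuse explicit information such as structure, relations, and attributes and to model the one-to-one constraint for higher alignment accuracy, this thesis proposes a collective entity alignment method based on information fusion. The method represents these three kinds of information as vectors, uses co-attention to fuse the similarities of the various vectors into an overall entity similarity, and finally matches equivalent entities collectively through integer programming. To reduce the dependence of entity alignment on alignment seeds and attribute information, this thesis proposes an unsupervised entity alignment method based on vector space matching. The method uses graph neural networks to learn structural embeddings of entities, obtains a vector space transformation matrix by solving a problem equivalent to unsupervised entity alignment, and optimizes this matrix with a cycle-consistent generative adversarial network, thereby mapping the embeddings of different knowledge graphs into the same vector space; finally, equivalent entities are matched collectively according to the similarities of the mapped vectors.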

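3. A collaborative knowledge reasoning method based on reinforcement learning is proposed. This thesis proposes completing implicit knowledge by fusing the complementary knowledge of multi-source knowledge graphs, to further reduce the workload of computational experiments, and proposes a collaborative knowledge reasoning method based on reinforcement learning. The method flexibly models the complementary knowledge of multi-source knowledge graphs through cross-knowledge-graph equivalent relation paths and uses a reinforcement learning agent to autonomously derive equivalent relation paths for the relation type to be inferred, which effectively improves the accuracy of knowledge reasoning. To address the challenges that differences in structure and features among multi-source knowledge graphs and the growing number of entities and relations pose for agent training, this thesis proposes using entity alignment to learn representation vectors in a shared vector space for the multi-source knowledge graphs, and designs a new policy network that incorporates long short-term memory and hierarchical graph attention, together with training mechanisms such as action masking and retraining on sampled paths, which effectively improve the success rate of equivalent path derivation.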

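4. A graph transfer learning method based on representation learning and a power grid dispatching scheme recommendation method based on shortest paths are proposed. To apply structured knowledge graphs to real systems without bias, this thesis proposes using prescriptive learning to match the knowledge most relevant to the current state of the actual system, and proposes a graph transfer learning method based on representation learning. The method learns domain-adaptive and class-separable node representations using an optimal transport distance loss and a classification loss, which effectively improves the prediction accuracy of a source-network classification model on the target network. To support power grid dispatching decisions with knowledge graphs, this thesis proposes a power grid dispatching scheme recommendation method based on shortest paths, which can generate dispatching instructions directly from the knowledge graph; the resulting schemes meet dispatching requirements while effectively reducing cost.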
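
The collective entity alignment step in item 2 above (fuse several similarity signals, then match entities under a one-to-one constraint) can be illustrated with a minimal sketch. This is not the thesis's implementation: the fixed fusion weights stand in for the co-attention mechanism, exhaustive search over permutations stands in for the integer program, and the matrices `STRUCT_SIM` and `ATTR_SIM` are made-up toy values.

```python
from itertools import permutations

def fuse(sim_channels, weights):
    """Weighted sum of per-channel similarity matrices (lists of lists).

    Assumption: fixed weights replace the learned co-attention weights.
    """
    rows, cols = len(sim_channels[0]), len(sim_channels[0][0])
    return [[sum(w * ch[i][j] for w, ch in zip(weights, sim_channels))
             for j in range(cols)] for i in range(rows)]

def one_to_one_match(sim):
    """Return match[i] = column assigned to row i, maximizing total similarity.

    Exhaustive search over permutations; only practical for tiny square
    matrices, and used here as a stand-in for integer programming.
    """
    n = len(sim)
    best = max(permutations(range(n)),
               key=lambda p: sum(sim[i][p[i]] for i in range(n)))
    return list(best)

# Hypothetical similarities between 2 entities of graph A (rows)
# and 2 entities of graph B (columns).
STRUCT_SIM = [[1.0, 0.8], [0.9, 0.0]]
ATTR_SIM = [[0.8, 0.8], [0.8, 0.2]]

fused = fuse([STRUCT_SIM, ATTR_SIM], [0.5, 0.5])
match = one_to_one_match(fused)
```

In this toy case, picking the best column for each row independently would assign both rows to column 0, whereas the collective assignment pairs row 0 with column 1 and row 1 with column 0, which is the effect the one-to-one constraint is meant to capture.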

Other Abstract

Knowledge graphs represent facts as structured triples, which can effectively describe the semantic relationships between real-world entities and provide knowledge support for intelligent decision-making systems. In recent years, with the development of artificial intelligence and the emphasis on knowledge-guided machine learning and explainable artificial intelligence, many knowledge graphs have been constructed and successfully applied in fields such as semantic search and recommender systems. Knowledge graphs have become a key generic technology of the new generation of artificial intelligence, providing essential support for realizing machine cognitive intelligence and promoting the intelligent upgrading of industry.

There are often multiple knowledge graphs for the same domain, and they are complementary in the knowledge they cover. Effective collaboration among multi-source knowledge graphs helps improve the completeness of knowledge, providing better support for intelligent decision-making tasks. However, heterogeneous multi-source knowledge graphs often differ significantly in structure and attribute characteristics, so their knowledge collaboration still faces many problems and challenges in accuracy, flexibility, and efficiency. In addition, existing knowledge graphs lack knowledge of sparse and extreme scenarios because real data for such scenarios is scarce, making it difficult for multi-source knowledge collaboration to fully cover the knowledge required for the management and control of the physical system.

This thesis proposes to realize the extraction, fusion and completion, and application of multi-source heterogeneous knowledge through the descriptive learning, predictive learning, and prescriptive learning processes of parallel learning theory, respectively, thereby achieving efficient collaboration of multi-source heterogeneous knowledge. To evaluate the effectiveness of the proposed methods, simulation experiments are conducted on power grid dispatching, a typical management and control problem of complex systems. The main work of this thesis is summarized as follows.

1. A human decision-making behavior modeling method based on serious games and decision trees is proposed. To address the lack of knowledge of sparse and extreme scenarios, this thesis proposes to develop artificial systems through descriptive learning and to generate artificial data from which knowledge about those scenarios is extracted. To overcome the difficulty of modeling human decision-making behavior in the construction of artificial systems, this thesis presents a human decision-making behavior modeling method based on serious games and decision trees. This method collects human decision data in extreme scenarios through virtual games. Then, key features are extracted through correlation analysis to simulate the way humans make decisions based on critical elements, and a random forest model is built on these features to synthesize virtual data that expands the training samples. Finally, an interpretable decision tree is built on the expanded data as the human decision-making behavior model. To explicitly represent the management and control knowledge of the complex system, this thesis uses heterogeneous information networks to organize the real management and control records of the actual system and the virtual data generated by artificial systems into knowledge graphs. Structured representation and storage of system management and control knowledge are realized by organizing the transition relationships between system states and control actions into triples.

2. A collective entity alignment method based on information fusion and an unsupervised entity alignment method based on matching different embedding spaces are proposed. To complete the missing knowledge, this thesis proposes to conduct computational experiments on the corresponding scenarios through predictive learning based on existing knowledge, and to fuse multi-source knowledge graphs by merging equivalent entities through entity alignment, which improves the efficiency of computational experiments by forming a unified knowledge base. To integrate the explicit structural, relational, and attribute information effectively and to model the one-to-one constraint for higher alignment accuracy, this thesis proposes a collective entity alignment method based on information fusion. This method represents the above three kinds of information as vectors and then computes overall entity similarities by fusing the similarities of the different kinds of information with a co-attention mechanism. Finally, equivalent entities are collectively assigned through integer programming. To reduce the dependence of entity alignment on alignment seeds and attribute information, this thesis proposes an unsupervised entity alignment method based on matching different embedding spaces. This method learns entities’ structural embeddings with graph neural networks, obtains vector space transformation matrices by solving a mathematical problem equivalent to unsupervised entity alignment, and uses a cycle-consistent generative adversarial network to optimize these matrices so that the embedding vectors of different knowledge graphs are mapped into the same vector space. Finally, equivalent entities are collectively assigned according to the similarities between the mapped vectors.

3. A collaborative knowledge reasoning method based on reinforcement learning is proposed. This thesis proposes to complete implicit knowledge by fusing the complementary knowledge of multi-source knowledge graphs, which further reduces the workload of computational experiments, and proposes a collaborative knowledge reasoning method based on reinforcement learning. This method models the complementary knowledge in multi-source knowledge graphs with cross-knowledge-graph equivalent relation paths and uses reinforcement learning agents to infer equivalent relation paths for the relation type to be reasoned, which effectively improves the accuracy of knowledge reasoning. To overcome the challenges that differences in the structure and features of multi-source knowledge graphs and the growing number of entities and relations pose for agent training, this thesis proposes to use entity alignment to learn representation vectors in the same vector space for different knowledge graphs. In addition, a novel policy network based on long short-term memory and hierarchical graph attention is designed, and training mechanisms including action masking and retraining on sampled paths are proposed. These designs effectively improve the success rate of equivalent path deduction.

4. A graph transfer learning method based on representation learning and a power grid dispatching scheme recommendation method based on the shortest path are proposed. To achieve the unbiased application of structured knowledge graphs in the physical system, this thesis proposes to match the knowledge most related to the actual system’s current state through prescriptive learning, thereby optimizing the parallel execution process, and proposes a graph transfer learning method based on representation learning. This method learns domain-adaptive and label-discriminative node embeddings using an optimal transport distance loss and a classification loss, which effectively improves the prediction accuracy on the target network of the node classification model learned from the source network. To support decision-making for power grid dispatching based on the knowledge graph, this thesis proposes a power grid dispatching scheme recommendation method based on the shortest path, which can directly generate dispatching instructions from the power grid dispatching knowledge graph. The resulting schemes both meet the dispatching requirements and reduce cost.
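
As a rough illustration of the shortest-path idea in item 4, the sketch below treats system states as nodes and control actions as weighted edges, then runs Dijkstra's algorithm to find the cheapest action sequence from the current state to a target state. The abstract does not specify the graph schema, the cost definition, or the search algorithm, so the state names, action names, and costs in `GRID_KG` are hypothetical, and Dijkstra is an assumed choice.

```python
import heapq

def recommend_actions(graph, start, goal):
    """Return (total_cost, [actions]) for the cheapest path, or None.

    graph maps a state to a list of (action, next_state, cost) edges,
    i.e. (state, action, next_state) triples annotated with a cost.
    """
    frontier = [(0.0, start, [])]  # (cost so far, state, actions taken)
    best = {start: 0.0}            # cheapest known cost per state
    while frontier:
        cost, state, actions = heapq.heappop(frontier)
        if state == goal:
            return cost, actions
        if cost > best.get(state, float("inf")):
            continue  # stale queue entry; a cheaper route was found
        for action, nxt, step_cost in graph.get(state, []):
            new_cost = cost + step_cost
            if new_cost < best.get(nxt, float("inf")):
                best[nxt] = new_cost
                heapq.heappush(frontier, (new_cost, nxt, actions + [action]))
    return None

# Hypothetical toy knowledge graph: states, actions, and costs are made up.
GRID_KG = {
    "overload": [("shed_load", "normal", 5.0),
                 ("redispatch_gen", "reduced_margin", 1.0)],
    "reduced_margin": [("switch_line", "normal", 2.0)],
}

plan = recommend_actions(GRID_KG, "overload", "normal")
```

Here the two-step route (`redispatch_gen`, then `switch_line`) costs 3.0 and is preferred over the single-step `shed_load` at 5.0, which mirrors the claim that a recommended scheme can satisfy the dispatching goal while lowering cost.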

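Subject Area: Artificial Intelligence; Knowledge Engineering
MOST Discipline Catalogue: Engineering; Engineering::Computer Science and Technology (degrees in engineering or science may be conferred)
Pages: 130
Language: Chinese
Document Type: Thesis
Identifier: http://ir.ia.ac.cn/handle/173211/48713
Collection: Graduates_Doctoral Theses (毕业生_博士学位论文)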
Recommended Citation
GB/T 7714
杨林瑶. 基于平行学习的多源异构知识协同方法与应用研究[D]. 中国科学院自动化研究所. 中国科学院自动化研究所,2022.
Files in This Item:
File Name/Size | DocType | Version | Access | License
大论文终稿.pdf (42420KB) | Thesis | | Restricted access | CC BY-NC-SA

Items in the repository are protected by copyright, with all rights reserved, unless otherwise indicated.