Research on Temporal Knowledge Modeling and Reasoning Methods


	Research on Temporal Knowledge Modeling and Reasoning Methods
	Shao Pengpeng (邵朋朋)
	2023-05-23
Pages	126
Degree type	Doctoral
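Chinese abstract	With the continuous development of Internet technology, massive amounts of data pour into public view, especially on information exchange and retrieval platforms such as Weibo, WeChat, Baidu, and Google, and this data contains a great deal of valuable knowledge. Against this background, knowledge graphs emerged as a way to manage this knowledge effectively and to mine and exploit it in depth. A knowledge graph is a graph-structured knowledge base that describes the entities and concepts of the real world and the relations among them. It expresses events as triples, a form that makes the objective world easier to understand, and it has important applications in downstream tasks such as recommender systems, intelligent question answering, and information retrieval. Although knowledge graphs have clear advantages for managing knowledge, as more data is stored the data becomes increasingly dynamic and temporal, and the correctness of knowledge changes over time. For example, (Trump, President, United States) is correct from 2017 to 2021 but incorrect in 2022. Because knowledge graphs ignore this important temporal information, they cannot meet the demands of managing dynamic big data in the real world. Temporal knowledge graphs address this problem by adding temporal information to knowledge graph triples, which both preserves the correctness of knowledge and inherits the strengths of knowledge graphs in knowledge modeling and management. Temporal knowledge graphs have therefore come to prominence, and knowledge modeling and reasoning over them have become active research topics with broad applications in big data analysis. This thesis takes temporal knowledge graphs as its research object and studies knowledge modeling and reasoning, with the aim of uncovering the hidden knowledge and evolutionary patterns in the massive temporal data of Internet platforms and of inferring and predicting facts that may occur. Such reasoning has important application value in scenarios such as target behavior profile prediction, intelligence analysis, intelligent question answering, and epidemic prediction. The thesis addresses four problems in current reasoning methods: global knowledge modeling, analysis of the correlation between historical knowledge and queries, missing relevant historical information, and the uncertainty of time reasoning. It studies them under both temporal knowledge graph reasoning settings, interpolation and extrapolation. The research content and main contributions are as follows:
(1) Temporal knowledge reasoning based on Tucker decomposition. To model both the static and the dynamic knowledge present in temporal data, this study proposes a knowledge modeling and reasoning model based on Tucker decomposition. To incorporate prior information about time into the model, it proposes three temporal regularization methods that constrain the learned time representations so that adjacent timestamps have similar representations. In addition, because the Tucker decomposition model has a large number of parameters and is prone to overfitting, it proposes two representation regularization methods to assist training of the base model. Experimental results show that the proposed method performs well on temporal knowledge reasoning tasks.
(2) Temporal knowledge reasoning based on a hierarchical graph attention network. To compute the correlation between complex historical events and a query, this study approaches the problem from the perspectives of time and semantics and proposes a temporal knowledge reasoning model based on a hierarchical graph attention network. From the time perspective, the thesis proposes that the closer a historical event is in time to the event being inferred, the greater its influence on the query. This idea raises a problem: events close in time may be irrelevant to the query, while events far away in time may influence it strongly. To resolve this, from the semantic perspective the thesis proposes that the closer a related historical event is to the query in semantics, the greater its influence on the query. The models designed from the two perspectives compensate for each other's weaknesses in correlation calculation. Finally, an aggregation mechanism fuses the two models into a more reasonable model of the correlation between complex historical events and queries. This model is then integrated into a graph neural network to learn entity and relation representations, and the decoder from study (1) is used for reasoning. Experimental results show that the designed method performs well in temporal knowledge reasoning.
(3) Temporal knowledge reasoning based on an adaptive pseudo-Siamese policy network. This study addresses the lack of historical information related to the event being inferred (the query) at reasoning time. Because the head entity is the subject of an event, when no directly related historical information can be found from the head entity, the answer to the query cannot be inferred from the observable but unrelated history. To solve this problem, the thesis proposes finding indirectly related events in the history through the relation in the query. The relation in a quadruple reflects the semantics of the event, so historical events that share the query's relation have similar semantics. Based on these observations, the thesis uses the relation to find indirectly related historical events for such queries and thereby obtains a selection of semantic actions. To optimize the answers for queries with and without historical information, a pseudo-Siamese network handles the two cases separately within a unified framework. Experimental results show that, compared with current baseline models, the proposed model achieves the best performance in both cases.
(4) Temporal knowledge reasoning based on a Bayesian hypernetwork collaborating with a time-difference evolutional network. This study addresses the uncertainty of event time reasoning by designing the model at two points: the encoding of entities and relations, and the decoding for time reasoning. Specifically, the encoder integrates temporal information into the entity and relation representations, and the decoder uses a Bayesian hypernetwork to model the uncertainty of time reasoning. As a result, the model performs entity, relation, and time reasoning in a single framework and substantially improves the accuracy of time reasoning. Experimental results show that, compared with current mainstream time reasoning models, the proposed model is considerably more accurate at time reasoning. The methods and conclusions of this thesis offer useful guidance for further extending the application of temporal knowledge graphs in practice.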
English abstract	With the continuous development of Internet technology, massive amounts of data and information flood into public view, especially on information exchange and retrieval platforms such as Weibo, WeChat, Baidu, and Google, and this data contains abundant valuable knowledge. In this context, the knowledge graph emerged as a way to manage, mine, and exploit this knowledge effectively. A knowledge graph is a graph-structured knowledge base that describes the entities and concepts of the real world and the relations between them. It expresses events as triples, a form that makes the objective world easier to understand, and it has important applications in downstream tasks such as recommendation systems, intelligent question answering, and information retrieval. Although knowledge graphs have great advantages in managing knowledge, as more data is stored the data becomes increasingly dynamic and temporal, and the correctness of knowledge changes over time. For example, (Trump, President, United States) is true for 2017-2021 and false for 2022. Knowledge graphs ignore this important temporal information and therefore cannot meet the needs of managing dynamic big data in the real world. To solve this problem, temporal knowledge graphs add temporal information to knowledge graph triples, which not only ensures the correctness of knowledge but also inherits the advantages of knowledge graphs in knowledge modeling and management. As a result, temporal knowledge graphs have come to prominence. Correspondingly, knowledge modeling and reasoning on temporal knowledge graphs have become active research topics with broad applications in big data analysis.
This thesis takes the temporal knowledge graph as its research object and studies knowledge modeling and reasoning, in order to mine the hidden knowledge and evolutionary patterns in the massive temporal data of Internet platforms and to infer and predict facts that may occur. This has important application value in scenarios such as target behavior profile prediction, intelligence analysis, intelligent question answering, and epidemic prediction. The thesis studies temporal knowledge graph reasoning under both the interpolation and the extrapolation settings, and addresses four problems: global knowledge modeling, analysis of the correlation between historical knowledge and queries, lack of related historical information, and the uncertainty of time reasoning. The research content and main contributions can be summarized as follows:
(1) Temporal knowledge reasoning based on Tucker decomposition. This study uses a Tucker decomposition model to model static and temporal knowledge for temporal knowledge reasoning. To integrate prior information about time into the base model, it presents three temporal regularization methods that constrain the learned time representations so that adjacent timestamps have similar representations. In addition, since the core tensor in the Tucker decomposition model is a three-dimensional tensor with a large number of parameters, it proposes two embedding regularization methods that assist training of the base model and guard against overfitting. Experimental results show that the proposed model achieves good performance on temporal knowledge reasoning tasks.
(2) Temporal knowledge reasoning based on a hierarchical graph attention network. This study addresses the problem of calculating the correlation between complex historical events and queries, and proposes a temporal knowledge reasoning method based on a hierarchical graph attention network that considers both time and semantics. First, from the perspective of time, it proposes that the closer a historical event is to the query in time, the greater its impact on the query. A problem with this assumption is that events close in time may be irrelevant to the query, while events far away in time may have a great impact on it. To solve this problem, from the perspective of semantics, it proposes that related historical events closer to the query in semantics have a greater impact on the query. The models designed from the two perspectives compensate for each other's deficiencies in correlation calculation. Finally, an aggregation mechanism integrates the two models, yielding a more reasonable correlation calculation model for complex historical events and queries. Experimental results show that the designed method achieves good results in temporal knowledge reasoning.
(3) Temporal knowledge reasoning based on an adaptive pseudo-Siamese policy network. This study addresses the lack of historical information related to the query and makes the following observations on temporal data. The head entity is the subject of the event, so when no directly related historical information can be found from the head entity, indirectly related events can instead be found among the historical events through the relation in the query. Because the relation of a quadruple reflects the semantic information of the event, historical events that share the query's relation have similar semantics. Following this observation, the relation is used to find indirectly relevant historical events for queries without historical information, thereby obtaining a selection of semantic actions. To optimize the answers to queries with historical information and queries without it, a pseudo-Siamese network is designed to handle the two cases in a unified framework. Experimental results show that the proposed model achieves state-of-the-art performance in both cases compared with the baseline models.
(4) Temporal knowledge reasoning based on a Bayesian hypernetwork collaborating with a time-difference evolutional network. This study addresses the uncertainty of event time reasoning and designs the reasoning model at two points: the encoding of entities and relations, and the decoding for time reasoning. Specifically, the model integrates temporal information into the entity and relation representations in the encoding phase, and uses a Bayesian hypernetwork to model the uncertainty of time reasoning in the decoding phase. This enables the model to perform entity, relation, and time reasoning in a unified framework and also greatly improves the accuracy of time reasoning. Experimental results show that, compared with current mainstream time reasoning models, the proposed model considerably improves the accuracy of time reasoning. The methods and conclusions of this thesis provide useful guidance for further expanding the application of temporal knowledge graphs in real life.
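Illustrative note on contribution (1): the abstract states that the temporal regularizers constrain time representations so that adjacent timestamps have similar representations, but it does not give their form. The sketch below shows one common way such a constraint can be written, a squared-difference smoothness penalty over consecutive time embeddings. The function name, the penalty form, and the toy embeddings are assumptions for illustration and are not taken from the thesis.

```python
from typing import Sequence


def temporal_smoothness_penalty(time_embeddings: Sequence[Sequence[float]]) -> float:
    """Sum of squared L2 distances between embeddings of adjacent timestamps.

    Assumed form for illustration only; the thesis describes three temporal
    regularizers whose exact definitions are not given in the abstract.
    """
    penalty = 0.0
    # Pair each timestamp embedding with the one that follows it.
    for prev, curr in zip(time_embeddings, time_embeddings[1:]):
        penalty += sum((c - p) ** 2 for p, c in zip(prev, curr))
    return penalty


# Hypothetical 2-dimensional embeddings for three consecutive timestamps.
smooth = [[0.0, 0.0], [0.1, 0.0], [0.2, 0.0]]  # drifts slowly over time
jumpy = [[0.0, 0.0], [1.0, 0.0], [0.0, 0.0]]   # changes abruptly

# The slowly drifting sequence is penalized far less than the abrupt one.
print(temporal_smoothness_penalty(smooth))
print(temporal_smoothness_penalty(jumpy))
```

In training, a term like this would typically be added to the task loss with a weighting coefficient, so that minimizing the total loss pulls neighboring time embeddings together.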
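Keywords	temporal knowledge graph, Tucker decomposition, correlation calculation, pseudo-Siamese policy network, evolutionary learning, knowledge reasoning
Language	Chinese
Seven major directions, sub-direction classification	Knowledge representation and reasoning
State Key Laboratory planned direction classification	Fundamental and frontier theory of artificial intelligence
Associated dataset requiring deposit	No
Document type	Thesis
Item identifier	http://ir.ia.ac.cn/handle/173211/52287
Collection	Graduates_Doctoral dissertations
Recommended citation (GB/T 7714)	Shao Pengpeng. Research on Temporal Knowledge Modeling and Reasoning Methods[D],2023.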

Files in this item
File name/size	Document type	Version type	Access type	License
Thesis.pdf (2403KB)	Thesis		Restricted access	CC BY-NC-SA