Automatic Construction of a Cognitive-Function-Oriented Brain Circuit Knowledge Graph (面向认知功能的脑区环路知识图谱自动构建)


	Zhu Hongyin (朱洪银)
	2017-05-23
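Chinese abstract (translated)	As research data in brain science grows rapidly, manual literature review and knowledge management alone can no longer keep up. On the one hand, given the large body of brain science literature and data accumulated over time, reading the literature and summarizing knowledge by hand is inefficient, and because the process is subjective, it is hard to arrive at a single standard that everyone accepts. On the other hand, manual summarization requires substantial expert knowledge, while many neuroscientists concentrate on relatively narrow research areas and cannot cover the whole field, so manual summarization is hard to scale. Yet the realization of cognitive functions is inseparable from the circuits formed by connections between brain regions, and studying how the brain circuits that implement a specific cognitive function are connected in different species helps us better understand how the brain works. The goal of this thesis is to automatically extract, from large-scale scientific literature, the brain circuits of specific cognitive functions in specific species, and to build a knowledge graph of brain circuits. On the one hand, this work can improve the efficiency of manual summarization by domain experts and yield a relatively complete brain circuit knowledge graph that helps neuroscientists better understand the brain; on the other hand, it can surface new knowledge and invite neuroscientists to verify it experimentally. Starting from the problem of insufficient training data in brain science, the thesis combines domain dictionaries with semi-supervised methods for information extraction and makes the following contributions:
1. Organizing brain science knowledge by species. Knowledge in the Linked Brain Data (LBD) brain science knowledge graph is not organized by species. Because species differ considerably, much knowledge is species-specific, so extracted brain science knowledge cannot simply be mixed together; each piece of knowledge has its own scope of applicability. To address this, the thesis uses information extraction to obtain, directly from the literature, the species to which each piece of knowledge applies, and builds a brain science knowledge engine on that basis. Where species information cannot be extracted directly, machine learning methods are used to infer it as a supplement.
2. A semi-supervised strategy for extracting brain circuit knowledge. Cognitive functions depend on information-processing circuits formed by brain regions, so studying these circuits is important for understanding cognitive function. However, little training data can be used directly for this problem. To address the shortage, we propose a semi-supervised brain circuit extraction strategy that, relying on only a few seeds, extracts relatively comprehensive brain circuit connections from large-scale literature.
3. A relation verification and inference method for knowledge graphs. Because the domain knowledge is extracted automatically from scientific literature, uncertainty is unavoidable: people's understanding of the brain may be inconsistent, and current automatic knowledge extraction techniques have limitations. To improve the credibility of the knowledge, we propose a relation verification and inference method for the brain knowledge graph that uses the graph's own topology to verify and infer the credibility of knowledge. The analysis also helps neuroscientists quickly discover key topics buried in large-scale knowledge.
Finally, the thesis describes the techniques used to build the brain science knowledge graph, covering every stage of brain circuit knowledge graph construction: data acquisition, data preprocessing, information extraction, ontology design, knowledge base construction, data publication, and knowledge graph querying.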
English abstract	With the growth of research data in brain science, manual literature review and knowledge management can no longer meet demand. On the one hand, given the massive body of brain science literature and data accumulated over the long run, summarizing knowledge by reading the literature manually is very inefficient, and subjectivity makes it difficult to reach a unified standard that everyone agrees with. On the other hand, manual summarization requires substantial expert knowledge, while many neuroscientists tend to focus on narrow research fields and cannot take all the knowledge into account, so the manual approach is difficult to scale. Because cognitive functions and brain circuits are closely related, studying the connections between brain regions that underlie different cognitive functions in various species helps researchers better understand the brain. This thesis aims to automatically extract knowledge about the brain circuits of different cognitive functions in various species from large-scale scientific literature, and to build a cognitive-function-centric brain circuit knowledge graph. On the one hand, this can improve the efficiency of manual summarization by domain experts and yield a relatively complete brain circuit knowledge graph that helps neuroscientists better understand the brain; on the other hand, it can surface new knowledge and invite neuroscientists to test that knowledge through experiments. The thesis focuses on the problem of insufficient training data in brain science and combines a domain dictionary with a semi-supervised method to extract relations from text.
1. Organizing the knowledge from a species perspective. The knowledge in the LBD brain knowledge graph was not organized by species. Extracted brain science knowledge cannot simply be mixed together, because each fact has its own scope of applicability. To solve this problem, we directly extract the species information of each fact from the literature and build the Brain Knowledge Engine. For species information that cannot be extracted directly, we adopt machine learning methods as an auxiliary approach to infer it.
2. Semi-supervised method for relation extraction. Cognitive functions are closely related to brain circuits, so studying these circuits is important for understanding the brain. However, the training corpus that can be used directly for this problem is inadequate. To solve this problem, we propose a semi-supervised strategy for brain circuit extraction that extracts relatively complete brain circuits from large-scale literature while relying on only a few seeds.
3. Relation verification and inference on the brain knowledge graph. Since the domain knowledge is extracted automatically from the scientific literature, uncertainty is inevitable, and researchers may differ in their understanding of the brain. In addition, current automatic knowledge extraction techniques have certain limitations. To improve the reliability of the knowledge, we propose a relation verification and inference method that uses the topological structure of the knowledge graph to verify existing relations and infer potential relations. For knowledge analysis, it also helps neuroscientists quickly discover key topics that are buried among large numbers of topic vertices.
Finally, the thesis introduces the technology used to construct the brain knowledge graph, covering all aspects of brain circuit knowledge graph construction: data acquisition, data preprocessing, information extraction, ontology design, knowledge base construction, data dissemination, and querying of the knowledge graph.
Keywords	knowledge graph; brain circuit; relation extraction; cognitive function
Language	Chinese
Document type	Thesis
Item identifier	http://ir.ia.ac.cn/handle/173211/14713
Collection	Graduates_Master's Theses
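Recommended citation (GB/T 7714)	Zhu Hongyin. Automatic Construction of a Cognitive-Function-Oriented Brain Circuit Knowledge Graph [D]. Beijing: Graduate University of Chinese Academy of Sciences (中国科学院研究生院), 2017.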

Files in this item
File name/size	Document type	Version type	Access type	License
面向认知功能的脑区环路知识图谱自动构建 (1939 KB)	Thesis		Restricted access	CC BY-NC-SA