Evidence-Based Knowledge Graph Research for Public Health Management Decision-Making


	杨芸榕
	2022-05-22
Pages	94
Degree type	Master's
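Abstract (translated from Chinese)	In recent years, evidence-based public health decision-making, which relies on evidence to reach decisions, has become an important approach to policy making in public health. Since the outbreak of COVID-19 in particular, scientific literature, news, social media posts, and other text related to the pandemic have continued to emerge, laying a broad evidence base for evidence-based public health decision-making. Traditional evidence-based public health decision-making, however, generally relies on manual screening of evidence; in the era of digital intelligence, the sheer volume of raw evidence text calls for intelligent, automated techniques to mine, extract, and synthesize evidence. Against this background, this thesis explores automatic construction of evidence-based knowledge graphs from massive, multi-source public health evidence in order to support public health management decisions. The work has practical value for improving the efficiency of evidence acquisition in public health and for promoting the application of information extraction in evidence-based public health. The thesis builds evidence-based knowledge graphs from public health literature, public health microblogs, and epidemiological investigation reports. The main work and contributions are as follows.
1. Evidence-based knowledge graph from public health literature. During the COVID-19 pandemic, evaluating the effectiveness of non-pharmaceutical interventions was critical to epidemic prevention and control. To address this need, researchers published a large number of COVID-19 modeling studies early in the pandemic; these studies use modeling to simulate the effects of non-pharmaceutical interventions implemented in different countries and regions over different periods. Where randomized controlled trials are not feasible, such studies are regarded as a relatively reliable source of evidence for evidence-based public health decision-making. This thesis therefore annotates the abstracts of these studies to build a document-level information extraction dataset and, on that dataset, proposes a document-level information extraction method that incorporates a semi-supervised learning algorithm. Built on an advanced multi-task information extraction model, the method extracts document-level entities, relations, and coreferent entities, and reaches high accuracy with relatively little annotated data. Compared with the baseline model, it achieves a relative F1 improvement of 5.3% on entity extraction and 4.5% on relation extraction. To support evidence-based public health decision-making, the thesis also presents two case studies, a meta-analysis of lockdown effectiveness and a knowledge graph of the effects of non-pharmaceutical interventions, which validate the extraction method.
2. Evidence-based knowledge graph from public health microblogs. After epidemic prevention and control policies are implemented, text on social media reflects, to some extent, the outcomes of those policies and can inform policy formulation and adjustment. To examine the impact of the prevention and control measures implemented during the COVID-19 pandemic, this thesis annotates prevention and control policies and their outcome events occurring in the same sentence of public-health-related Weibo posts, building a causal relation extraction dataset. It then applies a span-based entity and relation extraction model to extract causal relations. Experiments show that the model, although structurally simple, performs well and suits this task. Under exact-match evaluation, the F1 scores for "impact", "prevention and control policy", and "prevention and control policy subtype" improve by 0.04, 0.09, and 0.09, respectively, and the F1 scores for "prevention and control policy-impact" and "prevention and control policy subtype-impact" both improve by 0.11. A case study of the non-pharmaceutical interventions implemented in China early in the pandemic shows that the evidence-based knowledge graph effectively presents the relationships between prevention and control policies and their impacts.
3. Evidence-based knowledge graph from epidemiological investigation reports. A transmission graph of confirmed cases generated from epidemiological investigation reports can effectively present the risk of virus transmission, clarify the infection path of each confirmed case, and identify the potentially infected population as well as the type and characteristics of each outbreak, so it matters greatly for epidemic prevention and control, especially for cluster outbreaks. Existing transmission graphs of confirmed cases follow no specific design rules and are poorly suited to evidence-based decision-making. This thesis therefore designs a concise and clear transmission graph structure based on cluster outbreak investigation guidelines and uses a rule-based information extraction method to automatically extract the attributes and social relationships of confirmed cases. A case study of a large cross-province cluster outbreak shows that the designed graph effectively presents transmission chains and contributes to epidemic prevention and control.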
Abstract (English)	In recent years, evidence-based public health decision-making has gradually become an important approach to policy making in public health because it relies on evidence. Especially since the outbreak of COVID-19, scientific literature, news, social media posts, and other text related to COVID-19 have been emerging constantly, laying a broad evidence base for evidence-based public health decision-making. However, traditional evidence-based public health decision-making generally screens evidence manually. In the era of digital intelligence, the mass of raw evidence text makes it necessary to adopt intelligent, automated techniques to mine, extract, and synthesize evidence. In this context, this thesis explores the automatic construction of evidence-based knowledge graphs from massive, multi-source public health evidence to support public health management decisions. The research has theoretical significance and application value for improving the efficiency of evidence acquisition in public health and for promoting the application of information extraction in evidence-based public health. The thesis builds evidence-based knowledge graphs from public health literature, microblogs, and epidemiological investigation reports. Its main work and innovations are summarized as follows.
1. Evidence-based Knowledge Graph based on Public Health Literature. During the COVID-19 outbreak, evaluating the effectiveness of non-pharmaceutical interventions was very important for epidemic prevention and control. To address this problem, researchers published a large number of modeling studies early in the outbreak. These studies use modeling to simulate the impact of non-pharmaceutical interventions in different countries and regions over different time periods. Where randomized controlled trials are not available, such literature is regarded as a relatively reliable source of evidence for evidence-based public health decisions. This thesis therefore annotates the abstracts of such literature to build a document-level information extraction dataset and, based on that dataset, proposes a document-level information extraction method that incorporates a semi-supervised learning algorithm. Built on an advanced multi-task information extraction model, the method can fully extract document-level entities, relations, and coreferent entities, and needs relatively little annotated data to achieve high accuracy. Compared with the baseline model, the proposed method achieves a relative F1 improvement of 5.3% on entity extraction and 4.5% on relation extraction. In addition, to support evidence-based public health decision-making, the thesis presents two case studies, a meta-analysis of lockdown effectiveness and a knowledge graph of the effects of non-pharmaceutical interventions, which verify the effectiveness of the extraction method.
2. Evidence-based Knowledge Graph based on Public Health Microblog. After epidemic prevention and control policies are implemented, text on social media reflects the results of those policies to a certain extent and can serve as a reference for policy formulation and adjustment. To explore the impact of the series of prevention and control measures implemented during the COVID-19 outbreak, this thesis annotates prevention and control policies and their result events within the same sentence of public-health-related Weibo posts to build a causal relation extraction dataset. A span-based entity and relation extraction model is then used for causal relation extraction. Experimental results show that, although the model has a simple structure, it performs well and suits this task scenario. Under exact-match evaluation, the F1 scores for "impact", "prevention and control policy", and "prevention and control policy subtype" increase by 0.04, 0.09, and 0.09, respectively, and the F1 scores for "prevention and control policy-impact" and "prevention and control policy subtype-impact" both increase by 0.11. A case study of the non-pharmaceutical intervention policies implemented in China in the early stage of COVID-19 shows that the evidence-based knowledge graph can effectively present the relationship between prevention and control policies and their impacts.
3. Evidence-based Knowledge Graph based on Epidemiological Survey Reports. A transmission graph of confirmed cases generated from epidemiological investigation reports can effectively represent the risk of virus transmission and identify the infection path of each confirmed case, the possibly infected population, and the type and characteristics of each outbreak. It is therefore of great significance for epidemic prevention and control, especially for cluster outbreaks. Because existing transmission graphs of confirmed cases follow no specific design rules and are not well suited to evidence-based decision-making, this thesis designs a concise and clear transmission graph structure according to cluster outbreak investigation guidelines and uses a rule-based information extraction method to automatically extract the attributes and social relationships of confirmed cases. A case study of a large-scale inter-provincial cluster outbreak shows that the designed graph can effectively represent the transmission chain and play a positive role in epidemic prevention and control.
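The abstract reports F1 in two ways: relative improvements (5.3% and 4.5%) for the literature task, and differences such as 0.04 or 0.11 under exact-match evaluation for the microblog task. The Python sketch below illustrates one common way such numbers are computed. It is an assumption for illustration only, not the thesis's evaluation code: the span representation, the label names, and the sample scores are hypothetical, and reading the 0.04 to 0.11 figures as absolute differences is also an assumption.

```python
from typing import Iterable, Tuple

# Hypothetical span representation: (start_token, end_token, label).
Span = Tuple[int, int, str]


def exact_match_prf(gold: Iterable[Span], pred: Iterable[Span]) -> Tuple[float, float, float]:
    """Precision, recall, and F1 where a prediction counts only if its
    boundaries and label both match a gold span exactly."""
    gold_set, pred_set = set(gold), set(pred)
    true_positives = len(gold_set & pred_set)
    precision = true_positives / len(pred_set) if pred_set else 0.0
    recall = true_positives / len(gold_set) if gold_set else 0.0
    if precision + recall == 0.0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


def absolute_gain(baseline_f1: float, new_f1: float) -> float:
    """Difference in F1 points, e.g. 0.60 -> 0.64 is a gain of 0.04."""
    return new_f1 - baseline_f1


def relative_gain(baseline_f1: float, new_f1: float) -> float:
    """Gain as a fraction of the baseline, e.g. 0.60 -> 0.63 is 5%."""
    return (new_f1 - baseline_f1) / baseline_f1


if __name__ == "__main__":
    # Illustrative annotations only; labels are hypothetical.
    gold = [(0, 2, "POLICY"), (5, 7, "IMPACT"), (9, 10, "POLICY_SUBTYPE")]
    # The second prediction has the right label but the wrong end boundary,
    # so it does not count under exact matching.
    pred = [(0, 2, "POLICY"), (5, 6, "IMPACT"), (9, 10, "POLICY_SUBTYPE")]
    precision, recall, f1 = exact_match_prf(gold, pred)
    print(f"P={precision:.3f} R={recall:.3f} F1={f1:.3f}")
    print(f"relative gain: {relative_gain(0.60, 0.63):.1%}")
    print(f"absolute gain: {absolute_gain(0.60, 0.64):.2f}")
```

The distinction matters when comparing the two sets of results: a relative gain of 5.3% over a baseline F1 of 0.60 would be roughly 0.03 F1 points, which is smaller than an absolute gain of 0.04.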
Keywords	COVID-19
Language	Chinese
Document type	Thesis
Item identifier	http://ir.ia.ac.cn/handle/173211/48935
Collection	Graduates_Master's Theses
Recommended citation (GB/T 7714)	杨芸榕. 面向公共卫生管理决策的知识图谱循证研究[D]. 中国科学院自动化研究所. 中国科学院自动化研究所,2022.

Files in this item
File name/size	Document type	Version type	Access type	License
yyr毕业大论文0610.pdf (3288KB)	Thesis		Restricted access	CC BY-NC-SA