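Research on an Intelligent Medical Question Answering System Driven by Both Data and Knowledge (基于数据与知识双驱动的智能医疗问答系统研究)
Li Wenbo (李文博)
2021-05-30
Pages: 75
Degree type: Master's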
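Abstract (translated from Chinese)

Hypertension is the most common chronic disease. According to the Annual Report on Cardiovascular Health and Disease in China, the number of people with hypertension in China had reached 245 million by 2019, and overall prevalence is still rising. Hypertension is also the most important risk factor for cardiovascular and cerebrovascular disease, and delayed treatment can even lead to sudden death. Yet ordinary patients often lack an adequate understanding of hypertension and have few channels for obtaining reliable information, particularly in remote areas and other regions with limited medical resources. To address these problems, this thesis takes a hypertension knowledge graph as its knowledge source, combines it with patients' electronic medical record data and real-time physiological indicator data from the Internet of Things, and develops a hypertension medical question answering system driven by both data and knowledge. The main work and contributions are summarized as follows:

1. An entity recognition method based on deep learning and a dictionary

Entity recognition is a key step both in building the knowledge graph and in understanding user questions. Given the small size of the hypertension entity recognition dataset, and based on an analysis of the data sources and the intended application, this thesis proposes a heuristic method that fuses a BERT-BiLSTM-CRF deep learning model with a bidirectional matching algorithm over a hypertension domain dictionary. Accuracy and recall reach 94.6 and 92.8, respectively, and comparative and ablation experiments confirm the effectiveness of the method.

2. Design and construction of a hypertension knowledge graph

To address the scarcity of research on hypertension knowledge graphs, this thesis adapts a general domain knowledge graph construction framework to the characteristics of hypertension knowledge, yielding a construction framework for a hypertension knowledge graph. Following Stanford University's seven-step method for building domain-specific ontologies, the hypertension schema layer was built under the guidance of medical experts. The hypertension data sources were then divided into structured and unstructured data to generate the knowledge graph, which is stored, managed, and visualized in the Neo4j graph database.

3. A hypertension medical question answering system driven by both data and knowledge

The system combines patients' electronic medical record data with real-time and historical physiological indicator data produced by IoT wearable devices. Entity recognition, entity disambiguation, and intent recognition give it a deep and comprehensive semantic understanding of patient questions. The required answers are then retrieved by querying the hypertension knowledge graph, yielding an intelligent hypertension question answering system that meets user needs.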

English abstract

Hypertension is the most common chronic disease. The Annual Report on Cardiovascular Health and Disease in China shows that the number of hypertensive patients in China had reached 245 million by 2019, and overall prevalence is still increasing. Hypertension is also the most important risk factor for cardiovascular and cerebrovascular diseases, and delayed treatment may even lead to sudden death. However, ordinary patients often lack an adequate understanding of hypertension and lack access to reliable information about it, especially in remote areas and other areas with limited medical resources. To address these problems, this thesis takes a hypertension knowledge graph as its knowledge source, combines it with patients' electronic medical record data and real-time physiological index data from the Internet of Things, and develops a hypertension medical question answering system driven by both data and knowledge. The main work of this thesis is summarized as follows:

1. An entity recognition method based on deep learning and a dictionary

Entity recognition is an important step in building a knowledge graph and in understanding user questions. To address the small size of the hypertension entity recognition dataset, and based on an analysis of the data sources and the intended application, this thesis proposes a heuristic method that combines a BERT-BiLSTM-CRF deep learning model with a bidirectional matching algorithm based on a hypertension domain dictionary. Accuracy and recall reach 94.6 and 92.8, respectively. Comparative and ablation experiments verify the effectiveness of the proposed method for hypertension named entity recognition.
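The abstract does not spell out the matching or fusion rules, so the following is only a minimal sketch of one common form of dictionary-based bidirectional matching: forward and backward maximum matching, keeping whichever result covers more characters. The function names, the tie-break rule, and the `merge_with_model` fusion rule (keep model spans, add non-overlapping dictionary spans) are assumptions made for illustration, not the thesis's actual heuristic.

```python
from typing import List, Set, Tuple

Span = Tuple[int, int, str]  # (start, end, text), end exclusive


def forward_max_match(text: str, lexicon: Set[str], max_len: int) -> List[Span]:
    """Scan left to right, taking the longest lexicon entry at each position."""
    spans, i = [], 0
    while i < len(text):
        for size in range(min(max_len, len(text) - i), 0, -1):
            if text[i:i + size] in lexicon:
                spans.append((i, i + size, text[i:i + size]))
                i += size
                break
        else:
            i += 1  # no entry starts here; skip one character
    return spans


def backward_max_match(text: str, lexicon: Set[str], max_len: int) -> List[Span]:
    """Scan right to left, taking the longest lexicon entry ending at each position."""
    spans, j = [], len(text)
    while j > 0:
        for size in range(min(max_len, j), 0, -1):
            if text[j - size:j] in lexicon:
                spans.append((j - size, j, text[j - size:j]))
                j -= size
                break
        else:
            j -= 1  # no entry ends here; skip one character
    return spans[::-1]


def bidirectional_match(text: str, lexicon: Set[str]) -> List[Span]:
    """Run both scans and keep the one with more coverage, then fewer spans."""
    max_len = max(map(len, lexicon), default=0)
    fwd = forward_max_match(text, lexicon, max_len)
    bwd = backward_max_match(text, lexicon, max_len)

    def score(spans: List[Span]) -> Tuple[int, int]:
        return (sum(end - start for start, end, _ in spans), -len(spans))

    return fwd if score(fwd) >= score(bwd) else bwd


def merge_with_model(model_spans: List[Span], dict_spans: List[Span]) -> List[Span]:
    """Assumed fusion rule: keep model spans, add dictionary spans that do not overlap."""
    merged = list(model_spans)
    for start, end, text in dict_spans:
        if all(end <= m_start or start >= m_end for m_start, m_end, _ in merged):
            merged.append((start, end, text))
    return sorted(merged)
```

In this sketch, `model_spans` stands in for the output of a sequence labeller such as BERT-BiLSTM-CRF. The dictionary pass mainly recovers entities the model missed, which is one plausible way a dictionary can help when the labelled dataset is small.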

2. Design and construction of a hypertension knowledge graph

To address the lack of research on hypertension knowledge graphs, this thesis builds a construction framework for a hypertension knowledge graph, based on a general domain knowledge graph construction framework and the characteristics of hypertension domain knowledge. Following Stanford University's seven-step method for domain-specific ontology construction, the hypertension schema layer is built under the guidance of medical experts. The hypertension data sources are then divided into structured and unstructured data to generate the knowledge graph, which is stored, managed, and visualized with the Neo4j graph database.
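As a rough illustration of the storage step, the sketch below turns extracted (head, relation, tail) triples into parameterized Cypher `MERGE` statements of the kind Neo4j accepts. The `Triple` type, the `name` property, and the labels and relation names in the usage example (`Disease`, `Symptom`, `HAS_SYMPTOM`) are hypothetical. The thesis's actual schema layer is not given in the abstract.

```python
import re
from typing import Dict, NamedTuple, Tuple


class Triple(NamedTuple):
    head: str        # entity name, e.g. a disease
    head_label: str  # node label for the head (hypothetical schema)
    relation: str    # relationship type
    tail: str        # entity name, e.g. a symptom
    tail_label: str  # node label for the tail


# Labels and relationship types cannot be passed as Cypher parameters,
# so they are interpolated only after a strict identifier check.
_IDENT = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")


def to_cypher(triple: Triple) -> Tuple[str, Dict[str, str]]:
    """Build an idempotent MERGE query plus its parameters for one triple."""
    for name in (triple.head_label, triple.relation, triple.tail_label):
        if not _IDENT.match(name):
            raise ValueError(f"unsafe identifier: {name!r}")
    query = (
        f"MERGE (h:{triple.head_label} {{name: $head}}) "
        f"MERGE (t:{triple.tail_label} {{name: $tail}}) "
        f"MERGE (h)-[:{triple.relation}]->(t)"
    )
    return query, {"head": triple.head, "tail": triple.tail}


if __name__ == "__main__":
    query, params = to_cypher(
        Triple("hypertension", "Disease", "HAS_SYMPTOM", "headache", "Symptom")
    )
    print(query)
    print(params)
```

Using `MERGE` rather than `CREATE` keeps repeated imports from duplicating nodes or edges, which matters when structured and unstructured sources mention the same entity. With the official Neo4j Python driver, each pair could be passed to a session as `session.run(query, params)`.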

3. A hypertension medical question answering system based on data and knowledge

The system combines patients' electronic medical record data with real-time and historical physiological index data generated by Internet of Things wearable devices to achieve a deep and comprehensive semantic understanding of patient questions. The required answers are then obtained by querying the constructed hypertension knowledge graph, resulting in an intelligent hypertension question answering system that meets the needs of users.
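The question answering flow can be pictured with a toy pipeline: find the entity in the question, normalise aliases, classify the intent, look up the answer in the graph, and attach the patient's latest reading. Everything below is an assumption for illustration: the in-memory `GRAPH` stands in for the Neo4j store, and the keyword rules, alias table, and sample facts are placeholders rather than the thesis's models or data.

```python
from typing import Dict, List, Optional, Tuple

# Toy stand-in for the knowledge graph: (entity, relation) -> answers.
# Placeholder content for illustration only, not medical advice.
GRAPH: Dict[Tuple[str, str], List[str]] = {
    ("hypertension", "HAS_SYMPTOM"): ["headache", "dizziness"],
    ("hypertension", "TREATED_BY"): ["nifedipine"],
}

# Hypothetical alias table: surface form -> canonical entity name.
ALIASES: Dict[str, str] = {"high blood pressure": "hypertension"}

# Hypothetical keyword rules standing in for a trained intent classifier.
INTENT_KEYWORDS: Dict[str, List[str]] = {
    "HAS_SYMPTOM": ["symptom"],
    "TREATED_BY": ["treat", "drug", "medicine"],
}


def recognize_entity(question: str) -> Optional[str]:
    """Return the canonical entity mentioned in the question, if any."""
    q = question.lower()
    for surface, canonical in ALIASES.items():
        if surface in q:
            return canonical
    for entity, _ in GRAPH:
        if entity in q:
            return entity
    return None


def recognize_intent(question: str) -> Optional[str]:
    """Map the question to a relation type by keyword lookup."""
    q = question.lower()
    for intent, keywords in INTENT_KEYWORDS.items():
        if any(word in q for word in keywords):
            return intent
    return None


def answer(question: str, latest_bp: Optional[Tuple[int, int]] = None) -> str:
    """Answer from the graph, optionally attaching the latest wearable reading."""
    entity = recognize_entity(question)
    intent = recognize_intent(question)
    if entity is None or intent is None:
        return "Sorry, I could not understand the question."
    facts = GRAPH.get((entity, intent), [])
    if not facts:
        return f"No answer found for {entity}."
    reply = f"{entity} / {intent}: " + ", ".join(facts)
    if latest_bp is not None:
        reply += f" (latest reading: {latest_bp[0]}/{latest_bp[1]} mmHg)"
    return reply
```

Keeping entity recognition, intent recognition, and graph lookup as separate functions means each stage can be swapped for a learned model or a real graph query without touching the others.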

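Keywords: hypertension; question answering system; knowledge graph; deep learning; named entity recognition
Language: Chinese
Seven major directions, sub-direction category: Artificial intelligence + healthcare
Document type: Thesis
Item identifier: http://ir.ia.ac.cn/handle/173211/44974
Collection: Research Center for Digital Content Technology and Services (数字内容技术与服务研究中心)
Recommended citation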
GB/T 7714
李文博. 基于数据与知识双驱动的智能医疗问答系统研究[D]. 中国科学院自动化研究所. 中国科学院自动化研究所,2021.
Files in this item
File name/size: 李文博-终版6.7.pdf (2865KB)
Document type: Thesis
Access type: Open access
License: CC BY-NC-SA
