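Alternative Title: Pattern Mining Based on Biological Network
Thesis Advisor: 杨一平
Degree Grantor: Graduate University of Chinese Academy of Sciences (中国科学院研究生院)
Place of Conferral: Institute of Automation, Chinese Academy of Sciences (中国科学院自动化研究所)
Degree Discipline: Computer Application Technology
Keyword: Artificial Intelligence; Data Mining; Feature Selection; Biological Network; Zheng (Syndrome) Pattern; Diagnostic Inquiry Model; Phenotype Pattern; Phenotype; Gene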
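Abstract: Pattern recognition is an important research area within artificial intelligence; applying data mining methods to the pattern recognition problem of a given object of study is called pattern mining. Traditional Chinese Medicine (TCM) has a complete theoretical foundation and a mature system of methods for diagnosing and treating coronary heart disease. However, its basic concepts, theories and methods are built on the ancient Chinese philosophy of yin-yang and the Eight Trigrams, so descriptions of Zheng (syndrome) patterns are hard to interpret and quantify, and the knowledge exists in unstructured form. Taking the biological network formed by entities such as Zheng, phenotypes and genes as its basis, this thesis focuses on the associations between related TCM and Western-medicine concepts in the diagnosis and treatment of coronary heart disease. Using feature selection, relation extraction and related methods, it constructs Zheng-phenotype and Zheng-gene association patterns, studies the corresponding relation extraction and computation methods in depth, and obtains some results. The main work is divided into the following parts:

I. Pattern mining based on feature selection
· Constructing Zheng-phenotype patterns for coronary heart disease with feature selection: When building disease Zheng-phenotype patterns, the phenotypes (features) depend on and influence one another, so common feature selection methods no longer apply. This thesis proposes an improved Markov Blanket algorithm to analyze the association between TCM Zheng and phenotypes, determine the set of phenotypes related to each Zheng, and construct Zheng-phenotype association patterns.
· Constructing a Zheng inquiry model with classification algorithms: Once the Zheng-phenotype patterns are determined, and taking coronary heart disease as the example, Zheng classifiers are built with neural networks, support vector machines, decision trees and Bayesian networks to determine the Zheng diagnosis for given case data, thereby realizing the Zheng inquiry model.

II. Pattern mining based on text mining and inference networks
· Constructing phenotype-gene patterns with text mining: The OMIM database is carefully curated, promptly updated and highly reliable. Text mining is used to extract the phenotype-gene associations implicit in it, and these associations serve as the basis for constructing phenotype-gene association patterns.
· Mining latent association patterns with a label propagation algorithm: OMIM covers a limited number of phenotypes and genes. For phenotype-gene relationships not included in it, this thesis uses the topological information among genes implicit in the protein interaction network and applies a network label propagation algorithm to mine latent phenotype-gene association patterns; the corresponding algorithm is proposed and prediction results are given.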
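The abstract does not spell out the improved Markov Blanket algorithm, so the following is only a generic grow-shrink sketch of the underlying idea, not the thesis's method. It assumes discrete features, scores dependence with conditional mutual information, and uses an arbitrary cutoff (`threshold`) in place of a proper independence test. The feature names (`chest_pain`, `fatigue`, `pale_tongue`) and the toy records are hypothetical.

```python
from collections import Counter
from math import log


def conditional_mutual_information(rows, x, y, given):
    """Estimate I(x; y | given) in nats from discrete records (list of dicts)."""
    n = len(rows)
    joint, xz, yz, zc = Counter(), Counter(), Counter(), Counter()
    for r in rows:
        z = tuple(r[g] for g in given)
        joint[(z, r[x], r[y])] += 1
        xz[(z, r[x])] += 1
        yz[(z, r[y])] += 1
        zc[z] += 1
    # sum over observed cells of p(x,y,z) * log( p(z) p(x,y,z) / (p(x,z) p(y,z)) )
    return sum(
        c / n * log(c * zc[z] / (xz[(z, xv)] * yz[(z, yv)]))
        for (z, xv, yv), c in joint.items()
    )


def markov_blanket(rows, target, features, threshold=0.02):
    """Grow-shrink selection; `threshold` is an assumed cutoff, not from the thesis."""
    blanket = []
    # Grow: add the feature most informative about the target given the blanket so far.
    while True:
        candidates = [f for f in features if f not in blanket]
        if not candidates:
            break
        best = max(
            candidates,
            key=lambda f: conditional_mutual_information(rows, f, target, blanket),
        )
        if conditional_mutual_information(rows, best, target, blanket) <= threshold:
            break
        blanket.append(best)
    # Shrink: drop features that carry no extra information given the others.
    for f in list(blanket):
        rest = [g for g in blanket if g != f]
        if conditional_mutual_information(rows, f, target, rest) <= threshold:
            blanket.remove(f)
    return blanket


# Hypothetical toy records: the Zheng label follows chest_pain exactly,
# fatigue agrees with chest_pain in 3 of 4 cases, pale_tongue is unrelated.
rows = [
    {"chest_pain": p, "fatigue": f, "pale_tongue": t, "zheng": p}
    for p in (0, 1)
    for f in (p, p, p, 1 - p)
    for t in (0, 1)
]
selected = markov_blanket(rows, "zheng", ["chest_pain", "fatigue", "pale_tongue"])
print(selected)  # ['chest_pain']
```

On this toy data a univariate filter would also keep `fatigue`, because it correlates with the Zheng label on its own; conditioning on `chest_pain` shows it adds nothing, which is the kind of inter-feature dependence the abstract refers to.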
Other Abstract: Artificial Intelligence (AI) theory has been studied extensively and applied successfully to extracting relationships between many kinds of entities. Traditional Chinese Medicine (TCM) and Western medicine each have their own theoretical basis and well-developed systems for disease diagnosis and therapy, but some TCM concepts are rooted in the philosophy of ancient China, which makes them difficult to interpret and to quantify. This thesis focuses on the relationships within the biological network of Zheng, phenotypes and genes, with the aim of deriving Zheng-phenotype and phenotype-gene patterns. We propose two algorithms for problems that arise in pattern generation and present the results obtained with them. The main work comprises:

I. Pattern mining based on feature selection
· Zheng-phenotype pattern mining based on feature selection: Phenotypes depend on and influence one another, which is a major obstacle in constructing Zheng-phenotype patterns, so common feature selection algorithms cannot be used here. We propose an improved feature selection algorithm based on the Markov Blanket and use it to analyze the correlation between Zheng and phenotypes, compute the feature subset relevant to each Zheng, and generate Zheng-phenotype patterns.
· Construction of a diagnostic model based on classification: Based on the Zheng-phenotype patterns, we trained six classifiers (Bayesian network, naive Bayes, logistic regression, support vector machine (SVM), k-nearest neighbor (KNN) and decision tree) and present their classification results on new patients' records.

II. Pattern mining based on text mining and inference network
· Construction of phenotype-gene patterns based on text mining: The records of Online Mendelian Inheritance in Man (OMIM) are manually maintained by experts in the field and are highly reliable, so we mined them for relationships between phenotypes and genes. The relationships mined from OMIM serve as the foundation of the phenotype-gene patterns.
· Mining implied patterns using a Label Propagation algorithm: The patterns mined from OMIM cover only a small part of the phenotypes and genes. For the remaining phenotypes and genes, we propose a Label Propagation algorithm based on the topology of the protein-protein interaction (PPI) network to generate phenotype-gene patterns.
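The label propagation step can be illustrated with a minimal sketch. The abstract does not give the thesis's exact update rule, so this assumes a common formulation: scores are repeatedly averaged over neighbors in the PPI graph with symmetric degree normalization and mixed back with the seed labels. The gene identifiers (`G1` to `G6`), the toy graph, and the parameters `alpha` and `iterations` are all hypothetical.

```python
def propagate_labels(adjacency, seeds, alpha=0.8, iterations=50):
    """Iterate F <- alpha * S * F + (1 - alpha) * Y on an undirected graph.

    adjacency: dict mapping each gene to the set of genes it interacts with
    seeds:     genes already known to be associated with the phenotype
    S uses symmetric normalization: S[i][j] = 1 / sqrt(deg(i) * deg(j)).
    """
    genes = sorted(adjacency)
    degree = {g: len(adjacency[g]) for g in genes}
    prior = {g: 1.0 if g in seeds else 0.0 for g in genes}
    score = dict(prior)
    for _ in range(iterations):
        updated = {}
        for g in genes:
            # Evidence arriving from each interaction partner of g.
            spread = sum(
                score[n] / (degree[g] * degree[n]) ** 0.5 for n in adjacency[g]
            )
            updated[g] = alpha * spread + (1 - alpha) * prior[g]
        score = updated
    return score


# Hypothetical PPI graph: a chain G1-G2-G3-G4 plus a separate pair G5-G6.
ppi = {
    "G1": {"G2"},
    "G2": {"G1", "G3"},
    "G3": {"G2", "G4"},
    "G4": {"G3"},
    "G5": {"G6"},
    "G6": {"G5"},
}
# Assume G1 is the only gene with a known link to the phenotype (e.g. from OMIM).
scores = propagate_labels(ppi, seeds={"G1"})
ranking = sorted(scores, key=scores.get, reverse=True)
print(ranking)
```

Genes closer to the seed in the network receive higher scores, and genes in a component with no seed stay at zero, so the ranking can be read as a list of candidate phenotype-gene associations for the phenotype in question.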
Other Identifier: 200928014629091
Document Type: Thesis
Recommended Citation
GB/T 7714
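左晓晗. Research on Association Pattern Mining Methods Based on Biological Networks (基于生物网络的关联模式挖掘方法研究)[D]. Institute of Automation, Chinese Academy of Sciences. Graduate University of Chinese Academy of Sciences, 2012.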
Files in This Item:
File Name/Size | DocType | Version | Access | License
CASIA_20092801462909 (1976KB) | | | Not yet open | CC BY-NC-SA