基于人脑转录组的基因相互作用结构研究 (Research on the Structure of Gene Interactions Based on the Human Brain Transcriptome)
华娇娇
2021-05-24
Pages: 100
Degree type: Doctoral
Chinese abstract

 

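The human brain contains hundreds of anatomically and functionally distinct regions, each composed of billions of heterogeneous cells, so the expression pattern of the whole human brain transcriptome is extremely complex. At the microscopic level, the expression of thousands of individual genes determines the genome-wide expression pattern through complex interactions. At the macroscopic level, interactions among the genome-wide expression patterns of hundreds of brain regions give rise to the transcriptional signature of the whole brain. At each level there may be pairwise, triplet, quadruplet and higher-order interactions among genes or brain regions, so for a system of n genes or n regions the complexity of the expression-profile network is O(2^n). This very high complexity limits the use of traditional correlation methods. The structure of the transcriptome expression pattern of a single region and that of the whole brain network both remain unclear, which hinders understanding of how gene expression regulates brain structure, function, development and disease. Finding a simple quantitative framework that characterizes the structure of the genome-wide expression pattern of a single region and the expression profile of the whole brain network is therefore of great significance. To this end, this thesis combines analysis and modeling of human brain transcriptome data to study the interaction structure that determines gene expression patterns, which greatly simplifies the framework for analyzing expression patterns, and on that basis explores the application of the framework to the study of brain disease-related genes.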

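First, this thesis finds that second-order gene interactions determine the hierarchical transcription patterns in the human brain. The human brain is a complex system whose structure and function are finely regulated by many genes, and the expression of these genes interacts in intricate ways, yet the basic rules of gene-gene interaction and their relationship to brain structure and function remain unclear. By analyzing transcription data for more than 10,000 genes at thousands of sampling sites in the human brain, this thesis finds that pairwise (second-order) interactions of gene expression can predict the transcription patterns of both a single brain region and the whole brain network, indicating that gene transcription patterns in the human brain are dominated by second-order interactions. This finding greatly reduces the complexity of the gene expression network, from O(2^n) to O(n^2). The study also reveals a possibly deep relationship between gene expression and the global properties of the brain network: the strength of gene interactions in the transcriptome data yields a nearly maximal number of brain regions when regions are clustered by their transcriptional properties, suggesting that evolution may have selected this interaction strength to achieve rich brain structure and function.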

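Further, based on a measure of population coupling in transcription, this thesis discovers new organizational patterns and structures at the gene and region levels. Through further analysis of whole-brain transcriptome data, it finds that the correlation between the expression pattern of a single gene or region and the expression pattern of the whole (the genome or the brain network), termed population coupling, can parsimoniously characterize the second-order interactions of gene expression, which further reduces the complexity of the interaction structure from O(n^2) to O(n). According to their population coupling, genes can be divided into strongly coupled "choristers" and weakly coupled "soloists", and the two groups have different biological functions. At the region level, population coupling of gene expression shows clear spatial clustering and region specificity, suggesting a previously unreported scheme for parcellating the human brain.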

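Finally, using the population coupling of gene expression proposed above, this thesis identifies genes whose interaction patterns change significantly in different brain diseases. Differentially expressed genes have been widely used to analyze the pathogenesis of human brain diseases, but such analysis cannot identify genes whose expression level does not change significantly yet whose role in the transcription network changes substantially (for example, from "chorister" to "soloist"). Taking brain diseases such as Alzheimer's disease (AD) as examples, analysis of RNA microarray datasets shows that the population coupling of many genes differs between the disease group and the normal control group. This thesis names these genes role alteration genes (RAGs) and finds that only a small fraction of RAGs change significantly in expression level. Gene function analysis clarifies the biological significance that these RAGs may have. This work shows that, during the onset and progression of brain diseases, the way some genes interact within the expression network changes in addition to changes in expression level, providing an important new perspective for brain disease research.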

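In summary, this thesis uses parametric models to analyze human brain transcriptome data in depth, identifies important factors that determine the structure of gene interactions, and proposes a series of dimensionality reduction methods that minimize information loss. These methods greatly reduce the complexity of analyzing gene expression patterns and provide an important theoretical basis and analysis tools for subsequent research. By applying this analysis framework to the study of brain disease-related genes, the thesis also provides an initial validation of its practical value. We believe that a deeper understanding of the structure of gene interactions will greatly advance future research on how genetic information determines the structure, function, development and abnormality of brain networks.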

 

English abstract

The human brain contains hundreds of anatomically and functionally distinct regions, each composed of billions of heterogeneous cells, so the expression pattern of the whole-brain transcriptome is extremely complex. At the microscopic level, the expression of thousands of individual genes determines the genome-wide expression pattern through complex interactions. At the macroscopic level, interactions among the genome-wide expression patterns of hundreds of brain regions produce the transcriptional characteristics of the whole brain. At each level, genes or regions may interact in pairs, triplets, quadruplets, and so on, so for an n-gene or n-region system the complexity of the expression profile is O(2^n). This extremely high complexity limits the use of traditional correlation-based methods. The transcription profile of a single brain region and that of the whole brain network both remain unclear, which hinders understanding of how gene expression regulates brain development, structure, function, and disease. It is therefore of great significance to find a simple quantitative framework that describes the structure of the genome-wide expression pattern of a single brain region and that of the whole brain network. With this goal, this thesis analyzes and models human brain transcriptome data to study the interaction structure that determines gene expression patterns, greatly simplifies the framework for analyzing expression patterns, and on this basis explores the application of the framework to research on brain disease-related genes.

First, this thesis finds that second-order interactions between genes determine the hierarchical transcription patterns in the human brain. The human brain is a complex system whose structure and function are finely regulated by many genes, and these genes interact in intricate ways. However, the basic rules of gene-gene interactions and their relationship with brain structure and function are unclear. By analyzing transcription data from thousands of sampling sites and more than 10,000 genes in the human brain, the thesis finds that pairwise (second-order) interactions between gene expression levels can predict the transcription pattern of a single brain region and that of the whole brain network, indicating that gene transcription patterns in the human brain are dominated by second-order interactions. This finding greatly reduces the complexity of gene expression networks, from O(2^n) to O(n^2). The study also reveals a potential relationship between gene expression and the brain network: the empirically observed strength of gene interactions leads to a nearly maximal number of transcriptional clusters, which suggests that evolution may have selected this interaction strength to achieve the functional and structural richness of the human brain.
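To make the complexity figures concrete, the following Python sketch counts the interaction terms available among n units (genes or regions). It is an illustration only: the function name `n_interaction_terms` is hypothetical, and counting one term per subset of at least two units is an assumed reading of the exponential and quadratic orders quoted in this abstract, not a formula taken from the thesis.

```python
from math import comb


def n_interaction_terms(n, max_order=None):
    """Count subsets of size 2..max_order among n units (default: all orders)."""
    if max_order is None:
        max_order = n
    return sum(comb(n, k) for k in range(2, max_order + 1))


n = 20
all_orders = n_interaction_terms(n)        # 2**n - n - 1 = 1048555
pairwise_only = n_interaction_terms(n, 2)  # n * (n - 1) // 2 = 190
print(all_orders, pairwise_only)
```

Even for 20 units the full model has over a million possible terms while the pairwise model has 190, which is why a pairwise description is tractable for more than 10,000 genes and an all-orders description is not.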

Second, based on population coupling in transcription, the thesis discovers new organization and structure at the level of genes and regions. Further analysis of whole-brain transcriptome data shows that the correlation between the expression pattern of a single gene or region and that of the whole genome or brain network (its population coupling) can capture the pairwise interactions between genes or regions, which further reduces the complexity of the interaction structure of gene expression from O(n^2) to O(n). According to population coupling, genes can be divided into strongly coupled "choristers" and weakly coupled "soloists", and the two groups have different biological functions. At the region level, population coupling shows clear spatial clustering and region specificity, suggesting a previously unreported organization of the human brain.
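The idea of one coupling value per gene can be sketched in Python with NumPy. This abstract does not give the exact formula, so the definition below is an assumption: the Pearson correlation, across samples, between a gene's z-scored expression and the leave-one-out mean of all other genes. The function name `population_coupling`, the median split rule, and the synthetic data are all hypothetical and serve only to illustrate the idea.

```python
import numpy as np


def population_coupling(expr):
    """Return one coupling value per gene for expr of shape (n_genes, n_samples).

    Assumed definition: Pearson correlation, across samples, between a gene's
    z-scored expression and the mean z-scored expression of all other genes.
    """
    expr = np.asarray(expr, dtype=float)
    n_genes = expr.shape[0]
    # z-score each gene so that highly expressed genes do not dominate the mean
    z = (expr - expr.mean(axis=1, keepdims=True)) / expr.std(axis=1, keepdims=True)
    total = z.sum(axis=0)
    coupling = np.empty(n_genes)
    for i in range(n_genes):
        rest = (total - z[i]) / (n_genes - 1)  # leave-one-out population average
        coupling[i] = np.corrcoef(z[i], rest)[0, 1]
    return coupling


# Synthetic demo data (not real transcriptome data): 20 genes follow a shared
# signal, 20 genes are independent noise.
rng = np.random.default_rng(0)
n_samples = 200
shared = rng.normal(size=n_samples)
coupled = shared + 0.5 * rng.normal(size=(20, n_samples))
independent = rng.normal(size=(20, n_samples))
expr = np.vstack([coupled, independent])

coupling = population_coupling(expr)
# Hypothetical split rule: genes above the median coupling form the strongly coupled group
strongly_coupled = coupling > np.median(coupling)
print(coupling[:3].round(2), coupling[-3:].round(2))
```

In this toy setting the genes that follow the shared signal form the strongly coupled group and the independent genes behave like "soloists". The description needs only n numbers for n genes, which matches the linear order stated in the abstract.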

Finally, using the population coupling of genes proposed above, the thesis identifies genes whose interaction patterns change significantly in different brain diseases. Differentially expressed genes have been widely used to analyze the pathogenesis of human brain diseases. However, this analysis cannot identify genes whose expression levels do not change significantly but whose roles in the transcription network do (for example, from "chorister" to "soloist"). Taking brain diseases such as Alzheimer's disease (AD) as examples, analysis of RNA microarray datasets shows that the population coupling of many genes differs between the disease group and the normal control group. These genes are named role alteration genes (RAGs), and only a minority of RAGs change significantly in expression level. Gene enrichment analysis clarifies the possible biological significance of the RAGs. This work shows that, during the development of brain diseases, besides genes whose expression levels change, some genes alter the way they interact in the expression network, which provides an important new perspective for the study of brain diseases.
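One way such a comparison between groups could be set up is sketched below in Python with NumPy. It is a sketch under stated assumptions, not the procedure used in the thesis: the coupling definition (correlation of each gene with the mean of all other genes), the permutation test on sample labels, the function name `coupling_change_test`, and the synthetic data are all hypothetical.

```python
import numpy as np


def population_coupling(expr):
    """Assumed definition: correlation of each gene with the mean of all other genes."""
    expr = np.asarray(expr, dtype=float)
    z = (expr - expr.mean(axis=1, keepdims=True)) / expr.std(axis=1, keepdims=True)
    rest = (z.sum(axis=0) - z) / (z.shape[0] - 1)  # leave-one-out mean, one row per gene
    rest = rest - rest.mean(axis=1, keepdims=True)
    return (z * rest).sum(axis=1) / np.sqrt((z ** 2).sum(axis=1) * (rest ** 2).sum(axis=1))


def coupling_change_test(control, disease, n_perm=200, seed=0):
    """Return the coupling change (disease - control) and permutation p-values per gene."""
    rng = np.random.default_rng(seed)
    observed = population_coupling(disease) - population_coupling(control)
    pooled = np.concatenate([control, disease], axis=1)
    n_ctrl = control.shape[1]
    exceed = np.zeros(observed.shape[0])
    for _ in range(n_perm):
        # Shuffle which samples count as "control" and which as "disease"
        perm = rng.permutation(pooled.shape[1])
        null = (population_coupling(pooled[:, perm[n_ctrl:]])
                - population_coupling(pooled[:, perm[:n_ctrl]]))
        exceed += np.abs(null) >= np.abs(observed)
    return observed, (exceed + 1) / (n_perm + 1)


# Synthetic demo data (not real microarray data): gene 0 follows the shared
# signal in the control group but not in the disease group.
rng = np.random.default_rng(1)
n_genes, n_per_group = 30, 150


def make_group(decouple_gene0):
    shared = rng.normal(size=n_per_group)
    data = shared + 0.5 * rng.normal(size=(n_genes, n_per_group))
    if decouple_gene0:
        # Same mean and variance in expectation, but no longer tied to the shared signal
        data[0] = np.sqrt(1.25) * rng.normal(size=n_per_group)
    return data


control = make_group(False)
disease = make_group(True)
delta, p_values = coupling_change_test(control, disease, n_perm=200)
print(delta[:3].round(2), p_values[:3].round(3))
```

In this toy example gene 0 would be missed by a test on expression level, because its mean and variance are unchanged by construction, yet its coupling drops sharply. That is the kind of role change the abstract describes.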

In summary, this thesis uses parametric models to analyze human brain transcriptome data and identifies important factors that determine the structure of gene-gene interactions. It proposes a series of dimensionality reduction methods that minimize the loss of information, which greatly reduce the complexity of analyzing gene expression patterns and provide an important theoretical foundation and analysis tools for subsequent research. By applying this analysis framework to the study of brain disease-related genes, the thesis also provides an initial verification of its practical significance. We believe that an in-depth understanding of the structure of gene interactions will greatly promote future research on how genetic information determines the structure, function, development, and abnormality of brain networks.

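Keywords: gene interaction; functional connectivity; population coupling; brain disease
Language: Chinese
Research direction (seven major directions, sub-direction): Brain network analysis
Document type: Thesis
Item identifier: http://ir.ia.ac.cn/handle/173211/44799
Collection: 脑图谱与类脑智能实验室_脑网络组研究 (Laboratory of Brain Atlas and Brain-inspired Intelligence, Brainnetome Research)
Recommended citation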
GB/T 7714
华娇娇. 基于人脑转录组的基因相互作用结构研究[D]. 中国科学院自动化研究所. 中国科学院自动化研究所,2021.
Files in this item
File name / size | Document type | Version type | Access type | License
华娇娇毕业论文.pdf (17488 KB) | Thesis | | Open access | CC BY-NC-SA