Diagnosis of Gastric Signet-Ring Cell Carcinoma Based on Unet Encoder Block Transfer Learning (基于Unet编码块迁移学习的胃印戒细胞癌诊断研究)

Author	Li Cong (李聪)
Date	2021-05-17
Pages	84
Degree type	Master's
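Chinese abstract (translated)	Gastric cancer is one of the most common cancers in China. Gastric signet-ring cell carcinoma (SRCC) is a histological subtype of gastric cancer and a highly malignant tumor, characterized by strong invasiveness and rapid progression. Accurate diagnosis of gastric SRCC is an important guide to treatment planning, but SRCC is difficult to diagnose from existing medical imaging. Clinically it is confirmed by needle biopsy, which is time-consuming, labor-intensive, and subject to sampling error, so a fast, non-invasive diagnostic tool is lacking. Deep learning can extract quantitative tumor features from computed tomography (CT) images and so support non-invasive tumor diagnosis, but because gastric cancer lesions are irregular in shape, the accuracy of deep learning models is affected by irrelevant tissue around the lesion. To address these challenges, this thesis builds a non-invasive artificial intelligence model under a transfer learning scheme and uses it for gastric SRCC diagnosis and for predicting the efficacy of postoperative chemotherapy. Specifically, the novel work covers three aspects: 1) a study of CT image preprocessing based on window level and window width parameters; 2) a method for building a gastric SRCC deep learning model by transfer learning from the Unet encoder blocks; 3) a multi-level feature fusion model for gastric SRCC diagnosis. The main work and contributions are as follows. 1. To address the large amount of noise in CT images, this thesis studies image preprocessing based on window level and window width parameters. The method applies clinicians' prior knowledge to the data, excludes the influence of irrelevant tissue in the image, and improves image contrast. On this basis, 448 predefined features were extracted from each CT scan, and a radiomics diagnosis model for gastric SRCC was built from the significant features. The model's area under the receiver operating characteristic curve (AUC) reached 0.700 on the test set (257 patients), 4.8% higher than single-feature models. 2. To address the problem that deep learning models take in information from irrelevant tissue outside the lesion region of interest, this thesis proposes a method for building a gastric SRCC deep learning diagnosis model by transfer learning from the Unet encoder blocks. First, Unet is used as the semantic segmentation network to train a lesion segmentation model. Then three fully connected layers are appended to the Unet encoder block structure to form the deep learning diagnosis model. The parameters of the deep learning model are initialized with the Unet encoder block weights, and a stable gastric SRCC deep learning diagnosis model is obtained by fine-tuning. This strategy introduces a lesion spatial attention mechanism into the model, enabling it to extract tumor-specific information. The model reached an AUC of 0.716 on the test set, 2.3% and 1.7% higher than the radiomics model and the randomly initialized deep learning model, respectively. 3. To address the difficulty of fully characterizing tumor heterogeneity with predefined features, this thesis builds a fusion diagnosis model that combines predefined features, deep learning features, and significant clinical factors, and that performs well on accurate gastric SRCC diagnosis. The fusion model's AUC, sensitivity, and specificity were 0.786, 77.3%, and 69.2%, respectively, all significantly better than those of the radiomics, deep learning, and clinical models. In addition, extensive experiments verified the robustness of the fusion model. Further results show that the fusion model and the pathologically determined gastric SRCC label perform consistently in predicting overall survival. More importantly, among patients with pathologically confirmed advanced gastric SRCC, the model can identify those who can benefit from chemotherapy. In summary, this thesis proposes a deep learning model based on a transfer learning mechanism and a modeling method that fuses multi-level features. The non-invasive gastric SRCC diagnosis model built with this approach performs well in gastric SRCC diagnosis and in assessing chemotherapy efficacy. The author published the related methods as first author in IEEE Journal of Biomedical and Health Informatics (Chinese Academy of Sciences Zone 2; impact factor 5.223) and published one SCI paper as co-first author in a mainstream international journal in the field.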
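The window level and window width preprocessing named in contribution 1 can be illustrated with a minimal sketch. The abstract gives neither the parameter values nor the implementation, so this shows only the standard windowing operation: clip Hounsfield units (HU) to [level - width/2, level + width/2] and rescale to [0, 1]. The function names and the defaults of level 40 HU and width 400 HU (a commonly used abdominal soft-tissue window) are assumptions for illustration, not settings taken from the thesis.

```python
from typing import List, Sequence


def apply_window(hu: float, level: float, width: float) -> float:
    """Clip one Hounsfield-unit value to the window and rescale it to [0, 1]."""
    if width <= 0:
        raise ValueError("window width must be positive")
    lower = level - width / 2.0
    upper = level + width / 2.0
    clipped = min(max(hu, lower), upper)
    return (clipped - lower) / (upper - lower)


def window_slice(
    slice_hu: Sequence[Sequence[float]],
    level: float = 40.0,   # assumed value, not taken from the thesis
    width: float = 400.0,  # assumed value, not taken from the thesis
) -> List[List[float]]:
    """Apply the same window to every pixel of a 2D CT slice (rows of HU values)."""
    return [[apply_window(v, level, width) for v in row] for row in slice_hu]


if __name__ == "__main__":
    # Toy 2x2 slice: air, lower window edge, window center, dense bone.
    toy_slice = [[-1000.0, -160.0], [40.0, 1000.0]]
    print(window_slice(toy_slice))  # prints [[0.0, 0.0], [0.5, 1.0]]
```

Values outside the window saturate at 0 or 1, which is how windowing suppresses tissue outside the intensity range of interest and stretches contrast inside it. The sketch is pure Python to stay dependency-free; on real volumes the same clip-and-rescale would normally be applied to whole arrays at once.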
English abstract	Gastric cancer is one of the most common cancers in China. Signet-ring cell carcinoma (SRCC) is a histological subtype of gastric cancer and a highly malignant tumor characterized by strong invasiveness and rapid progression. Accurate diagnosis of SRCC can guide the formulation of treatment plans, but SRCC is difficult to diagnose from existing medical images. Clinically, SRCC is diagnosed by biopsy, which is time-consuming and labor-intensive and suffers from sampling errors, so rapid and non-invasive diagnostic tools are lacking. Deep learning can extract quantitative tumor features from computed tomography (CT) images to enable non-invasive diagnosis. However, because gastric cancer lesions are irregular in shape, the accuracy of deep learning models is affected by irrelevant tissues around the lesions. In response to these challenges, this thesis developed a non-invasive artificial intelligence model under a transfer learning scheme to diagnose SRCC and stratify the risk of postoperative chemotherapy resistance. Specifically, this thesis proposed new methods from three perspectives: 1) CT image preprocessing based on window level and width parameters; 2) a method for building a deep learning model by transfer learning from the Unet encoder; 3) a merged model that integrates multi-level features. The main innovations and contributions are as follows. 1. To address the large amount of noise in CT images, this thesis carried out image preprocessing based on window level and width parameters. This method used the prior knowledge of clinicians to eliminate the influence of irrelevant tissues in the image and improve image contrast. Furthermore, this thesis extracted 448 hand-crafted features and established an SRCC radiomics diagnosis model based on the significant features. The area under the receiver operating characteristic curve (AUC) of the radiomics model reached 0.700 in the test set (257 patients), an increase of 4.8% over models based on a single feature. 2. To address the problem that the deep learning model is influenced by irrelevant tissue outside the tumor region of interest, this thesis proposed a strategy for developing an SRCC deep learning model under the transfer learning scheme. First, Unet was adopted as a semantic segmentation network and trained on a segmentation task. Second, three fully connected layers were appended after the encoder network of Unet, and the resulting network was used as the deep learning diagnosis model. The weights of the deep learning model were initialized from Unet and fine-tuned for a few epochs. This strategy introduced a spatial attention mechanism into the model, so that the deep learning model could extract tumor-specific characteristics. The deep learning model achieved an AUC of 0.716 in the test set, which is 2.3% and 1.7% higher than the radiomics model and the randomly initialized deep learning model, respectively. 3. To address the problem that hand-crafted features cannot comprehensively characterize tumor phenotypes, this thesis constructed an SRCC merged diagnosis model by integrating hand-crafted features, deep learning features, and significant clinical factors. In the test set, the AUC, sensitivity, and specificity of the merged model for diagnosing SRCC were 0.786, 77.3%, and 69.2%, respectively. All indicators are significantly higher than those of the radiomics model, the deep learning model, and the clinical model. In addition, extensive experiments demonstrated the robustness of the merged model. Furthermore, this research found that the pathologically determined SRCC status and the merged model yielded comparable performance in predicting overall survival. Importantly, in patients with pathologically confirmed advanced SRCC, the model can identify patients who can benefit from chemotherapy. To sum up, this thesis proposed a method for building a deep learning model under the transfer learning mechanism and a method for constructing a merged model by integrating multi-level features. These methods were used to construct a non-invasive diagnosis model, which demonstrated good performance in diagnosing SRCC and predicting chemotherapy response. Based on these methods, I published one SCI paper in IEEE Journal of Biomedical and Health Informatics as first author and one SCI paper in a mainstream international journal as co-first author.
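The Unet-encoder transfer step in contribution 2 can be sketched in PyTorch as follows. The abstract states only that three fully connected layers are appended to the Unet encoder and that the encoder weights initialize the diagnosis model before fine-tuning. Everything else here is an assumption chosen to keep the example small: the framework, the hypothetical class names (`Encoder`, `TinyUnet`, `DiagnosisNet`), the depth and channel widths, the global average pooling before the fully connected layers, and the 32x32 single-channel input.

```python
import torch
from torch import nn


def conv_block(in_ch: int, out_ch: int) -> nn.Sequential:
    """Two 3x3 convolutions with ReLU, the usual Unet building block."""
    return nn.Sequential(
        nn.Conv2d(in_ch, out_ch, kernel_size=3, padding=1), nn.ReLU(),
        nn.Conv2d(out_ch, out_ch, kernel_size=3, padding=1), nn.ReLU(),
    )


class Encoder(nn.Module):
    """Contracting path shared by the segmentation and diagnosis networks."""

    def __init__(self, in_ch: int = 1, widths=(16, 32, 64)):
        super().__init__()
        chans = [in_ch, *widths]
        self.blocks = nn.ModuleList(
            [conv_block(a, b) for a, b in zip(chans[:-1], chans[1:])]
        )
        self.pool = nn.MaxPool2d(2)

    def forward(self, x):
        skips = []
        for i, block in enumerate(self.blocks):
            x = block(x)
            skips.append(x)
            if i < len(self.blocks) - 1:  # no pooling after the deepest block
                x = self.pool(x)
        return x, skips


class TinyUnet(nn.Module):
    """Small Unet used for lesion segmentation (illustrative sizes only)."""

    def __init__(self):
        super().__init__()
        self.encoder = Encoder()
        self.up2 = nn.ConvTranspose2d(64, 32, kernel_size=2, stride=2)
        self.dec2 = conv_block(64, 32)
        self.up1 = nn.ConvTranspose2d(32, 16, kernel_size=2, stride=2)
        self.dec1 = conv_block(32, 16)
        self.head = nn.Conv2d(16, 1, kernel_size=1)  # one-channel mask logits

    def forward(self, x):
        bottom, skips = self.encoder(x)
        x = self.dec2(torch.cat([self.up2(bottom), skips[1]], dim=1))
        x = self.dec1(torch.cat([self.up1(x), skips[0]], dim=1))
        return self.head(x)


class DiagnosisNet(nn.Module):
    """Same encoder followed by three fully connected layers."""

    def __init__(self):
        super().__init__()
        self.encoder = Encoder()
        self.pool = nn.AdaptiveAvgPool2d(1)  # assumed pooling choice
        self.fc = nn.Sequential(
            nn.Linear(64, 32), nn.ReLU(),
            nn.Linear(32, 16), nn.ReLU(),
            nn.Linear(16, 1),  # one logit: SRCC vs. non-SRCC
        )

    def forward(self, x):
        bottom, _ = self.encoder(x)
        return self.fc(self.pool(bottom).flatten(1))


if __name__ == "__main__":
    seg_model = TinyUnet()   # in practice: trained on lesion masks first
    clf_model = DiagnosisNet()
    # Transfer step: copy the segmentation encoder's weights into the classifier.
    clf_model.encoder.load_state_dict(seg_model.encoder.state_dict())
    logits = clf_model(torch.randn(2, 1, 32, 32))
    print(logits.shape)
```

Fine-tuning would then train `DiagnosisNet` on the diagnosis labels starting from these copied weights. The abstract mentions fine-tuning for a few epochs but gives no optimizer, learning rate, or loss, so none is shown here.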
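Keywords	gastric signet-ring cell carcinoma; transfer learning; deep learning; diagnosis; survival
Language	Chinese
Seven major research directions: sub-direction category	Medical image processing and analysis
Document type	Thesis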
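Item identifier	http://ir.ia.ac.cn/handle/173211/44764
Collection	CAS Key Laboratory of Molecular Imaging (中国科学院分子影像重点实验室)
Recommended citation (GB/T 7714)	Li Cong (李聪). Diagnosis of Gastric Signet-Ring Cell Carcinoma Based on Unet Encoder Block Transfer Learning [D]. Room 910, Intelligent Building, Institute of Automation, Chinese Academy of Sciences, Haidian District, Beijing. University of Chinese Academy of Sciences, Institute of Automation, Chinese Academy of Sciences, 2021.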

Files in this item
File name/size	Document type	Version type	Access type	License
基于Unet编码块迁移学习的胃印戒细胞癌 (3289KB)	Thesis		Open access	CC BY-NC-SA