Research on CT Image Detection Algorithms Based on Semi-Supervised Learning

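Title	Research on CT Image Detection Algorithms Based on Semi-Supervised Learning
Author	王卓 (Wang Zhuo)
Date	2020-05
Pages	76
Degree type	Master's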
Abstract	In recent years, deep learning has achieved great success in object recognition, object detection, and object segmentation. It is now being applied to medical image processing in the hope of reducing doctors' workload while improving the accuracy of their diagnoses. Good performance requires not only a strong deep learning model but also a large, high-quality dataset; when the dataset is small and poorly annotated, model performance becomes very poor. Medical CT datasets suffer from both problems. Because annotating CT images is expensive, medical experts annotate only the more obvious lesions and leave many less obvious ones unmarked, so existing datasets contain missing labels (poor annotation quality). To reduce annotation cost, experts also annotate only a small fraction of the samples, so many samples carry no annotation at all (the labeled dataset is very small). To address these problems, this thesis carries out the following work:
1. For the missing-label problem in medical CT image datasets, the thesis proposes a missing label mining algorithm based on lesion continuity. It introduces the concept of lesion continuity for the first time and integrates it into the mining process, so that the algorithm can recover lesions left unannotated in already-annotated CT images, thereby alleviating the missing-label problem.
2. For the problem of few annotated and many unannotated CT images, the thesis proposes a label propagation algorithm based on lesion continuity. The algorithm is the first to integrate lesion continuity into semi-supervised learning, and it addresses the low quality of the pseudo-labels generated by existing semi-supervised methods. It automatically generates high-quality annotations for the unannotated CT images in a dataset, thereby addressing the large number of unannotated samples.
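Keywords	semi-supervised learning; missing label mining; label propagation; lesion detection
Language	Chinese
Seven major directions: sub-direction classification	Image and video processing and analysis
Document type	Thesis
Item identifier	http://ir.ia.ac.cn/handle/173211/40316
Collection	Graduates_Master's Theses
Recommended citation (GB/T 7714)	王卓 (Wang Zhuo). Research on CT Image Detection Algorithms Based on Semi-Supervised Learning [D]. Institute of Automation, Chinese Academy of Sciences. University of Chinese Academy of Sciences, 2020.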

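Files in this item
File name/size	Document type	Version type	Access type	License
Research on CT Image Detection Algorithms Based on Semi-Supervised Learning (3117 KB)	Thesis		Restricted access	CC BY-NC-SA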
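
Illustrative sketch (not from the thesis)
The abstract names a label propagation algorithm based on lesion continuity but does not define it in this record. The sketch below is only one possible reading, under loud assumptions: that lesions are annotated as 2D bounding boxes per CT slice, that "continuity" means a lesion in slice z tends to appear at nearly the same position in slices z-1 and z+1, and that a detector has already produced candidate boxes for the unannotated slices. The names `iou`, `propagate`, `candidates`, and `iou_thresh` are hypothetical and do not come from the thesis.

```python
from typing import Dict, List, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2), assumed pixel coordinates


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def propagate(seed_slice: int, seed_box: Box,
              candidates: Dict[int, List[Box]],
              iou_thresh: float = 0.5) -> Dict[int, Box]:
    """Walk outward from one annotated slice in both directions.

    In each neighbouring slice, the candidate that best overlaps the box
    accepted in the previous slice becomes a pseudo-label, provided the
    overlap reaches iou_thresh. Propagation in a direction stops at the
    first slice with no sufficiently overlapping candidate.
    """
    labels = {seed_slice: seed_box}
    for step in (-1, 1):
        z, prev = seed_slice + step, seed_box
        while z in candidates:
            best = max(candidates[z], key=lambda c: iou(prev, c), default=None)
            if best is None or iou(prev, best) < iou_thresh:
                break
            labels[z] = best
            prev = best
            z += step
    return labels


# Hypothetical data: slice 10 is annotated, slices 9, 11, 12 have detector candidates.
candidates = {
    9: [(10, 10, 30, 30)],
    11: [(12, 12, 32, 32), (100, 100, 120, 120)],
    12: [(200, 200, 220, 220)],
}
labels = propagate(10, (11, 11, 31, 31), candidates)
```

In this toy case the seed box spreads to slices 9 and 11 and stops at slice 12, where the only candidate does not overlap the previous box. Chaining on the previously accepted box rather than on the seed lets the pseudo-label follow a lesion that drifts slowly across slices; whether the thesis does this is not stated in the record.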