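Research on Intelligent Analysis Algorithms Based on the Interactive Fusion of Multi-modal Ultrasound Spatio-temporal Features and Their Use in Assisting Clinical Diagnosis (多模态超声时空特征互作融合智能分析算法及其辅助临床诊断研究)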
孟哲令
2023-05-17
Pages: 78
Degree type: Master's
Chinese abstract

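Ultrasound imaging is an imaging tool that scans a living body with ultrasound beams and forms images of internal organs by receiving and processing the reflected signals. It is convenient, economical, non-toxic and harmless. Common ultrasound imaging modalities include B-mode ultrasound, color Doppler ultrasound, ultrasound elastography and contrast-enhanced ultrasound. They characterize the spatial and temporal features of a patient's lesion from different perspectives and in different respects, helping clinicians understand the lesion more comprehensively and thus potentially reach a more accurate clinical diagnosis. However, the comprehensive analysis of multi-modal ultrasound images tests clinicians' professional knowledge and clinical experience and is a challenging task. Artificial intelligence methods can enable intelligent fusion analysis of multi-modal ultrasound, but existing work gives little consideration to the specific characteristics of each ultrasound modality and still lacks effective algorithm designs for modeling the correlations and differences among the spatio-temporal features of the modalities. As a result, such methods have difficulty building effective connections between ultrasound modalities and handling their homogeneous and heterogeneous information, and their performance leaves room for improvement. This thesis addresses the fusion analysis of multi-modal ultrasound. Its central idea is to design fine-grained fusion algorithms in which the spatio-temporal features of the ultrasound modalities cooperate with one another, guided by the specific characteristics of ultrasound and by the connections and distinctions between modalities. In the frontier algorithm research part it proposes a multi-step modality fusion network for the fusion analysis of four ultrasound modalities, and in the clinical application research part it proposes a dual-modality hierarchical diagnostic network for the fusion analysis of the two most common modalities, B-mode ultrasound and color Doppler ultrasound. Together these provide different fusion analysis algorithms for clinical diagnostic problems of different difficulty. Specifically, the work comprises three parts.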

 

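In the first part, this thesis constructs multi-modal ultrasound image datasets. In collaboration with Lanzhou University Second Hospital and other institutions, and under strict patient inclusion and exclusion criteria, we retrospectively collected a multi-center dataset of pathological types of cervical lymphadenopathy and a single-center dataset of pathological subtypes of metastatic cervical lymphadenopathy. Several ultrasound physicians with extensive clinical diagnostic experience annotated the regions of interest in the collected ultrasound images. Different preprocessing methods were designed for the different ultrasound modalities and applied to each of them. Through data acquisition, annotation and preprocessing, two multi-modal ultrasound image datasets were built, laying the foundation for the subsequent algorithm research.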

 

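In the second part, this thesis carries out frontier algorithm research. For the fusion analysis of four ultrasound modalities (B-mode ultrasound, color Doppler ultrasound, ultrasound elastography and contrast-enhanced ultrasound), we propose the multi-step modality fusion network MSMFN, which has three main innovations. First, it adopts a grouped, step-by-step fusion strategy: highly similar ultrasound modalities are placed in the same group, modalities within a group are fused first, and the groups are then fused with one another, which provides the framework for subsequently associating the spatio-temporal features of the modalities. Second, it introduces a modality interaction guidance mechanism: through cross-semantic, spatial-level feature association between modalities, the network uses knowledge obtained from B-mode ultrasound to guide the feature extraction of color Doppler ultrasound and ultrasound elastography. Third, it introduces a modality feature orthogonal self-supervision method: by supervising the features of different modality groups to be mutually orthogonal in a shared latent feature space, it guides the network to extract non-overlapping, heterogeneous features from the different groups, thereby increasing the value and utilization efficiency of each ultrasound modality in the downstream clinical diagnosis problem. We ran method comparison experiments and ablation experiments on each network component on the challenging problem of differentiating pathological subtypes of metastatic cervical lymphadenopathy. The results show that the proposed method is significantly better than clinicians and other related methods in overall diagnostic ability, and in particular achieves accuracy gains of 5-10% or more over other methods on the categories that are harder to distinguish. MSMFN offers a new way of thinking and a new method for artificial intelligence-assisted clinical diagnosis based on multi-modal ultrasound fusion analysis.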

 

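In the third part, this thesis carries out clinical application research. Staying close to real clinical problems and real clinical conditions, we address the pathological typing of cervical lymphadenopathy, a common clinical diagnostic problem, and design the hierarchical diagnostic network CLA-HDN based on the two most commonly used ultrasound modalities, B-mode ultrasound and color Doppler ultrasound. CLA-HDN decomposes the diagnostic task with a two-layer diagnostic structure. Each layer consists of one or two sub-networks that share the same structure but are independent of each other and take on different diagnostic tasks. Each sub-network has a two-branch structure and links the feature extraction of the two modalities through a feature importance measurement and transfer mechanism with adaptive alignment of feature map channels across modalities. This design decouples the feature extraction used to differentiate the pathological types, creating the conditions for balancing diagnostic ability across categories and, in turn, for improving the model's interpretability and its ability to assist clinical diagnosis. We focus on the clinical application value of CLA-HDN and carefully design a two-stage, multi-center human-machine comparison experiment involving physicians of different seniority. Six ultrasound physicians with different diagnostic levels made diagnoses first independently and then with the assistance of the attention heatmaps and diagnostic conclusions provided by CLA-HDN. The results show that, on the cervical lymphadenopathy pathological typing data from each center, CLA-HDN not only diagnoses at a level comparable to senior ultrasound physicians, but also improves the diagnostic level of senior ultrasound physicians and improves, with statistical significance, that of mid- and low-seniority ultrasound physicians. CLA-HDN offers a replicable and scalable approach to artificial intelligence-assisted pathological diagnosis of cervical lymphadenopathy in less developed and developing countries and regions.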

 

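The research in the second and third parts of this thesis was published, respectively, in IEEE Transactions on Medical Imaging (SCI Zone 1, IF 11.037, CCF-B journal), a journal in the field of medical image analysis and processing, and in BMC Medicine (SCI Zone 1, IF 11.150), a medical journal, with the author of this thesis as first author and co-first author, respectively.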

English abstract

Ultrasound is an imaging tool that scans an organism with ultrasound beams and obtains images of the organs in the body by receiving and processing the reflected signals. It is convenient, economical, non-toxic and harmless. Common ultrasound imaging modalities include B-mode Ultrasound, Color Doppler Flow Imaging, Ultrasound Elastography and Dynamic Contrast-Enhanced Ultrasound. They convey the spatial and temporal features of a patient's lesion from different perspectives and aspects, and can help clinicians understand the lesion more comprehensively and thus make a more accurate diagnosis. However, the comprehensive analysis of multi-modal ultrasound demands considerable professional knowledge and clinical experience and is a challenging task. Artificial intelligence methods can support such fusion analysis, but existing research gives little consideration to the specific characteristics of each ultrasound modality and therefore lacks effective algorithm designs for modeling the correlations and differences between the spatio-temporal features of the modalities. This makes it difficult for existing methods to build effective connections between ultrasound modalities and to process their homogeneous and heterogeneous information, so there is still room to improve their performance. This thesis focuses on the fusion analysis of ultrasound modalities. Its main idea is to build refined fusion algorithms by designing methods in which the spatio-temporal features of the modalities cooperate, according to the specific characteristics of ultrasound and the connections and distinctions between the modalities. A multi-step modality fusion network for the fusion analysis of four ultrasound modalities was proposed in the frontier algorithm research, and a dual-modality ultrasound hierarchical diagnostic network for the fusion analysis of the two most common ultrasound modalities, B-mode Ultrasound and Color Doppler Flow Imaging, was proposed in the clinical application research. Together they provide different ultrasound modality fusion analysis algorithms for clinical diagnostic problems of different difficulty levels. Specifically, the work in this thesis consists of three parts.

 

In the first part, two multi-modal ultrasound datasets were constructed. In collaboration with Lanzhou University Second Hospital and other centers, and under strict patient inclusion and exclusion criteria, we retrospectively collected a multi-center dataset of patients with four pathological types of Cervical Lymphadenopathy and a single-center dataset of patients with two pathological subtypes of metastatic Cervical Lymphadenopathy. Several ultrasound radiologists with extensive clinical diagnostic experience annotated the regions of interest in the ultrasound data. Different preprocessing methods were designed for the different modalities and applied to each of them. Through the three steps of data acquisition, annotation and preprocessing, two multi-modal ultrasound datasets were constructed, which laid the foundation for the subsequent algorithm research.

 

In the second part, frontier algorithm research was carried out. For the fusion analysis of four ultrasound modalities, namely B-mode Ultrasound, Color Doppler Flow Imaging, Ultrasound Elastography and Dynamic Contrast-Enhanced Ultrasound, the Multi-Step Modality Fusion Network (MSMFN) was proposed, with three main innovations. First, a grouped, step-by-step fusion strategy was adopted, which divided the modalities into three groups according to their similarity and performed intra-group modality fusion first, followed by inter-group modality fusion. This laid the framework for the subsequent association of ultrasound spatio-temporal features. Second, a modality interaction guidance mechanism was proposed, which enabled the network to guide the feature extraction of Color Doppler Flow Imaging and Ultrasound Elastography with the knowledge obtained from B-mode Ultrasound, through cross-semantic, spatial-level feature association. Third, a modality feature orthogonal self-supervision method was proposed. It supervised the features of different modality groups to be mutually orthogonal in the same latent feature space, guiding the network to extract non-overlapping, heterogeneous features from the different groups. This enhanced the value and utilization efficiency of each ultrasound modality in downstream clinical diagnosis. We conducted method comparison experiments and ablation experiments on each network component on the challenging problem of differentiating pathological subtypes of metastatic Cervical Lymphadenopathy. The results showed that MSMFN significantly outperformed clinicians and other related methods in overall diagnostic ability, and in particular achieved accuracy improvements of 5-10% or more over other methods on the categories that are more difficult to distinguish. MSMFN provided a new way of thinking and a new algorithm for artificial intelligence-assisted clinical diagnosis based on multi-modal ultrasound fusion analysis.
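As a rough illustration of the orthogonality idea described above, the following Python sketch scores how far a set of per-group feature vectors is from mutual orthogonality, using the mean squared cosine similarity over all pairs of groups. This is only an assumption about one plausible form of such a penalty: the abstract does not give the actual loss used in MSMFN, and the function names and toy feature vectors here are hypothetical.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # a zero vector shares no direction with anything
    return dot / (norm_u * norm_v)


def orthogonality_penalty(group_features):
    """Mean squared cosine similarity over all pairs of group features.

    Assumed form of the penalty (not taken from the thesis): it is 0 when
    every pair of group features is orthogonal and 1 when all are collinear.
    """
    pairs = [
        (group_features[i], group_features[j])
        for i in range(len(group_features))
        for j in range(i + 1, len(group_features))
    ]
    if not pairs:
        return 0.0
    return sum(cosine_similarity(u, v) ** 2 for u, v in pairs) / len(pairs)


# Hypothetical pooled features for three modality groups (toy numbers).
orthogonal = [[1.0, 0.0, 0.0], [0.0, 2.0, 0.0], [0.0, 0.0, 3.0]]
overlapping = [[1.0, 0.0, 0.0], [2.0, 0.0, 0.0], [0.0, 0.0, 3.0]]

print(orthogonality_penalty(orthogonal))   # no shared direction between groups
print(orthogonality_penalty(overlapping))  # first two groups are collinear
```

In a training setting, a term of this kind would typically be added to the diagnostic loss so that minimizing it pushes the group features toward carrying non-overlapping information.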

 

In the third part, clinical application research was carried out. A hierarchical diagnostic network (CLA-HDN) was designed, based on the two most commonly used ultrasound modalities, B-mode Ultrasound and Color Doppler Flow Imaging, to address the classification of four pathological types of Cervical Lymphadenopathy. CLA-HDN decomposed the task using a two-layer diagnostic structure. Each layer was composed of one or two sub-networks that shared the same structure but were independent of each other and undertook different diagnostic sub-tasks. Each sub-network adopted a two-branch structure and used a feature importance measurement and transfer mechanism, with adaptive alignment of feature map channels across modalities, to link the feature extraction processes of the two modalities. This design decoupled the feature extraction used to differentiate the pathological types, creating the conditions for more balanced performance across types and for better interpretability and assistance in clinical diagnosis. The research focused on the clinical application value of CLA-HDN and used a carefully designed two-stage, multi-center human-machine comparison experiment with radiologists of different seniority. Six ultrasound radiologists with different levels of diagnostic experience first made diagnoses independently and were then allowed to modify their results with the assistance of the attention heatmaps and diagnostic conclusions given by CLA-HDN. The experimental results showed that CLA-HDN not only diagnosed the pathological types of Cervical Lymphadenopathy at a level comparable to senior ultrasound radiologists, but also improved the diagnostic performance of senior radiologists and significantly improved that of mid- and low-seniority radiologists in all three centers. CLA-HDN provided a solution for replicable and scalable artificial intelligence-assisted pathological diagnosis of Cervical Lymphadenopathy in less developed and developing countries and regions.
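To make the two-layer decomposition concrete, the sketch below combines the outputs of three binary sub-networks into probabilities over four classes by the chain rule. The split used here (one first-layer sub-network choosing between two coarse groups, then one second-layer sub-network per group) is an assumption made for illustration; the abstract does not state how CLA-HDN divides the four pathological types, and the class labels `type_A` to `type_D` and the function name are hypothetical.

```python
def hierarchical_posterior(p_left, p_a_given_left, p_c_given_right):
    """Combine three binary sub-network outputs into four class probabilities.

    p_left           first layer: probability of coarse group {type_A, type_B}
    p_a_given_left   second layer, sub-network 1: P(type_A | group {A, B})
    p_c_given_right  second layer, sub-network 2: P(type_C | group {C, D})
    """
    for p in (p_left, p_a_given_left, p_c_given_right):
        if not 0.0 <= p <= 1.0:
            raise ValueError("probabilities must lie in [0, 1]")
    p_right = 1.0 - p_left
    return {
        "type_A": p_left * p_a_given_left,
        "type_B": p_left * (1.0 - p_a_given_left),
        "type_C": p_right * p_c_given_right,
        "type_D": p_right * (1.0 - p_c_given_right),
    }


# Toy sub-network outputs for a single case (made-up numbers).
posterior = hierarchical_posterior(0.8, 0.25, 0.5)
prediction = max(posterior, key=posterior.get)
print(posterior)
print(prediction)
```

Because each sub-network only has to solve one binary sub-task, its features can specialize for that distinction, which is the kind of decoupling the paragraph above describes.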

 

The research conducted in the second and third parts of this thesis was published, respectively, in IEEE Transactions on Medical Imaging (SCI Zone 1, IF 11.037, CCF-B journal), a journal in the field of medical image analysis and processing, and in BMC Medicine (SCI Zone 1, IF 11.150), a medical journal, with the author of this thesis as first author and co-first author, respectively.

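Keywords: multi-modal ultrasound imaging; interactive fusion; artificial intelligence-assisted clinical diagnosis; multi-step modality fusion network; hierarchical diagnostic network
Language: Chinese
Seven major directions, sub-direction classification: Medical image processing and analysis
State Key Laboratory planning direction classification: Advanced intelligent applications and translation
Thesis-associated dataset requiring deposit:
Document type: Thesis
Item identifier: http://ir.ia.ac.cn/handle/173211/51658
Collection: Graduates_Master's theses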
Recommended citation
GB/T 7714
孟哲令. 多模态超声时空特征互作融合智能分析算法及其辅助临床诊断研究[D],2023.
Files in this item
File name/size  Document type  Version type  Access type  License
答辩后论文.pdf (10969KB)  Thesis  Restricted access  CC BY-NC-SA