Human Parametric Modelling Technology for Intelligent Healthcare

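Title	Human Parametric Modelling Technology for Intelligent Healthcare (面向智慧医疗的人体参数化建模技术)
Author	周仝希
Date	2024-05
Pages	62
Degree type	Master's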
Abstract	With the rapid development of technologies such as the Internet, big data, cloud computing, virtual reality, and artificial intelligence, information technology is reshaping traditional industries. Intelligent healthcare, which combines artificial intelligence with medicine, has broad prospects in medical imaging, assisted diagnosis, drug development, health management, and disease prediction. Human parametric modelling is an active research topic in computer vision. In medical settings, it can reconstruct digital human models of doctors or patients from external images of the body, supporting health monitoring, medical education, localization and navigation, and surgical monitoring. However, medical environments are usually complex: the body is often covered or occluded and lighting is often weak, which makes it hard to estimate body pose and shape from images. Current mainstream parametric modelling algorithms mostly take single-modal RGB images as input, build their encoder and decoder networks around the characteristics of natural-scene images, and reconstruct parametric models that contain only the human skin, so they are difficult to apply directly in intelligent healthcare scenarios. This thesis therefore explores how to design an effective human parametric modelling network for estimating the pose and shape of patients and doctors, and studies, in both theory and practice, dataset collection and annotation, network design and experiments, and algorithm deployment and application for intelligent healthcare scenarios. The main contributions are as follows.
First, a human parametric modelling dataset for intelligent healthcare scenarios is constructed. To address the scarcity of medical datasets, public datasets and self-collected data are combined into datasets for three medical workflows: a patient health monitoring dataset, a patient localization and navigation dataset, and an operating room monitoring dataset for doctors. The datasets are divided by input modality into RGB and RGB-D datasets, and an automatic annotation algorithm combining deep learning with optimization is designed, which improves the efficiency and accuracy of annotation and provides data for network training.
Second, a deep learning network for human parametric modelling in intelligent healthcare scenarios is proposed. In the encoder, a multi-modal fusion structure based on a soft attention mechanism lets the network adaptively weight RGB and depth features and fuse RGB-D features according to the occlusion and lighting characteristics of the image, improving robustness under extreme conditions. In the decoder, a pose decoding structure based on analytical inverse kinematics splits the estimation of pose parameters into two stages, 3D human keypoint detection followed by inverse kinematics solving, improving the accuracy of pose parameter estimation. In the linear skinning layer, the skinning layer of the SMPL skin model is remodelled as a skin-bone-artery skinning layer, so that the same input pose and shape parameters reconstruct both the external skin model and the internal bone and artery structure models, giving more intuitive and detailed information about the body.
Finally, a deployment process for the human parametric modelling algorithm in intelligent healthcare scenarios is designed. The model is deployed in a depth-camera-based vision system and applied to the vision system of a fully automatic ultrasound scanning robot and to an intelligent hospital monitoring system. In the robot vision system, the parametric model automatically localizes the organs of patients with different body shapes and poses, reducing the burden on doctors. In the monitoring system, the parametric model enables real-time monitoring of patients' movements in wards and doctors' movements in operating rooms, improving hospital management efficiency. Experimental results show that, compared with other mainstream algorithms, the proposed algorithm achieves competitive results on public datasets and is highly robust to extreme conditions such as occlusion, covering, and weak lighting.
Keywords	human parametric modelling; multi-modal fusion; inverse kinematics; intelligent healthcare
Language	Chinese
Document type	Thesis
Item identifier	http://ir.ia.ac.cn/handle/173211/57619
Collection	Graduates_Master's Theses
Recommended citation (GB/T 7714)	周仝希. 面向智慧医疗的人体参数化建模技术[D], 2024.

Files in this item
File name/size	Document type	Version type	Access type	License
周仝希-毕业论文打印版.pdf (51449 KB)	Thesis		Restricted access	CC BY-NC-SA