Research on Semantic Segmentation Methods for Indoor Scene Images (室内场景图像的语义分割方法研究)

	Research on Semantic Segmentation Methods for Indoor Scene Images
	Chen Xinze (陈新泽)
	2017-05-10
Degree type	Master of Engineering
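Chinese abstract	Image semantic segmentation is an important research topic in computer vision that aims to automatically partition an image into regions carrying semantic information. Accurate image semantic segmentation is the foundation of many computer vision tasks, such as scene understanding and analysis. In recent years, the introduction of deep neural networks has driven rapid progress in this area, and the technique has shown great application potential in intelligent service robots, self-driving cars, medical image analysis, and other fields. For complex indoor scenes, however, the deep-neural-network-based methods in the current literature still fail to produce satisfactory segmentation results. This thesis explores several problems in deep-neural-network-based semantic segmentation of indoor scene images. The main contributions are: 1. A new image semantic segmentation method based on the residual network is proposed. It consists of three modules: (i) a data preprocessing module, which introduces an online data augmentation scheme to address the shortage of annotated indoor scene data in public international datasets; (ii) an improved deep residual network module, which combines dilated convolution and long short-term memory (LSTM) to improve the network's localization of object boundaries; (iii) an online selection module for hard-to-distinguish pixels, which defines a loss function targeting such pixels to speed up convergence and further improve segmentation accuracy. 2. An image semantic segmentation method that fuses scene depth information is proposed. The method first performs a coarse segmentation with a deep neural network, then uses a fully connected conditional random field to fuse the coarse result with the scene depth information, and obtains the refined segmentation by solving this fully connected conditional random field. 3. An image semantic segmentation method based on the generative adversarial network is proposed. The method introduces an adaptive hyperparameter adjustment mechanism that, for different discrimination information and different adversarial loss functions, fairly effectively handles the excessively large gradients back-propagated from the discriminator that can arise when training a generative adversarial network, and further improves segmentation accuracy.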
English abstract	Image semantic segmentation is an important topic in computer vision that aims to divide an image automatically into several semantically meaningful regions. Accurate image semantic segmentation lays the foundation for many computer vision tasks, such as scene understanding and scene analysis. In recent years, deep convolutional neural networks (DCNNs) have been applied to image semantic segmentation, greatly accelerating research on the problem, and DCNN-based segmentation has shown great application potential in intelligent service robots, self-driving cars, medical image analysis, and other fields. However, existing DCNN-based semantic segmentation methods still struggle to segment complex indoor scenes accurately. This thesis focuses on DCNN-based semantic segmentation of indoor scenes, and its main contributions are summarized as follows: 1. A novel image semantic segmentation method based on the deep residual network is proposed, consisting of three modules: (i) Data preprocessing module: to address the lack of labeled indoor images in public segmentation datasets, a new online data augmentation approach is designed in this module; (ii) Improved deep residual network module: a new deep residual network is designed that combines dilated convolution and LSTM (Long Short-Term Memory) to improve its boundary localization accuracy; (iii) Online selection module for hard-to-distinguish pixels: a novel loss function targeting hard-to-distinguish pixels is designed to speed up the convergence of network training and to further improve the segmentation accuracy of the proposed network. 2. An image semantic segmentation method that fuses scene depth information is proposed. The proposed method first uses a DCNN to segment images coarsely, then fuses the coarse segmentation results with the scene depth information via a fully connected conditional random field, and refines the coarse segmentation results by solving this fully connected conditional random field. 3. An image semantic segmentation method based on the generative adversarial network is proposed. In the proposed method, an adaptive adjustment mechanism is introduced for setting the hyperparameter automatically. For different discrimination information and different adversarial loss functions, this mechanism can effectively deal with the problem that the gradients back-propagated from the discriminator often become excessively large when training a generative adversarial network, and it further improves the segmentation accuracy of the proposed method.
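The abstract gives no formula for the loss over hard-to-distinguish pixels, so the following is only a minimal sketch of one common way such online selection can be done, not the thesis's own formulation. It assumes a pixel counts as "hard" when the predicted probability of its ground-truth class falls below a threshold. The names `hard_pixel_loss`, `threshold`, and `min_kept` are hypothetical and chosen for illustration.

```python
import math


def hard_pixel_loss(true_class_probs, threshold=0.7, min_kept=2):
    """Mean cross-entropy computed over hard pixels only.

    true_class_probs: for each pixel, the predicted probability of its
    ground-truth class. `threshold` and `min_kept` are illustrative
    assumptions, not values taken from the thesis.
    """
    if not true_class_probs:
        return 0.0
    # Per-pixel cross-entropy; clamp to avoid log(0).
    losses = [-math.log(max(p, 1e-12)) for p in true_class_probs]
    # A pixel is treated as hard when the network is not yet confident
    # about its correct class.
    hard = [loss for p, loss in zip(true_class_probs, losses) if p < threshold]
    if len(hard) < min_kept:
        # Too few hard pixels: fall back to the largest losses so the
        # batch still contributes a training signal.
        hard = sorted(losses, reverse=True)[:min_kept]
    return sum(hard) / len(hard)


# Five pixels; only the two below the threshold (0.40 and 0.20) are kept.
probs = [0.95, 0.90, 0.40, 0.20, 0.85]
loss = hard_pixel_loss(probs)
```

Averaging only over the selected pixels concentrates the gradient on regions the network still gets wrong, which is the intuition behind the faster convergence the abstract reports.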
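Keywords	image semantic segmentation; deep learning; convolutional neural network; conditional random field; generative adversarial network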
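Document type	Thesis
Item identifier	http://ir.ia.ac.cn/handle/173211/14700
Collection	Graduates_Master's Theses
Author affiliation	Institute of Automation, Chinese Academy of Sciences
First author affiliation	Institute of Automation, Chinese Academy of Sciences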
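Recommended citation (GB/T 7714)	Chen Xinze. Research on Semantic Segmentation Methods for Indoor Scene Images[D]. Beijing: Graduate University of Chinese Academy of Sciences, 2017.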

Files in this item
File name/size	Document type	Version type	Access type	License
室内场景图像的语义分割方法研究.pdf (16027KB)	Thesis		Restricted access	CC BY-NC-SA