English Abstract | Semantic segmentation of remote sensing images is a long-standing research topic in remote sensing image processing, aiming to assign a semantic class to every pixel of an image. The task supports many important applications, such as urban planning, environmental monitoring, crop analysis, on-satellite processing, and disaster investigation. Recently, the introduction of deep learning has brought rapid progress to remote sensing image semantic segmentation. However, existing deep learning-based methods still face several challenges: (1) they rely on large amounts of annotated data for training and only perform well on test data drawn from the same distribution; when the distributions differ, a model trained on labeled data (the source domain) generalizes poorly to other unlabeled data (the target domain); (2) because remote sensing images are acquired under diverse conditions (\textit{e.g.}, different sensors, ground sampling distances, imaging spectral bands, and acquisition regions), producing pixel-wise annotations for every scene is time-consuming and laborious. Fortunately, transfer learning can extract knowledge from the source domain to improve model performance in a target domain that lacks annotations. Therefore, this dissertation studies transfer learning-based semantic segmentation of remote sensing images. Its main contributions and innovations are summarized as follows:
-
A domain adaptation method for semantic segmentation of remote sensing images based on domain similarity is proposed. By integrating a domain similarity discriminator (DSD) into an adversarial learning framework, the method generates domain-invariant features and thereby eliminates the domain shift. Specifically, the method first observes that existing adversarial learning-based domain adaptation approaches use the information of the source and target domains independently and do not combine information across the two domains. To reduce the influence of domain-irrelevant information on adversarial learning, the DSD distinguishes the domain similarity of input feature pairs, exploiting the similarity of pairs drawn from the same domain and the dissimilarity of pairs drawn from different domains, so that adversarial learning focuses on domain-related features. The effectiveness of the proposed method is verified by comparative experiments across different cross-domain settings.
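The pair-construction step behind the DSD can be sketched as follows. This is a minimal illustration, not the dissertation's implementation: features are treated as plain vectors, the function name `make_dsd_pairs` is hypothetical, and the DSD itself would be a binary classifier trained on the resulting labeled pairs.

```python
from itertools import product

def make_dsd_pairs(source_feats, target_feats):
    """Build (feature_a, feature_b, label) triples for a domain
    similarity discriminator: pairs drawn from the same domain
    (including self-pairs) are labeled 1 (similar), while
    cross-domain pairs are labeled 0 (dissimilar)."""
    pairs = []
    # same-domain pairs -> similar (label 1)
    for fa, fb in product(source_feats, source_feats):
        pairs.append((fa, fb, 1))
    for fa, fb in product(target_feats, target_feats):
        pairs.append((fa, fb, 1))
    # cross-domain pairs -> dissimilar (label 0)
    for fa, fb in product(source_feats, target_feats):
        pairs.append((fa, fb, 0))
    return pairs

# Toy usage: two source and two target feature vectors.
src = [[0.1, 0.2], [0.3, 0.4]]
tgt = [[0.9, 0.8], [0.7, 0.6]]
pairs = make_dsd_pairs(src, tgt)
print(len(pairs))  # 4 same-source + 4 same-target + 4 cross = 12
```

Training the discriminator on such pairs is what lets adversarial learning exploit both within-domain similarity and cross-domain dissimilarity, rather than each domain in isolation.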
-
A domain adaptation method for remote sensing image semantic segmentation based on global and class-wise alignment is proposed. Its core idea is to combine adversarial learning with a self-training strategy to align the distributions of different domains both globally and class-wise. On the one hand, a triplet adversarial global distribution alignment network is proposed to address two problems: adversarial learning is destabilized by domain-irrelevant noise in the feature layer, and the classifier struggles to adapt to target-domain features. Operating in the output space of the segmentation network, this network explicitly uses source- and target-domain information to narrow the distribution gap between domains through triplet adversarial learning, and the classifier is optimized on features from both domains. On the other hand, a discriminative confidence self-training strategy is proposed to achieve class-wise alignment between domains. In this strategy, the discriminator's output measures whether the target data have eliminated the global distribution bias; pseudo-labels are then generated for target pixels with small global distribution bias, and the segmentation network is fine-tuned with these pseudo-labels to align the class-wise distributions of the two domains. Experimental evaluations in multiple cross-domain scenarios demonstrate the effectiveness and generalization of the proposed method.
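The discriminative confidence selection step can be sketched as below. This is a simplified per-pixel illustration under assumed conventions: `select_pseudo_labels`, the threshold values, and the use of `-1` as an ignore label are all hypothetical choices, not the dissertation's exact formulation.

```python
def select_pseudo_labels(probs, disc_scores, disc_thresh=0.5, conf_thresh=0.9):
    """For each target pixel, assign a pseudo-label only when (a) the
    discriminator score suggests the pixel is already well aligned
    (small global distribution bias) and (b) the segmentation network
    is confident.  Unselected pixels get -1 (ignored when fine-tuning)."""
    pseudo = []
    for p, d in zip(probs, disc_scores):
        conf = max(p)
        if d >= disc_thresh and conf >= conf_thresh:
            pseudo.append(p.index(conf))  # argmax class as pseudo-label
        else:
            pseudo.append(-1)  # ignore label
    return pseudo

# Toy usage: per-pixel class probabilities and discriminator scores.
probs = [[0.95, 0.05], [0.6, 0.4], [0.05, 0.95]]
disc  = [0.8, 0.9, 0.3]
print(select_pseudo_labels(probs, disc))  # [0, -1, -1]
```

The key design point is that the discriminator output, normally used only for adversarial training, here doubles as a per-pixel gate: only pixels whose global distribution bias is already small contribute pseudo-labels to the class-wise alignment stage.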
-
A domain adaptation method for remote sensing image semantic segmentation based on triplet loss adversarial learning and cross-consistency constraints is proposed. It addresses three weaknesses of current methods: unstable adversarial learning, classes that differ in how hard they are to transfer across domains, and insufficient use of boundary data. To stabilize adversarial learning, a global alignment network based on the triplet loss is proposed. This network integrates the triplet loss into the adversarial learning framework and stabilizes training by optimizing the relative distance between the distribution similarities of different domains; it also remains effective when the original target-domain data are scarce. To handle classes of differing transfer difficulty, an adaptive class-aware pseudo-label selection strategy is proposed. By estimating each class's prior prediction probability in the target domain, the strategy adaptively sets the threshold for generating pseudo-labels for each class, yielding reliable and class-balanced pseudo-labels. Finally, a cross-mean teacher network is proposed to exploit boundary data. It alleviates the over-coupling problem of the mean teacher network through a cross-consistency constraint and makes effective use of target data both with and without pseudo-labels, thereby enhancing the segmentation network's ability to delineate object boundaries. Extensive and comprehensive cross-domain experiments verify the effectiveness and necessity of each module in the proposed method.
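The adaptive class-aware thresholding idea can be sketched as follows. This is an illustrative quantile-based variant under stated assumptions: the function name `class_adaptive_thresholds` and the `keep_fraction` parameter are hypothetical, and the dissertation's estimate of each class's prior prediction probability may differ in detail.

```python
def class_adaptive_thresholds(probs, keep_fraction=0.5):
    """Estimate one confidence threshold per class so that roughly the
    top `keep_fraction` most-confident predictions of each class are
    kept.  A hard-to-transfer class (low confidences overall) thus gets
    a lower threshold than an easy one, balancing the pseudo-labels."""
    num_classes = len(probs[0])
    thresholds = []
    for c in range(num_classes):
        # confidences of pixels whose argmax prediction is class c
        confs = sorted((p[c] for p in probs if p.index(max(p)) == c),
                       reverse=True)
        if not confs:
            thresholds.append(1.0)  # class absent: keep nothing
            continue
        k = max(1, int(len(confs) * keep_fraction))
        thresholds.append(confs[k - 1])
    return thresholds

# Toy usage: class 1 is predicted less confidently than class 0,
# so it receives a lower threshold.
probs = [[0.9, 0.1], [0.8, 0.2], [0.3, 0.7], [0.4, 0.6]]
print(class_adaptive_thresholds(probs))  # [0.9, 0.7]
```

A single global threshold would keep many pixels of the easy class and almost none of the hard one; the per-class quantile keeps the pseudo-label set class-balanced, which is the motivation stated above.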