English Abstract | Semantic segmentation of remote sensing images is a long-standing research topic in remote sensing image processing, aiming to assign a semantic class to every pixel of an image. The task supports many important applications, such as urban planning, environmental monitoring, crop analysis, on-satellite processing, and disaster investigation. Recently, the introduction of deep learning has brought rapid progress to remote sensing image semantic segmentation. However, existing deep learning-based methods still face several challenges: (1) they rely on large amounts of annotated data for training and only perform well on test data drawn from the same distribution; when the distributions differ, a model trained on labeled data (the source domain) generalizes poorly to other unlabeled data (the target domain); (2) because remote sensing images are acquired under diverse conditions (\textit{e.g.}, different sensors, ground sampling distances, imaging spectral bands, and acquisition regions), producing pixel-wise annotations for every scene is time-consuming and laborious. Fortunately, transfer learning can extract knowledge from the source domain to improve model performance in a target domain that lacks annotations. Therefore, this dissertation studies transfer learning-based semantic segmentation of remote sensing images. Its main contributions and innovations are summarized as follows:
-
A domain adaptation method for semantic segmentation of remote sensing images based on domain similarity is proposed. By integrating a domain similarity discriminator (DSD) into an adversarial learning framework, the method generates domain-invariant features and thereby eliminates the domain shift. Specifically, the method first observes that existing adversarial learning-based domain adaptation approaches use the information of the source and target domains independently and do not combine information across the two domains. To reduce the influence of domain-irrelevant information on adversarial learning, the DSD distinguishes the domain similarity of input feature pairs, exploiting the similarity of pairs drawn from the same domain and the dissimilarity of pairs drawn from different domains, so that adversarial learning focuses on domain-related features. The effectiveness of the proposed method is verified by comparative experiments across different cross-domain settings.
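The pair-construction step behind the DSD can be sketched as follows. This is a minimal illustration, not the dissertation's implementation: features are treated as plain vectors, the function name `make_dsd_pairs` is hypothetical, and the DSD itself would be a binary classifier trained on the resulting labeled pairs.

```python
from itertools import product

def make_dsd_pairs(source_feats, target_feats):
    """Build (feature_a, feature_b, label) triples for a domain
    similarity discriminator: pairs drawn from the same domain
    (including self-pairs) are labeled 1 (similar), while
    cross-domain pairs are labeled 0 (dissimilar)."""
    pairs = []
    # same-domain pairs -> similar (label 1)
    for fa, fb in product(source_feats, source_feats):
        pairs.append((fa, fb, 1))
    for fa, fb in product(target_feats, target_feats):
        pairs.append((fa, fb, 1))
    # cross-domain pairs -> dissimilar (label 0)
    for fa, fb in product(source_feats, target_feats):
        pairs.append((fa, fb, 0))
    return pairs

# Toy usage: two source and two target feature vectors.
src = [[0.1, 0.2], [0.3, 0.4]]
tgt = [[0.9, 0.8], [0.7, 0.6]]
pairs = make_dsd_pairs(src, tgt)
print(len(pairs))  # 4 same-source + 4 same-target + 4 cross = 12
```

Training the discriminator on such pairs is what lets adversarial learning exploit both within-domain similarity and cross-domain dissimilarity, rather than each domain in isolation.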
-
A domain adaptation method for remote sensing image semantic segmentation based on global and class-wise alignment is proposed. Its core idea is to combine adversarial learning with a self-training strategy to align the distributions of different domains both globally and class-wise. On the one hand, a triplet adversarial global distribution alignment network is proposed to address two problems: adversarial learning is destabilized by domain-irrelevant noise in the feature layer, and the classifier struggles to adapt to target-domain features. Operating in the output space of the segmentation network, this network explicitly uses source- and target-domain information to narrow the distribution gap between domains through triplet adversarial learning, and the classifier is optimized on features from both domains. On the other hand, a discriminative confidence self-training strategy is proposed to achieve class-wise alignment between domains. In this strategy, the discriminator's output measures whether the target data have eliminated the global distribution bias; pseudo-labels are then generated for target pixels with small global distribution bias, and the segmentation network is fine-tuned with these pseudo-labels to align the class-wise distributions of the two domains. Experimental evaluations in multiple cross-domain scenarios demonstrate the effectiveness and generalization of the proposed method.
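The discriminative confidence selection step can be sketched as below. This is a simplified per-pixel illustration under assumed conventions: `select_pseudo_labels`, the threshold values, and the use of `-1` as an ignore label are all hypothetical choices, not the dissertation's exact formulation.

```python
def select_pseudo_labels(probs, disc_scores, disc_thresh=0.5, conf_thresh=0.9):
    """For each target pixel, assign a pseudo-label only when (a) the
    discriminator score suggests the pixel is already well aligned
    (small global distribution bias) and (b) the segmentation network
    is confident.  Unselected pixels get -1 (ignored when fine-tuning)."""
    pseudo = []
    for p, d in zip(probs, disc_scores):
        conf = max(p)
        if d >= disc_thresh and conf >= conf_thresh:
            pseudo.append(p.index(conf))  # argmax class as pseudo-label
        else:
            pseudo.append(-1)  # ignore label
    return pseudo

# Toy usage: per-pixel class probabilities and discriminator scores.
probs = [[0.95, 0.05], [0.6, 0.4], [0.05, 0.95]]
disc  = [0.8, 0.9, 0.3]
print(select_pseudo_labels(probs, disc))  # [0, -1, -1]
```

The key design point is that the discriminator output, normally used only for adversarial training, here doubles as a per-pixel gate: only pixels whose global distribution bias is already small contribute pseudo-labels to the class-wise alignment stage.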
-
A domain adaptation method for remote sensing image semantic segmentation based on triplet loss adversarial learning and cross-consistency constraints is proposed. It addresses three weaknesses of current methods: unstable adversarial learning, classes that differ in how hard they are to transfer across domains, and insufficient use of boundary data. To stabilize adversarial learning, a global alignment network based on the triplet loss is proposed. This network integrates the triplet loss into the adversarial learning framework and stabilizes training by optimizing the relative distance between the distribution similarities of different domains; it also remains effective when the original target-domain data are scarce. To handle classes of differing transfer difficulty, an adaptive class-aware pseudo-label selection strategy is proposed. By estimating each class's prior prediction probability in the target domain, the strategy adaptively sets the threshold for generating pseudo-labels for each class, yielding reliable and class-balanced pseudo-labels. Finally, a cross-mean teacher network is proposed to exploit boundary data. It alleviates the over-coupling problem of the mean teacher network through a cross-consistency constraint and makes effective use of target data both with and without pseudo-labels, thereby enhancing the segmentation network's ability to delineate object boundaries. Extensive and comprehensive cross-domain experiments verify the effectiveness and necessity of each module in the proposed method.
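The adaptive class-aware thresholding idea can be sketched as follows. This is an illustrative quantile-based variant under stated assumptions: the function name `class_adaptive_thresholds` and the `keep_fraction` parameter are hypothetical, and the dissertation's estimate of each class's prior prediction probability may differ in detail.

```python
def class_adaptive_thresholds(probs, keep_fraction=0.5):
    """Estimate one confidence threshold per class so that roughly the
    top `keep_fraction` most-confident predictions of each class are
    kept.  A hard-to-transfer class (low confidences overall) thus gets
    a lower threshold than an easy one, balancing the pseudo-labels."""
    num_classes = len(probs[0])
    thresholds = []
    for c in range(num_classes):
        # confidences of pixels whose argmax prediction is class c
        confs = sorted((p[c] for p in probs if p.index(max(p)) == c),
                       reverse=True)
        if not confs:
            thresholds.append(1.0)  # class absent: keep nothing
            continue
        k = max(1, int(len(confs) * keep_fraction))
        thresholds.append(confs[k - 1])
    return thresholds

# Toy usage: class 1 is predicted less confidently than class 0,
# so it receives a lower threshold.
probs = [[0.9, 0.1], [0.8, 0.2], [0.3, 0.7], [0.4, 0.6]]
print(class_adaptive_thresholds(probs))  # [0.9, 0.7]
```

A single global threshold would keep many pixels of the easy class and almost none of the hard one; the per-class quantile keeps the pseudo-label set class-balanced, which is the motivation stated above.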