With the rapid development of society, practical applications continuously generate huge amounts of data, and machine learning has become one of the main tools for intelligent data analysis. Despite significant improvements in performance and stability, machine learning models are prone to errors when applied to new application scenarios that differ from the training environment. Models commonly fail to adapt to new environments for two reasons: (1) It is difficult to collect large-scale, high-quality labeled data in new application scenarios for updating model parameters. (2) Traditional machine learning algorithms assume that training and test data are independent and identically distributed, an assumption that no longer holds once the application environment differs from the training environment. Domain adaptation is one of the techniques that address learning in new application scenarios, which usually lack annotation information and are more open than the original training environment. For example, multimodal data, numerous unknown categories, missing data, and other uncertain situations may arise in new application scenarios. Therefore, studying unsupervised domain adaptation in the open environment meets the requirements of practical applications and is valuable for both theoretical and applied research.

To design robust unsupervised domain adaptation models in the open environment, four factors should be taken into consideration: (1) Diversity of data. The data types are rich and diverse, including images, texts, etc. Moreover, data from different domains vary in illumination, quality, style, etc., which results in a significant domain gap. (2) Domain alignment. The core challenge of unsupervised domain adaptation is how to align the source and target domains while avoiding overfitting, under-fitting, under-adaptation, and negative transfer. (3) Lack of interpretability in model inference. The open environment contains many uncertain factors, so algorithms should provide convincing explanations that help keep model predictions safe and controllable. (4) Complex and diverse open environment. Algorithms need to deal with uncertain situations in which there are many unknown categories and complete data is not accessible for training. To address these challenges, we study unsupervised domain adaptation methods in the open environment. Specifically, to learn transferable representations, we study standard unsupervised domain adaptation and interpretable unsupervised domain adaptation methods. Furthermore, we explore open-set unsupervised domain adaptation and unknown-domain multimodal domain adaptation to generalize transferable representations to the open environment.

The major contributions of this dissertation are summarized as follows:

1. Graph convolutional adversarial network for standard unsupervised domain adaptation. Three types of information are important for bridging the source and target domains: data structure, domain label, and class label. Most existing domain adaptation approaches exploit only one or two of these and cannot make them complement and enhance each other. To jointly model data structure, domain label, and class label in a unified deep model, we propose a graph convolutional adversarial network. The proposed model comprises three alignment mechanisms, namely structure-aware alignment, domain alignment, and class centroid alignment, which together learn domain-invariant representations that reduce the domain gap.

2. Spatial pyramid prototypical network for interpretable unsupervised domain adaptation. Despite their superior performance, deep unsupervised domain adaptation methods are inherently "black-box" models that cannot provide explanatory information. To design an interpretable unsupervised domain adaptation method, we propose a spatial pyramid prototypical network that learns a set of domain-shared semantic prototypes, which can be used to classify images interpretably and to visualize the model inference process. Furthermore, we propose a self-predictive consistency pseudo-label strategy, which fuses the information of prototypes, predictions, and classification confidences. This strategy annotates target samples with pseudo labels, which helps in learning more transferable prototypes. The experimental results demonstrate the superior performance and interpretability of the proposed method.

3. Active universal adaptation network for open-set unsupervised domain adaptation. Most unsupervised domain adaptation methods rely on prior knowledge of the label sets, and few methods can identify the specific categories of "unknown" samples. To address these limitations, we propose an active universal adaptation network, which removes all label set assumptions and aims to infer specific categories for all target samples. We first introduce a curriculum learning framework, which selects suitable samples for aligning the source and target domains, reducing the risk of negative transfer and overfitting during knowledge transfer. Furthermore, we propose an active learning strategy that uses the clues of transferability, uncertainty, and diversity to annotate informative unknown target samples, making it possible to infer specific categories for unknown samples. By jointly training the curriculum learning framework and the active learning strategy, the model can infer categories for all target samples.

4. Meta-learning multimodal user generalization for unknown-domain multimodal domain adaptation. Because practical applications operate in the open environment, target data may be unavailable for model training. For example, an online retrieval model needs to serve numerous unknown users, whose data is impossible to collect in advance. Based on the cross-modal retrieval application, we define the concepts of user domain and user domain shift, and the task of unknown-domain multimodal unsupervised domain adaptation. To address this problem, we first design an attention mechanism module that performs feature selection when measuring multimodal similarity. Furthermore, to encode transferable knowledge among different user domains into multimodal features, we propose a user-adaptive meta-optimization method that adaptively aggregates gradient information by simulating user domain shift. With this user-adaptive meta-optimization, the model can learn parameters that are robust to user domain shift, which enhances its generalization to unknown user domains.
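As a rough illustration of the class centroid alignment named in contribution 1, the sketch below computes one centroid per class in each domain and penalizes the distance between matching centroids. The abstract does not give the actual loss, so everything here is an assumption made for illustration: the function names `class_centroids` and `centroid_alignment_loss` are hypothetical, the distance is squared Euclidean, target labels are taken to be pseudo labels, and classes seen in only one domain are skipped.

```python
from collections import defaultdict


def class_centroids(features, labels):
    """Return {label: mean feature vector} for the given samples."""
    sums = {}
    counts = defaultdict(int)
    for vec, lab in zip(features, labels):
        if lab not in sums:
            sums[lab] = [0.0] * len(vec)
        # Accumulate the per-dimension sum for this class.
        sums[lab] = [s + v for s, v in zip(sums[lab], vec)]
        counts[lab] += 1
    return {lab: [s / counts[lab] for s in total] for lab, total in sums.items()}


def centroid_alignment_loss(src_feats, src_labels, tgt_feats, tgt_pseudo_labels):
    """Mean squared Euclidean distance between matching class centroids.

    Assumption: classes that appear in only one domain are ignored.
    """
    src = class_centroids(src_feats, src_labels)
    tgt = class_centroids(tgt_feats, tgt_pseudo_labels)
    shared = set(src) & set(tgt)
    if not shared:
        return 0.0
    total = 0.0
    for lab in shared:
        total += sum((a - b) ** 2 for a, b in zip(src[lab], tgt[lab]))
    return total / len(shared)


# Toy usage: two classes, two-dimensional features.
source_features = [[0.0, 0.0], [2.0, 0.0], [0.0, 4.0]]
source_labels = [0, 0, 1]
target_features = [[1.0, 1.0], [0.0, 0.0]]
target_pseudo_labels = [0, 1]
loss = centroid_alignment_loss(
    source_features, source_labels, target_features, target_pseudo_labels
)
```

In a deep model, a term of this kind would typically be computed on mini-batch features and added to the classification and domain-adversarial objectives; how the dissertation weights and combines the three alignment terms is not stated in the abstract.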


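Keywords: unsupervised domain adaptation; graph convolutional network; interpretable deep learning; active learning; meta-learning
Subject area: Artificial Intelligence; Artificial Intelligence Theory; Pattern Recognition
Discipline: Engineering; Engineering::Computer Science and Technology (engineering or science degrees may be conferred)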
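Citation (GB/T 7714):
Ma Xinhong (马昕宏). Research on Unsupervised Domain Adaptation in the Open Environment [D]. Institute of Automation, Chinese Academy of Sciences, 2022.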
File name / size: 博士学位论文-马昕宏-面向开放环境的无监 (14125KB)
Document type: Dissertation
Access type: Restricted access
License: CC BY-NC-SA
