Research on Unsupervised Domain Adaptation for Open Environments
马昕宏
2022-05-20
Pages: 156
Subtype: Doctoral
Abstract

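With the rapid development of society, industries that serve the national economy and people's livelihood, such as industrial production, security surveillance, finance, and the Internet, generate large amounts of data at every moment, and machine learning models have become one of the main tools for intelligent data analysis. Although the accuracy and stability of machine learning models have improved greatly, the models are prone to errors when applied to new application scenarios that differ significantly from the training environment. This difficulty in adapting to new application environments is widespread, and it can be attributed to two causes: (1) large-scale, high-quality labeled data for updating model parameters is hard to obtain in new application scenarios; (2) traditional machine learning algorithms require training and test data to satisfy the independent and identically distributed assumption. Domain adaptation is one of the techniques that address the learning problem of machine learning models in new application scenarios. In such scenarios the data usually carries no annotation at all, and the scenario is more open than the original training environment: for example, the data is multimodal, contains a large number of unknown categories, and is subject to missing data and other uncertain factors. Studying unsupervised domain adaptation for open environments therefore better matches the needs of practical applications and has significant theoretical and applied research value.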

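Designing a robust unsupervised domain adaptation model for open environments requires considering four factors: (1) Diversity of data. Data types are rich and varied, including images and text; data from different domains differs in illumination, quality, style, and other aspects, which produces significant domain discrepancy. (2) Domain alignment. How to align domains is the core challenge of unsupervised domain adaptation, and it requires avoiding overfitting, underfitting, under-adaptation, and negative transfer. (3) Lack of interpretability in model decisions. Open environments contain many uncertain factors, so a model needs to produce sufficiently convincing explanations to keep its decisions safe and controllable. (4) Complexity and diversity of open environments. Algorithms must cope with a large number of unknown categories and with training data that cannot be obtained in full. To address these challenges, this dissertation studies unsupervised domain adaptation for open environments. It first studies standard and interpretable unsupervised domain adaptation methods to learn transferable feature representations, and then explores open-set unsupervised domain adaptation and unknown-domain multimodal domain adaptation methods to generalize the transferable features to open environments.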

The main work and contributions of this dissertation are summarized as follows:

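1. A standard unsupervised domain adaptation method based on a graph convolutional adversarial network. Three kinds of information are essential for relating the source and target domains: data structure, domain labels, and class labels. Most standard unsupervised domain adaptation methods use only one or two of them and therefore cannot make the different kinds of information complement and reinforce one another. To model data structure, domain labels, and class labels jointly in a unified deep network, this dissertation proposes a graph convolutional adversarial network. The model incorporates three alignment mechanisms, namely structure-aware alignment, domain alignment, and class centroid alignment, which effectively learn domain-shared feature representations and reduce domain discrepancy.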

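2. An interpretable unsupervised domain adaptation method based on a spatial pyramid prototypical network. Despite their superior performance, deep unsupervised domain adaptation methods are "black-box" models and cannot provide good explanatory information. This dissertation proposes an interpretable unsupervised domain adaptation method based on a spatial pyramid prototypical network, which learns a set of domain-shared semantic concept prototype vectors that support interpretable classification and visualization of the model's decision process. It further proposes a self-predictive consistency pseudo-label strategy that fuses three kinds of information: prototype vectors, prediction results, and classification confidence. The strategy assigns pseudo-labels to target-domain samples and assists in learning transferable prototype vectors. Experimental results demonstrate the method's superior performance and interpretability.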

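3. An open-set unsupervised domain adaptation method based on an active universal adaptation network. Most unsupervised domain adaptation methods rely on prior knowledge of the label sets and cannot identify the specific categories of unknown-class samples. To address these limitations, this dissertation designs an active universal adaptation network that identifies the specific category of every target-domain sample without any assumption about the label sets. The model first uses a curriculum learning framework that selects suitable samples for aligning the source and target domains, reducing the risk of negative transfer and overfitting during knowledge transfer. To learn to infer unknown categories, the dissertation further designs an active learning strategy that fuses the transferability, uncertainty, and diversity of samples to obtain annotations for unknown-class target samples. Through joint learning of the curriculum learning framework and the active learning strategy, the model recognizes target-domain samples of all categories.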

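4. An unknown-domain multimodal domain adaptation method based on multimodal user-generalization meta-learning. In practical open-environment applications, a model may have no access to target-domain data for training. For example, an online retrieval model must serve a large number of unknown users, yet collecting data from all users is nearly impossible. Taking cross-modal retrieval as the application background, this dissertation defines the concepts of user domain and user domain shift as well as the problem of unknown-domain multimodal domain adaptation. To solve this problem, it proposes a multimodal user-generalization meta-learning model. The model first uses an attention module to select the multimodal features that take part in similarity measurement. To encode knowledge transferable across user domains into the multimodal feature representations, the dissertation proposes a user-adaptive meta-learning optimization method that adaptively aggregates gradient information by simulating user domain shift, learns parameters robust to user domain shift, and improves generalization to unknown user domains.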
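The class centroid alignment named in contribution 1 can be illustrated with a small sketch. This is not the dissertation's implementation: it assumes features are plain Python lists, that target samples carry pseudo-labels, and that the alignment penalty is the mean squared Euclidean distance between matching class centroids. The function names `class_centroids` and `centroid_alignment_loss` are hypothetical.

```python
from collections import defaultdict


def class_centroids(features, labels):
    """Return {label: mean feature vector} for the given samples."""
    sums = {}
    counts = defaultdict(int)
    for vec, lab in zip(features, labels):
        if lab not in sums:
            sums[lab] = [0.0] * len(vec)
        sums[lab] = [s + v for s, v in zip(sums[lab], vec)]
        counts[lab] += 1
    return {lab: [s / counts[lab] for s in total] for lab, total in sums.items()}


def centroid_alignment_loss(src_feats, src_labels, tgt_feats, tgt_pseudo_labels):
    """Mean squared distance between source and target centroids of shared classes.

    Assumption: target labels are pseudo-labels produced elsewhere; classes
    that appear in only one domain are ignored.
    """
    src_c = class_centroids(src_feats, src_labels)
    tgt_c = class_centroids(tgt_feats, tgt_pseudo_labels)
    shared = set(src_c) & set(tgt_c)
    if not shared:
        return 0.0
    total = sum(
        sum((a - b) ** 2 for a, b in zip(src_c[k], tgt_c[k])) for k in shared
    )
    return total / len(shared)


if __name__ == "__main__":
    source = [[0.0, 0.0], [2.0, 0.0], [1.0, 1.0]]
    target = [[1.0, 0.0], [1.0, 3.0]]
    print(centroid_alignment_loss(source, [0, 0, 1], target, [0, 1]))
```

In a deep model the same quantity would typically be computed on mini-batch features and minimized together with the classification and domain-adversarial objectives; how the dissertation weights these terms is not stated in the abstract.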

 

Other Abstract

With the rapid development of society, many practical applications generate huge amounts of data continuously, and machine learning has become one of the main tools for intelligent data analysis. Despite significant improvements in accuracy and stability, machine learning models are prone to errors when applied to new application scenarios that differ from the training environment. This failure to adapt to new application environments is common, for two reasons: (1) It is difficult to collect large-scale, high-quality labeled data in new application scenarios for updating model parameters. (2) Traditional machine learning algorithms assume that training and test data are independent and identically distributed. Domain adaptation is one of the techniques that address learning in new application scenarios, which usually lack any annotation and are more open than the original training environment. For example, new application scenarios may involve multimodal data, numerous unknown categories, missing data, and other uncertain factors. Studying unsupervised domain adaptation in the open environment therefore meets the requirements of practical applications and is valuable for both theoretical and applied research.

To design robust unsupervised domain adaptation models in the open environment, four factors should be taken into consideration: (1) Diversity of data. Data types are rich and diverse, including images, text, etc. In addition, data from different domains vary in illumination, quality, style, etc., which results in a significant domain gap. (2) Domain alignment. The core challenge of unsupervised domain adaptation is how to align the source and target domains while avoiding overfitting, underfitting, under-adaptation, and negative transfer. (3) Lack of interpretability in model inference. The open environment contains many uncertain factors, so algorithms should provide convincing explanations that keep model predictions safe and controllable. (4) Complex and diverse open environment. Algorithms need to deal with uncertain situations in the open environment, where there are many unknown categories and the complete data is not accessible for training. To address these challenges, we study unsupervised domain adaptation methods in the open environment. Specifically, to learn transferable representations, we study standard unsupervised domain adaptation and interpretable unsupervised domain adaptation methods. Furthermore, we explore open-set unsupervised domain adaptation and unknown-domain multimodal domain adaptation to generalize transferable representations to the open environment.

The major contributions of this dissertation are summarized as follows:

1. Graph convolutional adversarial network for standard unsupervised domain adaptation. Three types of information are important for bridging the source and target domains: data structure, domain label, and class label. Most existing domain adaptation approaches exploit only one or two of these types and cannot make them complement and enhance each other. To jointly model data structure, domain label, and class label in a unified deep model, we propose a graph convolutional adversarial network. The proposed model includes three alignment mechanisms, namely structure-aware alignment, domain alignment, and class centroid alignment, which learn domain-invariant representations effectively and reduce the domain gap.

2. Spatial pyramid prototypical network for interpretable unsupervised domain adaptation. Despite their superior performance, deep unsupervised domain adaptation methods are inherently "black-box" models that cannot provide explanatory information. To design an interpretable unsupervised domain adaptation method, we propose a spatial pyramid prototypical network that learns a set of domain-shared semantic prototypes, which can be used to classify images interpretably and to visualize the model inference process. Furthermore, we propose a self-predictive consistency pseudo-label strategy, which fuses the information of prototypes, predictions, and classification confidences. This strategy annotates target samples with pseudo-labels, which helps in learning more transferable prototypes. The experimental results demonstrate the superior performance and interpretability of the proposed method.

3. Active universal adaptation network for open-set unsupervised domain adaptation. Most unsupervised domain adaptation methods rely on prior knowledge of the label set, and few methods can identify the specific categories of "unknown" samples. To address these limitations, we propose an active universal adaptation network, which removes all label set assumptions and aims to infer specific categories for all target samples. We first introduce a curriculum learning framework, which selects suitable samples to align the source and target domains and thereby reduces the risk of negative transfer and overfitting during knowledge transfer. Furthermore, we propose an active learning strategy that uses the cues of transferability, uncertainty, and diversity to annotate informative unknown target samples, making it possible to infer specific categories for unknown samples. By jointly training the proposed curriculum learning framework and active learning strategy, the model can infer categories for all target samples.

4. Meta-learning multimodal user generalization for unknown-domain multimodal domain adaptation. In practical open-environment applications, target data may be unavailable for model training. For example, an online retrieval model needs to provide services for numerous unknown users, and collecting data from all of them is nearly impossible. Based on the cross-modal retrieval application, we define the concepts of user domain and user domain shift, and the task of unknown-domain multimodal unsupervised domain adaptation. To address this problem, we first design an attention mechanism module to conduct feature selection when measuring multimodal similarity. Furthermore, to encode transferable knowledge among different user domains into multimodal features, a user-adaptive meta-optimization method is proposed to adaptively aggregate gradient information by simulating user domain shift. With the user-adaptive meta-optimization, the model can learn parameters that are robust to user domain shift and generalize better to unknown user domains.
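The idea in contribution 4 of simulating user domain shift and aggregating gradients can be sketched as a generic first-order episodic meta-learning step. This is an illustrative stand-in, not the dissertation's user-adaptive method: it assumes a single scalar parameter `w`, a squared-error loss per user, equal-weight gradient aggregation (the abstract does not specify the adaptive weighting), and it ignores second-order terms. The names `grad` and `meta_step` and the hyperparameters `inner_lr`, `outer_lr`, and `beta` are hypothetical.

```python
def grad(w, targets):
    """Gradient of the mean squared error (w - t)^2 over one user's targets."""
    return sum(2.0 * (w - t) for t in targets) / len(targets)


def meta_step(w, train_users, test_users, inner_lr=0.1, outer_lr=0.1, beta=1.0):
    """One first-order meta-update that simulates a shift between user domains.

    train_users / test_users: lists of per-user target lists. The held-out
    test users play the role of unseen user domains within the episode.
    """
    # Gradient on the meta-train users at the current parameter.
    g_train = sum(grad(w, u) for u in train_users) / len(train_users)
    # Virtual update: where the parameter would move using meta-train users only.
    w_virtual = w - inner_lr * g_train
    # Gradient on the held-out users, evaluated after the virtual update.
    g_test = sum(grad(w_virtual, u) for u in test_users) / len(test_users)
    # Aggregate both gradients (equal weighting is an assumption of this sketch).
    return w - outer_lr * (g_train + beta * g_test)


if __name__ == "__main__":
    w = 0.0
    for _ in range(50):
        w = meta_step(w, train_users=[[1.0], [3.0]], test_users=[[2.0]])
    print(w)
```

In practice the split into meta-train and meta-test users would be resampled every episode so that the learned parameters do not depend on any particular set of users.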

 

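Keyword: unsupervised domain adaptation; graph convolutional network; interpretable deep learning; active learning; meta-learning
Subject Area: Artificial Intelligence; Artificial Intelligence Theory; Pattern Recognition
MOST Discipline Catalogue: Engineering; Engineering::Computer Science and Technology (degrees in engineering or science may be conferred)
Language: Chinese
Document Type: Dissertation
Identifier: http://ir.ia.ac.cn/handle/173211/48514
Collection: Graduates_Doctoral Dissertations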
Recommended Citation
GB/T 7714
马昕宏. 面向开放环境的无监督域适应研究[D]. 中国科学院自动化研究所. 中国科学院自动化研究所,2022.
Files in This Item:
File Name/Size: 博士学位论文-马昕宏-面向开放环境的无监 (14125KB)
DocType: Dissertation
Access: Restricted access
License: CC BY-NC-SA