Research on Robust and Adaptive Deep Learning Methods for Open Environments

Title	Research on Robust and Adaptive Deep Learning Methods for Open Environments
Author	杨红明
Year	2020
Pages	152
Degree type	Doctoral
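Chinese abstract	Traditional pattern recognition methods rest on two basic assumptions: that samples are independent and identically distributed, and that the set of categories is closed. In real open environments these assumptions are usually hard to satisfy. Data distributions change frequently, and unknown samples and new categories keep appearing, which poses great challenges for existing pattern recognition methods. In recent years, with the rapid development of deep learning, the convolutional neural network (CNN) has achieved great success on many pattern recognition tasks, and on certain datasets CNNs can even reach higher recognition accuracy than humans. However, this success depends on regular, closed recognition scenarios. In open environments, CNNs still lack adaptivity to changes in data distribution and robustness to unknown samples and new categories, which severely limits their practical application. This thesis studies deep learning methods for open environments and aims to improve the recognition performance of CNNs there by designing new deep adaptation methods, a robust recognition framework, and deep feature learning methods. The main contributions are as follows: 1. A deep unsupervised adaptation method based on style transfer mapping is proposed and extended from the fully connected layers of a CNN to its convolutional layers. The method learns a linear mapping that transforms target-domain data so as to align the source and target data distributions. Because the mapping has a closed-form analytical solution, the method is highly efficient and can quickly and conveniently adapt a base CNN classifier to multiple target domains. The thesis further proposes performing adaptation after multiple CNN layers to obtain better results. Experiments on a large-scale online Chinese handwriting dataset demonstrate the effectiveness of the method. 2. Motivated by handwriting recognition, the style-mixture adaptation problem and a method for solving it are proposed. In the style-mixture setting, test data come from multiple domains and the domain of each sample is unknown. To address this, a style feature is designed to represent the domain information of a sample, and the test data are clustered on this feature so that each resulting cluster is consistent in style. Adapting the base CNN classifier to each cluster then completes adaptation over the whole test set. Experiments demonstrate the effectiveness of the method on the style-mixture problem. 3. A convolutional prototype network (CPN) is proposed for robust pattern recognition in open environments. CPN discards the softmax operation of conventional CNNs, which rests on the closed-world assumption, and instead uses a prototype model suited to open environments for discrimination. Trained with a combination of discriminative and generative losses, CPN has the strengths of both discriminative and generative models, so it classifies known data correctly while also rejecting unknown data well. Experiments show that CPN has clear advantages in conventional closed-set classification, unknown rejection, and small-sample learning. 4. A general feature representation learning method is proposed for unknown rejection and class-incremental learning in open environments. With CPN as the base architecture, discriminative supervision, generative supervision, and autoencoder-based reconstruction supervision are applied to learn more general feature representations. The discriminative power of such features is not limited to the training categories but generalizes to unknown data and new categories, which benefits both unknown rejection and class-incremental learning. Experiments show that features learned under multiple supervision signals are more general and perform well in conventional closed-set classification, unknown rejection, and class-incremental learning.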
English abstract	Traditional pattern recognition methods are based on two main assumptions: the independent and identically distributed (IID) assumption on samples and the closed-world assumption on categories. In real open environments, however, these assumptions are often violated. Changes in data distribution and the arrival of unknown samples and new categories pose great challenges to existing pattern recognition methods. Recently, with the fast development of deep learning techniques, the convolutional neural network (CNN) has achieved great success in various pattern recognition tasks, and on some specific tasks it can even achieve higher recognition accuracy than humans. However, the success of CNNs usually relies on stationary, closed scenarios. In open environments, a CNN still cannot adapt well to changes in data distribution, and it lacks robustness to unknown samples and novel categories, which greatly limits its application in the real world. This thesis studies deep learning methods for open environments, designing novel deep domain adaptation methods, robust recognition frameworks, and deep representation learning methods to improve the recognition performance of CNNs in open environments. The main contributions are summarized below: (1) A style transfer mapping (STM) based method is proposed for unsupervised adaptation of neural networks, and it is extended from the fully connected layers to the convolutional layers of a CNN. The method directly learns a linear mapping that projects the target-domain data so that its distribution aligns with that of the source domain. The linear mapping has a closed-form solution, so the method is very efficient and can adapt the base CNN classifier to multiple different domains conveniently and rapidly. Moreover, extending adaptation to multiple layers of the CNN further improves performance. Experimental results on a large-scale online handwriting dataset demonstrate the effectiveness of the proposed method. (2) Motivated by the handwriting recognition task, the style-mixture adaptation problem and a corresponding solution are proposed. In the style-mixture setting, the test data come from multiple target domains and the domain information of the data is unknown. To deal with this problem, a style feature is designed to represent the domain information of the data. When the test data are clustered by style feature, each resulting cluster subset has better style consistency. The base CNN can then be adapted to each cluster subset separately, completing the adaptation process over the whole test set. Experimental results on online handwriting datasets demonstrate the effectiveness of the proposed method in the style-mixture setting. (3) A convolutional prototype network (CPN) framework is proposed for robust pattern recognition in open environments. CPN discards the softmax operation of traditional CNNs, which relies on the closed-world assumption, and adopts a prototype model oriented to open environments for discrimination. Trained with a combination of the designed discriminative and generative loss functions, CPN gains the advantages of both discriminative and generative models, which benefits classification of known classes and rejection of unknown samples at the same time. Experimental results show the significant advantages and potential of CPN in multiple tasks, including traditional closed-set classification, unknown rejection, and small-sample learning. (4) A general feature learning method is proposed to handle the unknown rejection and class-incremental learning problems in open environments. Specifically, on the base framework of CPN, discriminative supervision, generative supervision, and autoencoder-based reconstruction supervision are applied simultaneously to learn more general features. The discriminative ability of a more general feature is not limited to known categories but generalizes further to unknown samples and novel categories. With more general features, both unknown rejection and class-incremental learning can be performed better. Experimental results demonstrate the generality of the features learned under multiple supervision: performance on traditional closed-set classification, unknown rejection, and class-incremental learning all improves.
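Keywords	pattern recognition; domain adaptation; unknown rejection; class-incremental learning; convolutional neural network; prototype learning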
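Language	Chinese
Seven major directions: sub-direction classification	Machine learning
Document type	Thesis
Item identifier	http://ir.ia.ac.cn/handle/173211/44420
Collection	State Key Laboratory of Multimodal Artificial Intelligence Systems_Pattern Analysis and Learning
Recommended citation (GB/T 7714)	杨红明. 面向开放环境的鲁棒自适应深度学习方法研究[D]. 中国科学院自动化研究所. 中国科学院大学,2020.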
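
Illustrative note: the prototype-based decision summarized in contribution (3) of the abstract can be pictured with a minimal sketch. It is not the thesis's implementation. It assumes squared Euclidean distance in feature space, a single fixed rejection threshold, and hand-written 2-D features in place of CNN outputs; the names `classify_with_rejection` and `reject_threshold` are hypothetical.

```python
import math
from typing import Dict, List, Optional, Sequence, Tuple


def squared_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def classify_with_rejection(
    feature: Sequence[float],
    prototypes: Dict[str, List[Sequence[float]]],
    reject_threshold: float,
) -> Tuple[Optional[str], float]:
    """Assign the label of the nearest prototype, or reject as unknown.

    Assumption (not from the abstract): a sample is rejected when its
    distance to the nearest prototype exceeds a fixed threshold.
    Returns (label or None, distance to the nearest prototype).
    """
    best_label: Optional[str] = None
    best_dist = math.inf
    for label, class_prototypes in prototypes.items():
        for prototype in class_prototypes:
            dist = squared_distance(feature, prototype)
            if dist < best_dist:
                best_label, best_dist = label, dist
    if best_dist > reject_threshold:
        return None, best_dist  # too far from every known class
    return best_label, best_dist


# Toy 2-D features stand in for CNN outputs (illustrative values only).
prototypes = {"A": [(0.0, 0.0)], "B": [(4.0, 0.0)]}
known = classify_with_rejection((0.5, 0.2), prototypes, reject_threshold=2.0)
unknown = classify_with_rejection((10.0, 10.0), prototypes, reject_threshold=2.0)
print(known[0], unknown[0])
```

In this sketch a sample near a prototype takes that prototype's class, while a sample far from all prototypes is returned as `None`. A softmax layer, by contrast, always distributes probability over the known classes only.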

Files in this item
File name/size	Document type	Version type	Access type	License
Thesis.pdf (1832KB)	Thesis		Open access	CC BY-NC-SA