Research on Prototype-Based Online Adaptive Learning Methods


	Research on Prototype-Based Online Adaptive Learning Methods
	沈媛媛 (Shen Yuanyuan)
	2019-12
Pages	116
Degree type	Doctoral
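Abstract (translated from Chinese)	In recent years, as network bandwidth has grown and sensing and storage devices have continued to develop, collecting data has become ever easier, while the corresponding data processing and analysis techniques face unprecedented challenges. How to use continually growing data resources for continuous, adaptive model learning has become a pressing problem and has drawn wide attention from academia and industry. When immersed in massive dynamic data, humans use continually acquired knowledge to keep adapting to changes in the environment, and the whole learning process is completed under the guidance of only a few teacher signals. This thesis attempts to simulate that cognitive process. It studies machine learning techniques including online semi-supervised learning, online active learning, and continuous adaptation, and proposes a series of prototype-based online adaptive learning methods that are applied to classification tasks. The main work and contributions are summarized as follows. To reduce the burden of labeling samples in a data stream, a prototype-based online semi-supervised learning method called OSS-LVQ is proposed. The method assumes that samples belonging to the same cluster share the same label (that is, each class consists of several clusters) and places a supervised prototype classification model and an unsupervised prototype clustering model in a unified learning framework, so that it can learn the classification boundary directly from labeled samples and use unlabeled samples to further improve classification performance. Experiments on handwritten character images, natural scene images, and text datasets verify the effectiveness of the method. To explore continual learning in an open environment, a prototype-based semi-supervised class-incremental learning method called EvolvingProto is proposed. The method first combines a probabilistic prototype model with a coarse-to-fine two-stage learning strategy to discover samples of novel classes automatically among unlabeled samples; it then uses prototype clustering to select a small number of representative samples for which labels are requested; finally, it memorizes representative patterns to learn the new classes incrementally. Experimental results show that the method effectively discovers novel-class samples in an unlabeled data stream and achieves class-incremental learning with active label requests for only a few samples. To address the continuous change of sample style in a data stream, a prototype-based adaptive learning method called CIALVQ is proposed. The method models sample style with a transformation matrix and adaptively adjusts the matrix as the style changes. The learned model then maps all samples into a unified style space for classification. To reduce labeling cost, the classification model is updated through active learning. Experiments on a handwritten text dataset show that the method exploits local style consistency to improve the classification of handwritten samples, and that the model adapts automatically to the style change when the writer changes.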
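The idea behind OSS-LVQ described above, one prototype set shared by a supervised classifier and an unsupervised clustering model, can be sketched roughly as follows. This is a minimal illustration, not the thesis's algorithm: it assumes the classic LVQ1 rule for labeled samples and a winner-take-all online k-means step for unlabeled ones, and the function names, the `lr` parameter, and the list-based prototype layout are all hypothetical.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest(prototypes, x):
    """Index of the prototype whose vector is closest to x."""
    return min(range(len(prototypes)), key=lambda i: sq_dist(prototypes[i][0], x))


def update(prototypes, x, label=None, lr=0.1):
    """One online step on the shared prototype set (assumed update rules).

    prototypes: list of [vector, class_label] pairs, modified in place.
    label=None marks an unlabeled sample.
    """
    i = nearest(prototypes, x)
    vec, proto_label = prototypes[i]
    if label is None or label == proto_label:
        # Unlabeled sample (clustering step) or correctly matched labeled
        # sample: pull the winning prototype toward x.
        sign = 1.0
    else:
        # Labeled sample won by a prototype of another class: push it away (LVQ1).
        sign = -1.0
    prototypes[i][0] = [v + sign * lr * (xi - v) for v, xi in zip(vec, x)]
    return i


def predict(prototypes, x):
    """Classify x by the label of its nearest prototype."""
    return prototypes[nearest(prototypes, x)][1]
```

Because labeled and unlabeled samples move the same prototypes, unlabeled data refines the cluster centers that the classifier also relies on, which is the sharing the abstract refers to.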
English abstract	In recent years, with increasing network bandwidth and the development of sensing and storage devices, data collection has become cheaper, while data processing and analysis face unprecedented challenges. How to use ever-increasing data to adapt recognition models has become an important and urgent problem, and research on online adaptive learning has attracted much attention from both academia and industry in recent decades. When human beings deal with massive amounts of data, we continually acquire new knowledge to adapt to environmental changes with little supervision. In an attempt to simulate this cognitive mechanism, we study several online learning paradigms, namely online semi-supervised learning, active learning, and continuous adaptation. A number of effective learning methods are proposed and applied to classification tasks. The main contributions of the thesis are summarized as follows:
• To reduce the labeling burden in data streams, we propose a novel online semi-supervised algorithm based on prototype learning, called OSS-LVQ. We assume that data points clustered together are likely to have the same class label (i.e., each class is divided into several clusters). A supervised prototype-based classifier and an unsupervised prototype-based clustering model are learned in a unified framework in which the prototypes are shared by supervised and unsupervised learning. The method can use both labeled and unlabeled samples in the data stream to improve classification performance. We conduct experiments on handwriting image, natural scene image, and text datasets, and verify the effectiveness of the method with respect to classification accuracy and time efficiency.
• To explore the continual learning problem in an open environment, we propose a semi-supervised class-incremental learning method based on learning vector quantization, called EvolvingProto. The method first combines a probabilistic prototype model with a coarse-to-fine learning strategy to detect novel classes among the unlabeled samples in the data stream. A small number of samples from the novel classes are then chosen for labeling based on prototype clustering. Finally, the capability to classify the novel classes is integrated into the model by memorizing critical samples. The experimental results show that the proposed method can discover novel classes in an unlabeled data stream and achieve class-incremental learning with limited labeled samples.
• To tackle the continuous change of sample style in the data stream, we propose an incremental adaptive learning vector quantization method, called CILVQ. In this method, a unified transfer matrix is used to model the style of all samples and is then adjusted according to style changes. Finally, test samples can be classified in the style-free space. To reduce the cost of labeling, we use active learning to update the classifier. The experimental results on a handwritten character dataset show that the proposed method can improve classification performance by exploiting local style consistency. Meanwhile, the classifier automatically adapts to the writing style when the writer changes.
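The detect, query, and memorize loop described for EvolvingProto can be illustrated with a much simpler stand-in. The sketch below is an assumption-laden simplification, not the thesis's method: it replaces the probabilistic prototype model and the coarse-to-fine strategy with a fixed squared-distance threshold, picks the buffered sample nearest the buffer mean as the one to send for labeling, and stores a single new prototype. The class name `NoveltyBuffer` and the parameters `threshold` and `min_samples` are hypothetical.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def mean(points):
    """Component-wise mean of a non-empty list of vectors."""
    n = len(points)
    return [sum(p[d] for p in points) / n for d in range(len(points[0]))]


class NoveltyBuffer:
    """Collects unlabeled samples that lie far from every known prototype."""

    def __init__(self, prototypes, threshold, min_samples):
        self.prototypes = prototypes    # list of (vector, label) pairs
        self.threshold = threshold      # assumed fixed squared-distance cutoff
        self.min_samples = min_samples  # buffer size needed before querying
        self.buffer = []

    def observe(self, x):
        """Buffer x and return True if it looks like an unknown class."""
        if min(sq_dist(v, x) for v, _ in self.prototypes) > self.threshold:
            self.buffer.append(list(x))
            return True
        return False

    def query_candidate(self):
        """Sample to send for labeling: the one nearest the buffer mean."""
        if len(self.buffer) < self.min_samples:
            return None
        center = mean(self.buffer)
        return min(self.buffer, key=lambda p: sq_dist(p, center))

    def add_class(self, label):
        """Store a prototype at the buffer mean under the annotator's label."""
        if not self.buffer:
            raise ValueError("no buffered samples to build a prototype from")
        self.prototypes.append((mean(self.buffer), label))
        self.buffer = []
```

In use, each unlabeled sample goes through `observe`; once `query_candidate` returns a sample, one label request is made and `add_class` extends the model, so only one label is spent per discovered class in this simplified version.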
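Keywords	online semi-supervised learning; online active learning; learning without forgetting; continuous adaptation; prototype learning
Language	Chinese
Document type	Thesis
Item identifier	http://ir.ia.ac.cn/handle/173211/28348
Collection	Graduates_Doctoral Dissertations
Recommended citation (GB/T 7714)	沈媛媛. 基于原型的在线自适应学习方法研究[D]. 中国科学院自动化研究所. 中国科学院大学,2019.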

Files in this item
File name/size	Document type	Version type	Access type	License
基于原型的在线自适应学习方法研究.pdf (4464KB)	Thesis		Restricted access	CC BY-NC-SA