Research on Prototype-Based Online Adaptive Learning Methods (基于原型的在线自适应学习方法研究)
沈媛媛
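Subtype: Doctoral
Thesis Advisor: 刘成林
2019-12
Degree Grantor: University of Chinese Academy of Sciences
Place of Conferral: Institute of Automation, Chinese Academy of Sciences
Degree Discipline: Pattern Recognition and Intelligent Systems
Keyword: online semi-supervised learning; online active learning; non-forgetting learning; continuous adaptation; prototype learning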
Abstract

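In recent years, steady gains in network bandwidth and continued advances in sensing and storage devices have made data ever easier to collect, while the corresponding data processing and analysis techniques face unprecedented challenges. How to use continuously growing data resources for continual, adaptive model learning has become a pressing problem that has drawn wide attention from both academia and industry. Amid massive dynamic data, humans use continually acquired knowledge to keep adapting to new environmental changes, and the whole learning process is completed under the guidance of only a small number of teacher signals. This thesis attempts to simulate this cognitive process. It studies and explores machine learning techniques including online semi-supervised learning, online active learning, and continuous adaptation, proposes a series of prototype-based online adaptive learning methods, and applies them to classification tasks. The main work and contributions are summarized as follows: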
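• To reduce the burden of labeling samples in data streams, a prototype-based online semi-supervised learning method, called OSS-LVQ, is proposed. The method assumes that samples belonging to the same cluster share the same label (that is, each class consists of several clusters), and it places a supervised prototype classification model and an unsupervised prototype clustering model in a unified learning framework. The method can thus learn classification boundaries directly from labeled samples and use unlabeled samples to further improve classification performance. Experiments on handwritten character images, natural scene images, and text datasets verify its effectiveness.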
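• To explore continual learning in open environments, a prototype-based semi-supervised class-incremental learning method, called EvolvingProto, is proposed. The method first combines a probabilistic prototype model with a coarse-to-fine two-stage learning strategy to automatically discover novel-class samples among unlabeled samples. It then uses prototype clustering to select a small number of representative samples for labeling requests. Finally, it memorizes representative patterns to achieve incremental learning of classes. Experimental results show that the method can effectively discover novel-class samples in unlabeled data streams and achieve class-incremental learning with active labeling requests for only a small number of samples.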
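• To address the continuous change of sample style in data streams, a prototype-based adaptive learning method, called CIALVQ, is proposed. The method models sample style with a transformation matrix and adaptively adjusts the matrix as the style changes. The learned model ultimately maps all samples into a unified style space for classification. To reduce labeling cost, the classification model is updated through active learning. Experiments on handwritten text datasets show that the method can exploit local style consistency to improve the classification of handwritten samples, and that the model adapts automatically to style changes when the writer changes.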
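
The shared-prototype idea behind the first contribution can be illustrated with a minimal sketch. It is not the thesis's OSS-LVQ algorithm: it simply pairs the classic LVQ1 rule for labeled samples with an online k-means (winner-take-all) rule for unlabeled samples, both acting on one prototype set. The function names, the learning rate `lr`, and the toy data are assumptions made for illustration.

```python
import numpy as np

def update_prototypes(prototypes, proto_labels, x, y=None, lr=0.1):
    """One online step on a shared prototype set (illustrative only).

    Labeled sample (y given): LVQ1 rule. The nearest prototype moves toward x
    if its class matches y, and away from x otherwise.
    Unlabeled sample (y is None): online k-means rule. The nearest prototype
    moves toward x regardless of its class.
    The prototype array is modified in place; the winner index is returned.
    """
    dists = np.linalg.norm(prototypes - x, axis=1)
    winner = int(np.argmin(dists))
    attract = y is None or proto_labels[winner] == y
    sign = 1.0 if attract else -1.0
    prototypes[winner] += sign * lr * (x - prototypes[winner])
    return winner

def predict(prototypes, proto_labels, x):
    """Nearest-prototype classification."""
    dists = np.linalg.norm(prototypes - x, axis=1)
    return proto_labels[int(np.argmin(dists))]

if __name__ == "__main__":
    # Toy stream: one prototype per class, hypothetical 2-D data.
    protos = np.array([[0.0, 0.0], [4.0, 4.0]])
    labels = np.array([0, 1])
    update_prototypes(protos, labels, np.array([1.0, 1.0]), y=0)  # labeled
    update_prototypes(protos, labels, np.array([3.0, 3.0]))       # unlabeled
    print(protos, predict(protos, labels, np.array([3.0, 3.0])))
```

Because both rules update the same array, unlabeled samples refine the positions that the labeled samples use for classification, which is the general effect the abstract describes.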

Other Abstract

In recent years, with increasing network bandwidth and the development of sensing and storage devices, data collection has become cheaper, while data processing and analysis face unprecedented challenges. How to use ever-increasing data to adapt recognition models has become an important and urgent problem, and research on online adaptive learning has attracted much attention from both academia and industry in recent decades. When dealing with massive amounts of data, humans continually acquire new knowledge to adapt to environmental changes with little supervision. In an attempt to simulate this cognitive mechanism, we study several online learning paradigms, namely online semi-supervised learning, online active learning, and continuous adaptation. We propose several learning methods and apply them to classification tasks. The main contributions of the thesis are summarized as follows:
• To reduce the labeling burden in data streams, we propose a novel online semi-supervised algorithm based on prototype learning, called OSS-LVQ. We assume that data points clustered together are likely to share the same class label (i.e., each class is divided into several clusters). A supervised prototype-based classifier and an unsupervised prototype-based clustering model are learned in a unified framework in which the prototypes are shared by supervised and unsupervised learning. The method can therefore use both labeled and unlabeled samples in the data stream to improve classification performance. Experiments on handwriting images, natural scene images, and text datasets verify the effectiveness of the method in terms of classification accuracy and time efficiency.
• To explore continual learning in an open environment, we propose a semi-supervised class-incremental learning method based on learning vector quantization, called EvolvingProto. The method first combines a probabilistic prototype model with a coarse-to-fine learning strategy to detect novel classes from unlabeled samples in the data stream. A small number of samples from the novel classes are then chosen for labeling based on prototype clustering. Finally, the capability to classify the novel classes is integrated into the model by memorizing critical samples. The experimental results show that the proposed method can discover novel classes in an unlabeled data stream and achieve class-incremental learning with limited labeled samples.
• To tackle the continuous change of sample style in the data stream, we propose an incremental adaptive learning vector quantization method, called CIALVQ. In this method, a unified transformation matrix models the style of all samples and is adjusted as the style changes, so that test samples can be classified in a style-free space. To reduce labeling cost, we use active learning to update the classifier. Experimental results on a handwritten character dataset show that the proposed method improves classification performance by exploiting local style consistency, and that the classifier automatically adapts to the writing style when the writer changes.
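
As a rough illustration of the second contribution's workflow (flag a candidate novel sample, obtain a label, then extend the model), the sketch below uses a plain distance threshold on the nearest prototype. This is a deliberately simplified stand-in: the thesis describes a probabilistic prototype model with a coarse-to-fine strategy, which is not reproduced here. The names `flag_novel` and `add_class` and the `threshold` value are hypothetical.

```python
import numpy as np

def flag_novel(prototypes, x, threshold):
    """Flag x as a candidate novel-class sample (illustrative only).

    Returns (is_novel, nearest_index). A sample is flagged when its distance
    to the nearest prototype exceeds the assumed threshold.
    """
    dists = np.linalg.norm(prototypes - x, axis=1)
    nearest = int(np.argmin(dists))
    return bool(dists[nearest] > threshold), nearest

def add_class(prototypes, proto_labels, x, new_label):
    """Store x as the first prototype of a newly labeled class.

    Returns new arrays; the inputs are left unchanged.
    """
    new_prototypes = np.vstack([prototypes, x])
    new_labels = np.append(proto_labels, new_label)
    return new_prototypes, new_labels

if __name__ == "__main__":
    protos = np.array([[0.0, 0.0], [4.0, 4.0]])
    labels = np.array([0, 1])
    sample = np.array([10.0, 10.0])
    is_novel, _ = flag_novel(protos, sample, threshold=2.0)
    if is_novel:
        # In an active-learning loop, the label would come from an annotator.
        protos, labels = add_class(protos, labels, sample, new_label=2)
    print(protos.shape, labels)
```

Only flagged samples trigger a labeling request, which is how a stream can be processed with few labels.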

Pages: 116
Language: Chinese
Document Type: Thesis
Identifier: http://ir.ia.ac.cn/handle/173211/28348
Collection: Graduates_Doctoral Theses
Recommended Citation
GB/T 7714
沈媛媛. 基于原型的在线自适应学习方法研究[D]. 中国科学院自动化研究所. 中国科学院大学,2019.
Files in This Item:
File Name/Size: 基于原型的在线自适应学习方法研究.pdf (4464KB)
DocType: Thesis
Access: Not yet open
License: CC BY-NC-SA
