Research on Feature Generalization in Few-Shot Learning
Author: 朱晓萌
Date: 2024-05-11
Pages: 64
Degree Type: Master's

Abstract

In recent years, image classification based on deep learning has achieved groundbreaking results in fields such as medicine, agriculture, and industry. However, deep learning typically relies on abundant annotated data and substantial computational resources, which limits the further deployment of image classification in practical applications. How to classify images when samples are scarce has therefore become a prominent research topic in computer vision, and few-shot learning methods have emerged to address this challenge. These methods aim to learn, from a large pool of annotated samples, a base model with good generalization ability that can then adapt quickly to new classes given only a handful of annotated samples and still achieve good recognition performance. Against this background, this thesis conducts an in-depth study of meta-learning, the mainstream approach in few-shot learning, with the goal of further improving the generalization ability of the base model.
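For concreteness, the episodic protocol that meta-learning methods typically train and evaluate with can be summarized in a few lines. The sketch below is illustrative rather than taken from the thesis: it assumes a generic dataset of (image, label) pairs, and the function and parameter names are hypothetical.

```python
import random
from collections import defaultdict

def sample_episode(dataset, n_way=5, k_shot=1, n_query=15):
    """Sample one N-way K-shot episode: a small support set for fast
    adaptation and a query set for evaluating the adapted model.
    Assumes each sampled class has at least k_shot + n_query images."""
    by_class = defaultdict(list)
    for image, label in dataset:
        by_class[label].append(image)

    classes = random.sample(sorted(by_class), n_way)
    support, query = [], []
    for episode_label, cls in enumerate(classes):
        images = random.sample(by_class[cls], k_shot + n_query)
        support += [(img, episode_label) for img in images[:k_shot]]
        query += [(img, episode_label) for img in images[k_shot:]]
    return support, query
```

The base model is meta-trained on many such episodes drawn from the base classes, so that at test time a few support samples from unseen classes suffice for adaptation.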

Furthermore, in some practical application scenarios, a model must not only classify the new classes well from few samples but also retain its classification ability on the base classes, for which a large number of annotated samples are available. For example, in medical image analysis a model must maintain its diagnostic performance on known diseases; otherwise it would pose significant risks. Researchers have therefore extended few-shot learning into the generalized few-shot learning problem, in which the model is expected to maintain classification performance across all classes during training. This task is highly challenging and places higher demands on the model's generalization ability. To address it, this thesis studies the training of base models for generalized few-shot learning in depth, again aiming to improve their generalization ability. The specific research content is as follows:
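Concretely, generalized few-shot learning is evaluated over the joint label space of base and novel classes. The helper below is an illustrative sketch, not part of the thesis; the harmonic mean at the end is a summary statistic commonly reported in this setting, though the abstract does not commit to a specific metric.

```python
def generalized_fsl_accuracy(preds, labels, base_classes):
    """Accuracy over the joint label space, split into base and novel.

    preds/labels are sequences of class ids drawn from the union of
    base and novel classes; base_classes is the set of ids that had
    abundant annotations during training.
    """
    base_hits = base_total = novel_hits = novel_total = 0
    for p, y in zip(preds, labels):
        if y in base_classes:
            base_total += 1
            base_hits += int(p == y)
        else:
            novel_total += 1
            novel_hits += int(p == y)
    base_acc = base_hits / max(base_total, 1)
    novel_acc = novel_hits / max(novel_total, 1)
    # The harmonic mean penalizes sacrificing one split for the other.
    hmean = 2 * base_acc * novel_acc / max(base_acc + novel_acc, 1e-12)
    return base_acc, novel_acc, hmean
```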

(1) Research on the feature generalization of few-shot learning base models. This thesis investigates the poor robustness and unstable training encountered in meta-learning methods and proposes a momentum-based group meta-learning method. First, to address the poor robustness of meta-learning when tasks contain low-quality samples, this thesis, inspired by federated learning, introduces a group meta-learning module. The module trains a group of local models and aggregates their parameters into a global model, thereby weakening the influence of low-quality samples on the model. Second, to address unstable model performance during training, this thesis proposes an adaptive momentum learning module that dynamically adjusts the model update strategy. Finally, extensive validation experiments demonstrate that the two proposed modules effectively improve the generalization ability and stability of the base model and thereby enhance its performance.

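A minimal PyTorch sketch of how the two modules could fit together is given below. It is a reconstruction from the abstract, not the thesis code: the function name is hypothetical, and the momentum coefficient is passed as a plain argument where the thesis adapts it dynamically.

```python
import torch
from torch import nn

@torch.no_grad()
def group_update(global_model: nn.Module, local_models: list, beta: float):
    """One round of momentum group meta-learning (illustrative).

    Each local model is assumed to start from the global weights and be
    meta-trained on its own group of tasks. Averaging the local models
    limits the damage any single low-quality task group can do, and the
    momentum blend keeps the global trajectory stable across rounds.
    """
    for tensors in zip(global_model.parameters(),
                       *(m.parameters() for m in local_models)):
        p_global, p_locals = tensors[0], tensors[1:]
        avg = torch.stack([p.detach() for p in p_locals]).mean(dim=0)
        # Keep a fraction beta of the old global weights, mix in the rest;
        # the thesis adjusts beta adaptively, here it is a fixed argument.
        p_global.mul_(beta).add_(avg, alpha=1.0 - beta)
```
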
(2) Research on the feature generalization of base models in generalized few-shot learning. This thesis investigates the poor generalization caused by imbalanced feature distributions during base-model training and proposes a self-distillation method based on relation features. First, two heuristic experiments are designed to explore the relationship between a model's generalization ability and the characteristics of its feature distribution, and ways to improve the balance of the feature distribution are examined in depth. Second, a self-distillation training framework is designed. The framework operates in the pre-training stage and improves the model's feature distribution by applying self-distillation to a randomly initialized model. Furthermore, a relation-feature construction module is proposed that converts high-dimensional image features into low-dimensional relation features for distillation, which effectively alleviates the restriction that the self-distillation framework would otherwise place on the model's learning capacity. Finally, extensive validation experiments demonstrate that the proposed training framework effectively improves the feature distribution of the base model, thereby enhancing its generalization performance and accuracy.
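The core idea of distilling relations rather than raw features can be sketched in a few lines. The snippet below is an assumption-laden illustration: the abstract does not specify the exact relation function, teacher construction, or loss, so pairwise cosine similarities, a detached teacher output, and a temperature-scaled KL loss stand in for them here.

```python
import torch
import torch.nn.functional as F

def relation_features(embeddings: torch.Tensor) -> torch.Tensor:
    """Map high-dimensional batch features (B, D) to low-dimensional
    relation features: the (B, B) matrix of pairwise cosine similarities."""
    z = F.normalize(embeddings, dim=1)
    return z @ z.t()

def relation_distillation_loss(student_feats, teacher_feats, tau=4.0):
    """Match the student's relation structure to the teacher's by
    comparing row-wise temperature-softened similarity distributions."""
    s = F.log_softmax(relation_features(student_feats) / tau, dim=1)
    t = F.softmax(relation_features(teacher_feats).detach() / tau, dim=1)
    return F.kl_div(s, t, reduction="batchmean") * tau * tau
```

Distilling in the relation space constrains only the relational structure of the batch rather than every feature dimension, which is the property the thesis credits with easing the restriction on the model's learning capacity.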

Keywords: few-shot learning; generalized few-shot learning; generalization performance; meta-learning; self-distillation training framework
Language: Chinese
Subdirection Classification: Image and Video Processing and Analysis
Document Type: Degree Thesis
Identifier: http://ir.ia.ac.cn/handle/173211/57596
Collection: Graduates_Master's Theses
Recommended Citation (GB/T 7714):
朱晓萌. 小样本学习特征泛化性研究[D], 2024.
Files in This Item:
File Name/Size | Document Type | Version | Access | License
小样本学习特征泛化性研究.pdf (6382 KB) | Degree Thesis | | Restricted | CC BY-NC-SA