Research on Few-Shot Image Recognition Algorithms Based on Sample Generation
罗沁轩
2021-05-26
Pages: 94
Degree type: Master's
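Abstract (translated from the Chinese)

In recent years, deep learning has made great progress in computer vision. However, the learning capacity of deep models rests on sufficient training data, and their performance degrades when data is scarce. Many real-world tasks lack enough data to support training. In medical image recognition, for example, identifying rare conditions depends heavily on expert knowledge, so collecting large numbers of labeled samples is unrealistic. It is therefore necessary to find ways to reduce a model's demand for data; such tasks are generally called few-shot tasks. A large number of few-shot learning algorithms have been proposed to date. They differ in idea and technique and can be roughly divided into four categories: metric learning, meta-learning, graph-based methods, and data augmentation. This thesis focuses on few-shot image recognition, takes data augmentation as its main tool, and explores the feasibility of improving classifier performance in few-shot tasks with generated samples. Specifically, it extracts and reuses information in the limited images that is easily overlooked, which keeps the generated samples usable and diverse and thereby improves classification performance when samples are limited.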

The main research contents and contributions of this thesis are as follows:

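1. An attention-based sample augmentation algorithm for few-shot recognition is proposed. Its core idea is to separate and recombine the foreground and background features of images, so that the otherwise low-value background information is fully used to augment the samples while the generated samples remain separable and diverse. Using the attention mechanism as its main tool, the algorithm applies image masks to split the original features into a foreground part and a background part: the foreground determines the class of the image, while the background can be freely combined with any foreground without changing the class. The separated features can be exchanged and recombined into new samples, but these samples must still pass through a feature fusion module that integrates their information, so that samples are distributed sufficiently uniformly in the feature space. The algorithm uses a prototypical network as the classifier. The limited labeled samples of a task form the support set, which directly determines the position of each prototype. Once the support set is augmented with the generated samples, the prototype positions are refined and classification performance improves accordingly. Experiments show that the algorithm does improve on the conventional prototypical network, reaching 77.48% accuracy on the widely used mini-ImageNet dataset, a state-of-the-art level among contemporaneous algorithms.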

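2. A variational-inference-based sample augmentation algorithm for few-shot recognition is proposed. Its core idea is to extract and exploit information about the data distribution: new samples of a class are generated from the distribution exhibited by its few available samples, such that generated and real features have similar distributions and cluster together in the feature space, which is what makes the synthesized features plausible. The key problem is to find a reasonable representation of the distribution. To this end, a low-dimensional latent space is constructed in which samples of the same class are constrained to follow a Gaussian distribution, and an encoder and a decoder built from fully connected layers establish the mappings between the feature space and the latent space. Latent vectors sampled in the latent space can be mapped back to the feature space and used as new samples of the corresponding class in subsequent classifier training, which improves classification performance. Experiments show that, compared with other algorithms of the same kind, the algorithm largely achieves state-of-the-art performance on classic few-shot tasks, reaching 78.6% accuracy on the mini-ImageNet dataset. Each module of the model behaves as designed and is effective and robust.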
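
As an illustration of contribution 1, the foreground/background recombination and the resulting prototype update can be sketched in NumPy. This is a minimal sketch under stated assumptions, not the thesis implementation: the attention masks are taken as given inputs with values in [0, 1], the learned feature fusion module is replaced by a fixed convex combination with a hypothetical weight `alpha`, and all function names are illustrative.

```python
import numpy as np

def split_foreground_background(feature_map, mask):
    """Mask-weighted pooling of a (C, H, W) feature map into two C-dim vectors."""
    eps = 1e-8  # guards against an all-zero or all-one mask
    fg = (feature_map * mask).sum(axis=(1, 2)) / (mask.sum() + eps)
    bg = (feature_map * (1.0 - mask)).sum(axis=(1, 2)) / ((1.0 - mask).sum() + eps)
    return fg, bg

def fuse(fg, bg, alpha=0.8):
    """Assumed stand-in for the learned fusion module: a fixed convex combination."""
    return alpha * fg + (1.0 - alpha) * bg

def augment_support(feature_maps, masks, labels, alpha=0.8):
    """Pair every foreground with every background (including its own).

    The label of each recombined sample follows the foreground, since the
    background is assumed not to carry class information.
    """
    parts = [split_foreground_background(f, m) for f, m in zip(feature_maps, masks)]
    feats, labs = [], []
    for i, (fg, _) in enumerate(parts):
        for _, bg in parts:
            feats.append(fuse(fg, bg, alpha))
            labs.append(labels[i])
    return np.stack(feats), np.array(labs)

def prototypes(features, labels):
    """One prototype per class: the mean of that class's feature vectors."""
    classes = np.unique(labels)
    protos = np.stack([features[labels == c].mean(axis=0) for c in classes])
    return classes, protos

def classify(query, classes, protos):
    """Assign the query to the class of the nearest prototype (Euclidean distance)."""
    dists = np.linalg.norm(protos - query, axis=1)
    return classes[np.argmin(dists)]
```

With N support images, `augment_support` returns N × N feature vectors, so each class prototype is averaged over N times as many points as with the support set alone. Whether this moves the prototypes in a helpful direction depends on the quality of the masks and of the fusion step, which this sketch does not model.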

Abstract (English)

In recent years, deep learning has made great progress in computer vision. However, deep learning models rely on large amounts of training data and cannot perform well without it. Many real-world tasks face the problem of insufficient training data. For instance, labeled medical images of rare diseases are difficult to collect in large quantities, because accurately judging a medical image requires professional knowledge from experts. Reducing the model's dependence on supervised information is therefore necessary for such few-shot tasks. Existing few-shot algorithms are varied and can be divided into four categories: metric learning, meta-learning, graph learning, and data augmentation. This thesis focuses on few-shot image recognition and discusses the feasibility of improving classifier performance by generating features. Specifically, to guarantee the usability and diversity of the generated features, it extracts and reuses information in the limited images that is easily overlooked, and thereby improves the recognition accuracy of the classifier when images are limited.

The main research contents and contributions of this thesis are as follows:

1. This thesis proposes a feature hallucination algorithm based on the attention mechanism. The core idea is to separate and exchange the foreground and background features of images, making full use of the otherwise low-value background information to generate samples while keeping the generated samples separable and diverse. The algorithm uses the attention mechanism as its main tool and divides the original features into foreground and background parts based on masks. The foreground part determines the category of the image, and the background part can be combined freely with any foreground without changing the category. The separated features can be turned into new samples by exchange and recombination, but these samples must be integrated by a feature fusion module to ensure that samples are distributed uniformly enough in the feature space. A prototypical network is used as the classifier, and the position of each prototype is determined directly by the support set, which consists of a limited number of labeled samples. When the support set is augmented with the generated samples, the prototype positions are refined and classification performance improves. Experimental results show that the method improves on the conventional prototypical network and achieves state-of-the-art results among contemporaneous methods (77.48% top-1 accuracy on the mini-ImageNet dataset).

2. This thesis proposes a feature hallucination algorithm based on variational inference. The core idea is to extract the distribution information of sample clusters and to generate samples from it. The generated features should have a distribution similar in shape to that of the real features and should cluster with them in the feature space. To find a reasonable representation of the distribution information, the method constructs a latent space in which the samples of each category are constrained to follow a Gaussian distribution. It also uses an encoder and a decoder to build the mappings between the feature space and the latent space. Latent vectors sampled from the distribution in the latent space can then be mapped back to the feature space and take part in classification as augmented samples of the corresponding category, which sharpens the decision boundary of the classifier. Experimental results show that the method achieves state-of-the-art results among representative methods (78.6% top-1 accuracy on the mini-ImageNet dataset), and that every module is effective and robust, as expected.
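
The sampling pipeline of contribution 2 can be sketched in NumPy as follows. This is a rough sketch under several assumptions that the abstract does not confirm: the encoder and decoder are single linear layers with random, untrained placeholder weights (in practice they would be learned), the dimensions `FEAT_DIM` and `LATENT_DIM` are hypothetical, the class Gaussian is estimated by averaging the encoder outputs over the support samples, and the regulariser shown is the generic VAE KL term toward a standard normal rather than whatever constraint the thesis actually uses.

```python
import numpy as np

FEAT_DIM, LATENT_DIM = 64, 8  # hypothetical sizes, chosen only for illustration

# Untrained, randomly initialised weights: placeholders for learned
# fully connected encoder and decoder layers.
_init = np.random.default_rng(0)
W_mu = _init.normal(scale=0.1, size=(FEAT_DIM, LATENT_DIM))
W_logvar = _init.normal(scale=0.1, size=(FEAT_DIM, LATENT_DIM))
W_dec = _init.normal(scale=0.1, size=(LATENT_DIM, FEAT_DIM))

def encode(x):
    """Map features (N, FEAT_DIM) to Gaussian parameters (mean, log-variance)."""
    return x @ W_mu, x @ W_logvar

def decode(z):
    """Map latent vectors (N, LATENT_DIM) back to the feature space."""
    return z @ W_dec

def kl_to_standard_normal(mu, logvar):
    """KL(N(mu, diag(exp(logvar))) || N(0, I)) for each row."""
    return 0.5 * np.sum(np.exp(logvar) + mu**2 - 1.0 - logvar, axis=1)

def hallucinate(class_features, n_new, rng):
    """Generate n_new features for one class from its few support features."""
    mu, logvar = encode(class_features)
    mu_c = mu.mean(axis=0)                      # class-level latent mean
    std_c = np.exp(0.5 * logvar.mean(axis=0))   # class-level latent std
    eps = rng.standard_normal((n_new, LATENT_DIM))
    z = mu_c + std_c * eps                      # reparameterised sampling
    return decode(z)
```

For a 5-shot class, `hallucinate(support, 20, np.random.default_rng(1))` with `support` of shape `(5, FEAT_DIM)` returns 20 generated feature vectors, which could be appended to the support set before training or evaluating the classifier.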

Keywords: image recognition, few-shot learning, feature generation, attention mechanism, variational inference
Language: Chinese
Seven major directions, sub-direction classification: object detection, tracking, and recognition
Document type: Thesis
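Item identifier: http://ir.ia.ac.cn/handle/173211/44709
Collection: State Key Laboratory of Multimodal Artificial Intelligence Systems_Advanced Spatio-Temporal Data Analysis and Learning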
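Recommended citation
GB/T 7714
罗沁轩. Research on Few-Shot Image Recognition Algorithms Based on Sample Generation [D]. Third Conference Room, 3rd Floor, Intelligent Building, Institute of Automation, Chinese Academy of Sciences. Institute of Automation, Chinese Academy of Sciences, 2021.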
Files in this item
File name/size: 罗沁轩_基于样本生成的小样本图像识别算法 (6442KB)
Document type: Thesis
Access type: Open access
License: CC BY-NC-SA