Research on Graph-Structure-Based Zero-Shot Image Classification Methods


	Research on Graph-Structure-Based Zero-Shot Image Classification Methods
	Zhang Xinyue (张昕悦)
	2023-05-25
Pages	86
Degree type	Master's
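Abstract (translated from Chinese)	In recent years, deep learning has produced many notable results in image classification. However, deep learning models depend on large-scale training datasets, which limits their use in complex, dynamic scenarios. In some real-world settings, collecting and annotating a large-scale dataset takes substantial labor and material resources, and may even be infeasible for reasons such as policy or confidentiality restrictions. Zero-shot learning gives a model the ability to classify novel classes without any training samples from those classes, which addresses the dependence on large-scale datasets. A graph structure, as a non-Euclidean data type, represents node features and also makes the connections between nodes explicit. In zero-shot learning, this structured representation gives the model an important basis for reasoning and helps information propagate between base and novel classes during inference. This thesis therefore studies graph-structure-based zero-shot image classification methods. The main results are as follows: 1. To address the information-domain shift that arises when zero-shot learning uses auxiliary information to solve visual tasks, this thesis proposes a zero-shot image classification method based on a hybrid graph. The method builds the hybrid graph from the semantic descriptions of the classes and from visual experience to improve the classification performance of the zero-shot learning model. The hybrid-graph idea also achieves good results on few-shot image classification, a generalization of the zero-shot problem. The model improves classification performance over other zero-shot learning models on both coarse-grained and fine-grained datasets. 2. To further alleviate the information-domain shift caused by reasoning in the semantic space, this thesis takes the time dimension as its starting point and proposes a graph-structure-based continual zero-shot image classification method. Inspired by theories of human cognitive development, the method collects the image information that arrives sequentially in the model's actual application scenario and, using classifiers inferred from a knowledge graph, converts the unlabeled image information into an online-updated graph structure with soft labels, so that the model's classification ability is updated online continually. Experiments verify that the online learning method improves the model's zero-shot classification accuracy. 3. To address the scene-domain shift between old and new tasks, this thesis proposes a zero-shot image classification method for open-domain scenarios. The method uses adversarial learning to align the data distributions of the source and target domains, reducing the effect of scene changes on new tasks. It also separates out the unknown classes in the target domain and then introduces class semantic information in the form of a graph structure to divide the unknown classes at a fine-grained level, improving the model's classification ability in both the class domain and the scene domain. Experiments verify that the model performs better in open-domain scenarios than zero-shot classification models and domain adaptation models.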
Abstract (English)	Deep learning has achieved great success on image classification tasks. However, its reliance on large amounts of labeled data hinders its application to complex and dynamic environments. Collecting and annotating large amounts of labeled data in real-world environments costs substantial human and material resources. Zero-shot learning addresses the model's dependence on large-scale datasets: it can classify novel classes with no labeled data, which makes it significant for the visual cognitive development of models. As a non-Euclidean data type, a graph structure can represent the features of each node and also explicitly reflect the connections between nodes. A graph structure therefore provides an important basis for reasoning when transferring classification ability from base classes to novel classes in zero-shot learning. This thesis proposes novel methods that use graph structure to deal with the zero-shot learning problem. The main contributions are summarized as follows: 1. A knowledge-experience fusion graph network is proposed to address the domain shift between semantic information and the visual task in zero-shot learning. The model exploits information in both the semantic and the visual space to improve its classification ability on novel classes. The idea of the knowledge-experience fusion graph has also achieved high classification accuracy on few-shot learning tasks. 2. An online graph inference network is proposed to address, from the time dimension, the lack of visual information in zero-shot learning. Inspired by the theory of human cognitive development, the model exploits visual information from the practical application scenario. Using classifiers generated from the knowledge graph, the model transfers this visual information to an online-updating graph, which updates its classification ability continuously. 3. A zero-shot learning model under open-set domain adaptation is proposed to address shifts in both the class space and the domain space. In this method, adversarial learning is used to align the data distributions of the source domain and the target domain, which weakens the influence of scene changes on novel tasks. The method also separates out the unknown classes in the target domain. Finally, with the support of the knowledge graph, the unknown classes are divided into specific classes, realizing visual cognitive development in both the class space and the domain space.
Keywords	zero-shot learning; image classification; graph neural network; knowledge graph
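Language	Chinese
Seven major directions, sub-direction classification	Object detection, tracking, and recognition
State Key Laboratory planned direction classification	Virtual-real fusion and transfer learning
Associated dataset requiring deposit	No
Document type	Thesis
Item identifier	http://ir.ia.ac.cn/handle/173211/51920
Collection	Graduates_Master's Theses
Recommended citation (GB/T 7714)	张昕悦. 基于图结构的零样本图像分类方法研究[D],2023.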

Files in this item
File name/size	Document type	Version type	Access type	License
基于图结构的零样本图像分类方法研究_答辩（6231KB）	Thesis		Restricted access	CC BY-NC-SA