Deep learning has achieved great success on image classification tasks. However, its reliance on large amounts of labeled data hinders its application in complex and dynamic environments, where collecting and annotating such data consumes substantial human and material resources. Zero-shot learning addresses this dependence on large-scale labeled datasets: it classifies novel classes for which no labeled data is available. It is therefore of great significance to the visual cognitive development of models.
As non-Euclidean data, a graph structure not only represents the characteristics of each node but also explicitly encodes the connections between nodes. It therefore provides an important basis for reasoning when transferring classification ability from base classes to novel classes in zero-shot learning. This paper proposes novel graph-based methods for the zero-shot learning problem. The main contributions are summarized as follows:
1. A knowledge-experience fusion graph network is proposed to address the domain shift between semantic information and visual information in zero-shot learning. The model exploits information in both the semantic space and the visual space to improve classification of the novel classes. The knowledge-experience fusion graph also achieves high classification accuracy on few-shot learning tasks.
2. An online graph inference network is proposed to address the lack of visual information along the time dimension in zero-shot learning. Inspired by the theory of human cognitive development, the model exploits visual information from the practical application scenario. In addition, using classifiers generated from the knowledge graph, the model transfers this visual information to an online-updating graph, which continuously updates its classification ability.
3. A zero-shot learning model under open set domain adaptation is proposed to address shifts in both the class space and the domain space. In this method, adversarial learning aligns the data distributions of the source domain and the target domain, which weakens the influence of scene changes on novel tasks. The method also separates out the unknown classes in the target domain. Finally, with the support of the knowledge graph, the unknown class is further divided into specific classes, realizing visual cognitive development in both the class space and the domain space.
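
The idea shared by these contributions, transferring classification ability from base classes to novel classes over a graph, can be illustrated with a minimal sketch. It is not the proposed networks, whose architectures are not specified here: plain neighbour averaging stands in for learned graph inference, and every class name, edge, and number below is a hypothetical toy value chosen for illustration.

```python
# Toy sketch of graph-based zero-shot classification.
# Assumption: each novel class is linked in a knowledge graph to base classes
# that already have trained classifier vectors. All values are hypothetical.

def infer_novel_classifiers(neighbours, base_classifiers):
    """Give each novel class the mean classifier of its base-class neighbours."""
    novel = {}
    for cls, linked in neighbours.items():
        vectors = [base_classifiers[b] for b in linked]
        # zip(*vectors) iterates over the vectors dimension by dimension.
        novel[cls] = [sum(col) / len(vectors) for col in zip(*vectors)]
    return novel

def classify(feature, classifiers):
    """Return the class whose classifier has the largest dot product with the feature."""
    scores = {
        cls: sum(w * x for w, x in zip(vec, feature))
        for cls, vec in classifiers.items()
    }
    return max(scores, key=scores.get)

# Classifier vectors for base classes (learned from labeled data in practice).
base_classifiers = {
    "horse": [1.0, 0.0, 0.0],
    "tiger": [0.0, 1.0, 0.0],
    "dolphin": [0.0, 0.0, 1.0],
}

# Knowledge-graph edges from each novel class to related base classes.
neighbours = {"zebra": ["horse", "tiger"], "orca": ["dolphin"]}

novel_classifiers = infer_novel_classifiers(neighbours, base_classifiers)

# A visual feature that resembles both a horse and a tiger.
prediction = classify([0.9, 0.8, 0.0], novel_classifiers)
```

In this toy setting the novel class "zebra" receives a classifier without any labeled zebra images, because the graph links it to the base classes "horse" and "tiger". Prediction is restricted to the novel classes, as in the conventional zero-shot setting.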