Image Classification Algorithm Enhanced by Classification Activation Maps
杨萌林1,2; 张文生1,2
Source Publication: 计算机科学与探索 (Journal of Frontiers of Computer Science and Technology)
2019-04
Issue: 00 Pages: 00
Abstract

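A classification activation map (CAM) is a feature map carrying high-level semantic information. It reflects how strongly each region of an image responds to the classification, and with simple post-processing it can visualize the evidence behind a model's decision. Under the supervision of classification labels alone, however, a model easily locks onto local discriminative regions, which leaves the CAM sparse, incomplete, and discontinuous, even though the suppressed regions can also provide discriminative and semantic information. Moreover, previous classification research has generally treated the CAM only as a tool for visual analysis, without exploring or exploiting it further. Building on the CAM's visualization role, this paper uses the CAM to enrich the model's semantic information and improve its classification performance. To this end, it designs an automatically weighted multi-scale feature learning method to mine more discriminative regions, combines these multi-scale features with the CAM to give a multi-scale CAM generation method, and embeds the generated CAM directly into the deep neural network to form an end-to-end structure that enhances classification performance. With the residual network ResNet as the backbone, the paper proposes the CAM-enhanced model ResNet-CE and evaluates it extensively on three public datasets: CIFAR10, CIFAR100, and STL10. ResNet-CE reaches error rates of 5.73%, 23.85%, and 15.91% on the three datasets, respectively. Compared with a ResNet of comparable parameter count, the error rate is lower by 0.23%, 3.56%, and 7.96%, respectively, and the model outperforms most current classification networks. The proposed algorithm can be transferred easily to existing classification models to improve their performance, and it can also visualize and explain the basis of a model's decisions, which is of value in many scenarios such as disease recognition in medical imaging and scene recognition in autonomous driving.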
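
For readers unfamiliar with the term, the following is a minimal sketch of how a CAM is commonly computed in the widely used basic formulation: a weighted sum of the last convolutional feature maps, using the classifier weights of one class, followed by min-max scaling for display. It is not the multi-scale generation method of ResNet-CE, whose details are not given in this record. The function names and the toy values are assumptions made for illustration.

```python
from typing import List

Matrix = List[List[float]]


def class_activation_map(feature_maps: List[Matrix], class_weights: List[float]) -> Matrix:
    """Weighted sum of K feature maps (each H x W) using one class's K weights."""
    if len(feature_maps) != len(class_weights):
        raise ValueError("need exactly one weight per feature map")
    height, width = len(feature_maps[0]), len(feature_maps[0][0])
    cam = [[0.0] * width for _ in range(height)]
    for fmap, weight in zip(feature_maps, class_weights):
        for i in range(height):
            for j in range(width):
                cam[i][j] += weight * fmap[i][j]
    return cam


def normalize_for_display(cam: Matrix) -> Matrix:
    """Min-max scale to [0, 1]; a simple post-processing step before overlaying."""
    flat = [value for row in cam for value in row]
    lowest, highest = min(flat), max(flat)
    span = highest - lowest
    if span == 0:
        # A constant map carries no spatial evidence; show it as all zeros.
        return [[0.0 for _ in row] for row in cam]
    return [[(value - lowest) / span for value in row] for row in cam]


# Toy input (hypothetical): two 2x2 feature maps and the weights of one class.
features = [
    [[1.0, 0.0], [0.0, 0.0]],
    [[0.0, 0.0], [0.0, 2.0]],
]
weights = [0.5, 1.0]

cam = class_activation_map(features, weights)
heatmap = normalize_for_display(cam)
print(heatmap)
```

In this toy case the bottom-right cell dominates the heatmap, which mirrors the sparsity problem described above: a few strongly responding regions can suppress the rest of the object.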

Other Abstract

This paper proposes a new image classification algorithm based on the classification activation map (CAM). A CAM is a feature map with high-level semantics that reflects the response of each area of the image to the classification, and it can serve as a visualization tool after simple post-processing. However, under the supervision of classification labels, the model tends to focus on the most discriminative local areas while ignoring the integrity of the target, which makes the CAM sparse and discontinuous, even though the suppressed areas can also provide semantic information. In addition, previous research within the classification framework has used the CAM only for visualization, without further exploration or utilization. In this paper, the CAM is explored further to enhance classification. To achieve this goal, the paper first designs an automatically weighted multi-scale feature learning method to mine more discriminative information. The multi-scale features are then combined with the CAM, and a method for directly generating the CAM is proposed. The method embeds the CAM into the network to form an end-to-end structure, thereby enhancing classification performance. With ResNet as the backbone, the paper proposes an image classification model, ResNet-CE. A large number of experiments were performed on three public datasets: CIFAR10, CIFAR100, and STL10. The experiments show that ResNet-CE reaches error rates of 5.73%, 23.85%, and 15.91% on these three datasets, and that, compared with the benchmark ResNet, the error rate decreases by 0.23%, 3.56%, and 7.96%, respectively. In addition, the proposed model performs better than most current classification networks. The CAM-based enhancement can be transferred easily to off-the-shelf networks, and the algorithm can also visualize and interpret the model's decisions. The proposal has application value in many areas, such as disease recognition in medical images and target recognition in remote sensing images.
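
The reported figures can be cross-checked with a little arithmetic. The sketch below assumes that the stated decreases are absolute differences in percentage points, which the abstract does not say explicitly, and derives the baseline ResNet error rates that this reading would imply. The baseline values are therefore inferred, not reported in this record, and the variable names are illustrative.

```python
# Error rates (%) of ResNet-CE and the stated decreases, as given in the abstract.
reported_error = {"CIFAR10": 5.73, "CIFAR100": 23.85, "STL10": 15.91}
reported_decrease = {"CIFAR10": 0.23, "CIFAR100": 3.56, "STL10": 7.96}

# Assumption: each decrease is an absolute difference in percentage points,
# so the implied baseline is the reported error plus the decrease.
implied_baseline = {
    name: round(reported_error[name] + reported_decrease[name], 2)
    for name in reported_error
}

# Relative reduction of the error rate under the same assumption.
relative_reduction = {
    name: reported_decrease[name] / implied_baseline[name]
    for name in reported_error
}

for name in reported_error:
    print(f"{name}: implied baseline {implied_baseline[name]:.2f}%, "
          f"relative reduction {relative_reduction[name]:.1%}")
```

Under this reading, the gain is smallest on CIFAR10 and largest on STL10, both in absolute and in relative terms.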

Keyword: image classification; classification activation map; multi-scale; visualization; interpretability
Indexed By: CSCD
Language: Chinese
Funding Project: National Natural Science Foundation of China [U1636220]
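Document Type: Journal article
Identifier: http://ir.ia.ac.cn/handle/173211/23939
Collection: Research Center of Precision Sensing and Control_Artificial Intelligence and Machine Learning
Institute of Automation, Chinese Academy of Sciences
Affiliation: 1. Institute of Automation, Chinese Academy of Sciences
2. University of Chinese Academy of Sciences
First Author Affiliation: Institute of Automation, Chinese Academy of Sciences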
Recommended Citation
GB/T 7714: 杨萌林, 张文生. 分类激活图增强的图像分类算法[J]. 计算机科学与探索, 2019(00): 00.
APA: 杨萌林, & 张文生. (2019). 分类激活图增强的图像分类算法. 计算机科学与探索(00), 00.
MLA: 杨萌林, et al. "分类激活图增强的图像分类算法". 计算机科学与探索 00 (2019): 00.
Files in This Item:
File Name/Size | DocType | Version | Access | License
paper.pdf (1105KB) | Journal article | Author accepted manuscript | Open access | CC BY-NC-SA