EDP: An Efficient Decomposition and Pruning Scheme for Convolutional Neural Network Compression
Ruan, Xiaofeng1,2; Liu, Yufan1,2; Yuan, Chunfeng1; Li, Bing1,4; Hu, Weiming1,2,3; Li, Yangxi5; Maybank, Stephen6
Journal: IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS
ISSN: 2162-237X
Year: 2020
Volume: 32; Issue: 0; Pages: 0
Corresponding authors: Yuan, Chunfeng (cfyuan@nlpr.ia.ac.cn); Li, Bing (bli@nlpr.ia.ac.cn)
Abstract

Model compression methods have become popular in recent years; they aim to alleviate the heavy computational load of deep neural networks (DNNs) in real-world applications. However, most existing compression methods have two limitations: 1) they usually adopt a cumbersome multistage process of pretraining, training with a sparsity constraint, pruning/decomposition, and fine-tuning, in which the last three stages are usually iterated multiple times; and 2) the models are pretrained under explicit sparsity or low-rank assumptions, whose wide applicability is difficult to guarantee. In this article, we propose an efficient decomposition and pruning (EDP) scheme via constructing a compressed-aware block that can automatically minimize the rank of the weight matrix and identify the redundant channels. Specifically, we embed the compressed-aware block by decomposing one network layer into two layers: a new weight matrix layer and a coefficient matrix layer. By imposing regularizers on the coefficient matrix, the new weight matrix learns to become a low-rank basis weight, and its corresponding channels become sparse. In this way, the proposed compressed-aware block simultaneously achieves low-rank decomposition and channel pruning in a single data-driven training stage. Moreover, the network architecture is further compressed and optimized by a novel Pruning & Merging (PM) module, which prunes redundant channels and merges redundant decomposed layers. Experimental results on different data sets and networks, against 17 competitors, demonstrate that the proposed EDP achieves a high compression ratio with acceptable accuracy degradation and outperforms state-of-the-art methods in compression rate, accuracy, inference time, and run-time memory.
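To make the mechanism in the abstract concrete, below is a minimal PyTorch sketch of a compressed-aware block. It rests on several assumptions not stated in the record: the layer is treated as a 1x1 convolution (the paper factorizes general layers via the reshaped weight matrix), an L2,1 group-sparsity penalty stands in for the paper's regularizers, and all names (CompressedAwareBlock, regularizer, prune) are illustrative rather than the authors' code.

```python
# Minimal sketch of a compressed-aware block, assuming a 1x1 convolution
# and an L2,1 group-sparsity regularizer; not the authors' implementation.
import torch
import torch.nn as nn


class CompressedAwareBlock(nn.Module):
    """Factorizes one conv layer W (out_c x in_c) into a low-rank basis
    weight layer B (rank x in_c) followed by a coefficient layer C
    (out_c x rank), as described in the abstract."""

    def __init__(self, in_channels: int, out_channels: int, rank: int):
        super().__init__()
        # Basis weight layer: learns a shared low-rank basis of the weights.
        self.basis = nn.Conv2d(in_channels, rank, kernel_size=1, bias=False)
        # Coefficient layer: recombines basis responses into output channels.
        self.coeff = nn.Conv2d(rank, out_channels, kernel_size=1, bias=True)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.coeff(self.basis(x))

    def regularizer(self) -> torch.Tensor:
        # L2,1 penalty on the coefficient matrix: summing column norms
        # drives whole columns of C toward zero, making the matching basis
        # channels redundant; this couples rank reduction and channel
        # sparsity in one term.
        C = self.coeff.weight.flatten(1)  # shape (out_c, rank)
        return C.norm(dim=0).sum()

    @torch.no_grad()
    def prune(self, tol: float = 1e-3) -> None:
        # Pruning step in the spirit of the paper's PM module (sketch):
        # drop basis channels whose coefficient column norm fell below tol.
        keep = self.coeff.weight.flatten(1).norm(dim=0) > tol
        self.basis.weight.data = self.basis.weight.data[keep]
        self.coeff.weight.data = self.coeff.weight.data[:, keep]
```

In this reading, training would minimize the task loss plus a weighted sum of regularizer() over all blocks; after convergence, prune() realizes the channel pruning, and whenever the surviving rank is no smaller than min(in_channels, out_channels) the two 1x1 layers can be multiplied back into a single layer, which corresponds to the merging half of the PM module.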

Keywords: Data-driven; low-rank decomposition; model compression and acceleration; structured pruning
DOI: 10.1109/TNNLS.2020.3018177
Indexed by: SCI
Language: English
Funding projects: National Key Research and Development Program of China[2018AAA0102802] ; National Key Research and Development Program of China[2018AAA0102803] ; National Key Research and Development Program of China[2018AAA0102800] ; National Key Research and Development Program of China[2018YFC0823003] ; National Key Research and Development Program of China[2017YFB1002801] ; Natural Science Foundation of China[61902401] ; Natural Science Foundation of China[61972071] ; Natural Science Foundation of China[61751212] ; Natural Science Foundation of China[61721004] ; Natural Science Foundation of China[61972397] ; Natural Science Foundation of China[61772225] ; Natural Science Foundation of China[61906052] ; Natural Science Foundation of China[U1803119] ; NSFC-General Technology Collaborative Fund for basic research[U1636218] ; NSFC-General Technology Collaborative Fund for basic research[U1936204] ; NSFC-General Technology Collaborative Fund for basic research[U1736106] ; Beijing Natural Science Foundation[L172051] ; Beijing Natural Science Foundation[JQ18018] ; Beijing Natural Science Foundation[L182058] ; CAS Key Research Program of Frontier Sciences[QYZDJSSW-JSC040] ; CAS External Cooperation Key Project ; NSF of Guangdong[2018B030311046] ; Youth Innovation Promotion Association, CAS
Funders: National Key Research and Development Program of China ; Natural Science Foundation of China ; NSFC-General Technology Collaborative Fund for basic research ; Beijing Natural Science Foundation ; CAS Key Research Program of Frontier Sciences ; CAS External Cooperation Key Project ; NSF of Guangdong ; Youth Innovation Promotion Association, CAS
WOS research areas: Computer Science ; Engineering
WOS categories: Computer Science, Artificial Intelligence ; Computer Science, Hardware & Architecture ; Computer Science, Theory & Methods ; Engineering, Electrical & Electronic
WOS accession number: WOS:000704111000021
Publisher: IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC
Subdirection classification (seven major directions): Machine Learning
Citation statistics
Times cited (WOS): 23
Document type: Journal article
Identifier: http://ir.ia.ac.cn/handle/173211/44804
Collection: State Key Laboratory of Multimodal Artificial Intelligence Systems / Video Content Security
Affiliations:
1. National Laboratory of Pattern Recognition, Institute of Automation, Chinese Academy of Sciences
2.School of Artificial Intelligence, University of Chinese Academy of Sciences
3.CAS Center for Excellence in Brain Science and Intelligence Technology
4.PeopleAI Inc.
5.National Computer Network Emergency Response Technical Team/Coordination Center of China
6.Department of Computer Science and Information Systems, Birkbeck College, University of London
First author affiliation: National Laboratory of Pattern Recognition
Corresponding author affiliation: National Laboratory of Pattern Recognition
Recommended citation:
GB/T 7714
Ruan, Xiaofeng, Liu, Yufan, Yuan, Chunfeng, et al. EDP: An Efficient Decomposition and Pruning Scheme for Convolutional Neural Network Compression[J]. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2020, 32(0): 0.
APA: Ruan, Xiaofeng., Liu, Yufan., Yuan, Chunfeng., Li, Bing., Hu, Weiming., ... & Maybank, Stephen. (2020). EDP: An Efficient Decomposition and Pruning Scheme for Convolutional Neural Network Compression. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 32(0), 0.
MLA: Ruan, Xiaofeng, et al. "EDP: An Efficient Decomposition and Pruning Scheme for Convolutional Neural Network Compression". IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS 32.0 (2020): 0.
Files in this item:
File name/size: EDP_An Efficient Dec (3625 KB); Document type: Journal article; Version: Author's accepted manuscript; Access: Open access; License: CC BY-NC-SA
File name: EDP_An Efficient Decomposition and Pruning Scheme for Convolutional Neural Network Compression.pdf
Format: Adobe PDF