Convolutional Neural Network Compression Based on Low-Rank Decomposition and Channel Pruning


	Convolutional Neural Network Compression Based on Low-Rank Decomposition and Channel Pruning
	尉德利
	2021-05-12
Pages	82
Degree type	Master's
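Chinese abstract (translated)	Convolutional neural networks (CNNs) perform very well on many pattern recognition tasks, but their large storage footprint, heavy computation, and high run-time memory use make them hard to deploy directly on devices with limited computing resources. Model compression and acceleration techniques reduce resource consumption and increase running speed by cutting a model's computation and parameter count without sacrificing much recognition performance. This thesis studies CNN compression and acceleration methods based on low-rank decomposition and channel pruning, both of which remove redundancy along the channel dimension. It proposes a rank selection algorithm and a low-rank solution algorithm based on an adaptive rank penalty, and it evaluates and improves channel pruning algorithms stage by stage. Building on these two results, the thesis designs a compression tool that integrates both. The work consists of the following three parts: • For compression by low-rank decomposition, two aspects are studied, rank configuration and solving for the decomposed model: 1) to determine the globally optimal rank configuration under a given parameter (or computation) budget, a search strategy based on dynamic programming is proposed; 2) a learning-based solution method, Adaptive Rank Penalty (ARP), is proposed. ARP induces a low-rank structure in the model, so the resulting model can be decomposed almost losslessly. Experiments on CIFAR-10 or ILSVRC-2012 verify the effectiveness of both methods. • For compression by channel pruning, the thesis summarizes the strategies used in the three stages of the pruning pipeline (pre-training, pruning, and fine-tuning) and, through comprehensive comparative experiments, assesses the strengths and weaknesses of each strategy and summarizes practical tips. In addition, to reduce sensitivity to hyper-parameters, the sparsity-inducing Hoyer regularizer used in pre-training is modified into Hoyer-Clip, a regularizer tied to the compression budget, which effectively reduces the dependence on parameter tuning. • A compression tool integrating low-rank decomposition and channel pruning is developed. Its low-rank decomposition module contains the proposed dynamic programming search strategy and the ARP algorithm; its channel pruning module can flexibly parse models with complex structures and provides a general model export algorithm. Applied to compressing an English handwritten text-string recognition model, the tool achieved a 3.06x actual speedup with an accuracy drop of 0.6%.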
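As a rough illustration of the sparsity regularizer mentioned in the abstract, the sketch below computes one common form of a Hoyer-style measure, the squared L1 norm divided by the squared L2 norm. The function name `hoyer_square` is hypothetical, and whether the thesis uses exactly this form is an assumption; the abstract does not define Hoyer-Clip, so that variant is not reproduced here.

```python
def hoyer_square(values):
    """Return (sum |v|)^2 / sum(v^2) for a list of numbers.

    The ratio is scale-invariant: it is 1 when a single entry is nonzero
    and n when all n entries have the same magnitude, so smaller values
    correspond to sparser vectors. This exact form is an assumption, not
    taken from the thesis.
    """
    l1 = sum(abs(v) for v in values)
    l2_squared = sum(v * v for v in values)
    if l2_squared == 0:
        # An all-zero vector has no defined ratio; report 0.0 by convention.
        return 0.0
    return (l1 * l1) / l2_squared


sparse_score = hoyer_square([2.0, 0.0, 0.0, 0.0])   # one active channel
dense_score = hoyer_square([0.5, 0.5, 0.5, 0.5])    # four equally active channels
```

Applied to per-channel scaling factors during pre-training, minimizing such a ratio would tend to push most factors toward zero while leaving a few large, which is what makes later channel pruning cheap.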
English abstract	Convolutional Neural Networks (CNNs) have yielded superior performance in many vision-based applications. However, it is difficult to deploy CNNs on resource-limited devices because of their large parameter storage, heavy run-time memory use, and heavy computation. CNN compression techniques aim to reduce parameters and computation with little accuracy drop. Among these techniques, this thesis considers methods based on low-rank decomposition and channel pruning. Specifically, a new rank selection algorithm and a novel method for solving the low-rank decomposition are proposed, and several algorithms for the three stages of the channel pruning pipeline are evaluated and improved. The thesis also designs a model compression tool that combines low-rank decomposition and channel pruning. The main contributions are as follows: • Rank selection and the solution of the decomposed model are studied for compression based on low-rank decomposition. As a result, a dynamic programming search algorithm is proposed to determine the globally optimal rank policy under a given resource budget, and a novel learning-based method named Adaptive Rank Penalty (ARP) is proposed for solving the low-rank decomposition. The effectiveness of both methods is verified by compression experiments on CIFAR-10 or ILSVRC-2012. • Under the channel pruning framework, the algorithms used in the pre-training, pruning, and fine-tuning steps are analyzed and comprehensively evaluated to give practical guidance. In addition, a budget-aware sparsity regularizer called Hoyer-Clip is proposed to reduce sensitivity to hyper-parameters, and its effectiveness is verified by compression experiments. • A model compression tool combining low-rank decomposition and channel pruning is developed. The low-rank decomposition module contains the proposed search algorithm and the ARP algorithm, and the channel pruning module contains functions for flexibly analysing models with complex structures and for deploying slimmed models. The tool was applied to the compression of an English handwritten text recognition model and achieved a 3.06x speedup with a 0.6% accuracy drop.
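The budget-constrained rank search described in the abstract can be pictured as a multiple-choice knapsack solved by dynamic programming. The sketch below is only an illustration under stated assumptions: the function names `low_rank_params` and `select_ranks` are hypothetical, the factorization scheme (a k x k convolution followed by a 1 x 1 convolution) is assumed, per-layer errors are assumed to add up independently, and the candidate costs and error scores are made-up toy numbers. It is not the thesis's algorithm.

```python
def low_rank_params(c_in, c_out, k, rank):
    """Parameter count after replacing a k x k convolution (c_in -> c_out)
    with a k x k convolution (c_in -> rank) followed by a 1 x 1 convolution
    (rank -> c_out). This factorization scheme is an assumption."""
    return rank * (c_in * k * k + c_out)


def select_ranks(layers, budget):
    """Pick one (rank, cost, error) candidate per layer so that the summed
    error is minimal and the summed cost stays within the budget.

    Returns (total_error, ranks), or None if no combination fits.
    Assumes per-layer errors are additive and independent.
    """
    # states maps an exact accumulated cost to the best (error, ranks) found.
    states = {0: (0, [])}
    for candidates in layers:
        next_states = {}
        for used, (err, ranks) in states.items():
            for rank, cost, cand_err in candidates:
                total = used + cost
                if total > budget:
                    continue  # this choice would exceed the budget
                new_err = err + cand_err
                if total not in next_states or new_err < next_states[total][0]:
                    next_states[total] = (new_err, ranks + [rank])
        states = next_states
    if not states:
        return None
    return min(states.values(), key=lambda s: s[0])


# Toy candidates with made-up costs and error scores (not thesis data).
toy_layers = [
    [(1, 10, 9), (2, 20, 4), (3, 30, 1)],
    [(1, 15, 8), (2, 30, 3), (3, 45, 1)],
]
best = select_ranks(toy_layers, budget=50)
```

In practice the cost column could come from a helper such as `low_rank_params`, and the error column from whatever per-layer sensitivity measure is available. Keying the table by exact accumulated cost keeps the search exact for the additive model, at the price of a state table that grows with the number of distinct reachable costs.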
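Keywords	convolutional neural network compression; low-rank decomposition; learning-based low-rank decomposition; channel pruning; model compression tool
Subject area	Artificial Intelligence ; Computer Applications
Discipline category	Engineering ; Engineering::Computer Science and Technology (degrees in engineering or science may be awarded)
Language	Chinese
Document type	Thesis
Item identifier	http://ir.ia.ac.cn/handle/173211/45026
Collection	State Key Laboratory of Multimodal Artificial Intelligence Systems_Pattern Analysis and Learning
Recommended citation (GB/T 7714)	尉德利. Convolutional Neural Network Compression Based on Low-Rank Decomposition and Channel Pruning [D]. Room 609, Intelligence Building, Institute of Automation, Chinese Academy of Sciences. School of Artificial Intelligence, University of Chinese Academy of Sciences, 2021.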

Files in this item
File name/size	Document type	Version type	Access type	License
尉德利-学位论文-终稿-v2.pdf (3557KB)	Thesis		Open access	CC BY-NC-SA