Towards Mixed-Precision Quantization of Neural Networks via Constrained Optimization
Weihan Chen; Peisong Wang; Jian Cheng
2021-10
Conference name: International Conference on Computer Vision (ICCV)
Conference date: 2021-10-11
Conference location: Held online
Abstract

Quantization is a widely used technique to compress and accelerate deep neural networks. However, conventional quantization methods use the same bit-width for all (or most) of the layers; they often suffer significant accuracy degradation in the ultra-low precision regime and ignore the fact that emerging hardware accelerators are beginning to support mixed-precision computation. Consequently, in this paper we present a novel and principled framework to solve the mixed-precision quantization problem. Briefly, we first formulate mixed-precision quantization as a discrete constrained optimization problem. Then, to make the optimization tractable, we approximate the objective function with a second-order Taylor expansion and propose an efficient approach to compute its Hessian matrix. Finally, based on this simplification, we show that the original problem can be reformulated as a Multiple-Choice Knapsack Problem (MCKP) and propose a greedy search algorithm to solve it efficiently. Compared with existing mixed-precision quantization works, our method is derived in a principled way and is much more computationally efficient. Moreover, extensive experiments conducted on the ImageNet dataset and various kinds of network architectures also demonstrate its superiority over existing uniform and mixed-precision quantization approaches.
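To make the MCKP view in the abstract concrete, here is a minimal sketch of a generic greedy heuristic for choosing one bit-width per layer under a model-size budget. It is not the authors' algorithm: the function name `greedy_bit_allocation`, the inputs `params` and `loss`, and the "loss added per bit saved" selection rule are all assumptions made for illustration. In the paper's setting the per-layer loss estimates would come from the second-order Taylor approximation; here they are simply given as numbers.

```python
from typing import Dict, List, Optional


def greedy_bit_allocation(
    params: List[int],
    loss: List[Dict[int, float]],
    budget_bits: int,
) -> Optional[List[int]]:
    """Choose one bit-width per layer so the total weight size fits the budget.

    params[i]  : number of weights in layer i (hypothetical input)
    loss[i][b] : estimated loss increase if layer i uses b bits (hypothetical input)
    Returns the chosen bit-width per layer, or None if no assignment fits.
    """
    # Candidate bit-widths per layer, ordered from highest to lowest precision.
    options = [sorted(layer_loss.keys(), reverse=True) for layer_loss in loss]
    idx = [0] * len(params)  # every layer starts at its highest precision

    def size() -> int:
        # Total model size in bits under the current assignment.
        return sum(n * opts[k] for n, opts, k in zip(params, options, idx))

    while size() > budget_bits:
        best_layer, best_ratio = None, float("inf")
        for i, opts in enumerate(options):
            if idx[i] + 1 >= len(opts):
                continue  # layer i is already at its lowest bit-width
            cur, nxt = opts[idx[i]], opts[idx[i] + 1]
            saved = params[i] * (cur - nxt)      # bits saved by stepping down
            added = loss[i][nxt] - loss[i][cur]  # extra estimated loss
            ratio = added / saved
            if ratio < best_ratio:
                best_layer, best_ratio = i, ratio
        if best_layer is None:
            return None  # even the lowest bit-widths exceed the budget
        idx[best_layer] += 1  # take the cheapest step in loss per bit saved

    return [opts[k] for opts, k in zip(options, idx)]
```

For example, with `params = [100, 1000]`, `loss = [{8: 0.0, 4: 0.5, 2: 2.0}, {8: 0.0, 4: 0.1, 2: 0.4}]` and `budget_bits = 5000`, the sketch lowers only the large second layer and returns `[8, 4]`. A greedy rule like this is fast but is not guaranteed to find the optimal MCKP solution.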

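Indexed by: EI
Seven major directions, sub-direction classification: AI chips and intelligent computing
State Key Laboratory planned direction classification: Intelligent computing and learning
Paper-associated dataset requiring deposit:
Document type: Conference paper
Item identifier: http://ir.ia.ac.cn/handle/173211/52065
Collection: Laboratory of Cognition and Decision Intelligence for Complex Systems_Efficient Intelligent Computing and Learning
Corresponding author: Jian Cheng
Author affiliations: 1. NLPR & AIRIA, Institute of Automation, Chinese Academy of Sciences
2. School of Artificial Intelligence, University of Chinese Academy of Sciences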
Recommended citation
GB/T 7714
Weihan Chen, Peisong Wang, Jian Cheng. Towards Mixed-Precision Quantization of Neural Networks via Constrained Optimization[C], 2021.
Files in this item
File name/size: Chen_Towards_Mixed-P (696KB)
Document type: Conference paper
Access type: Open access
License: CC BY-NC-SA
File name: Chen_Towards_Mixed-Precision_Quantization_of_Neural_Networks_via_Constrained_Optimization_ICCV_2021_paper.pdf
Format: Adobe PDF