Towards efficient full 8-bit integer DNN online training on resource-limited devices without batch normalization
Yang, Yukuan (1); Chi, Xiaowei (2); Deng, Lei (1); Yan, Tianyi (3); Gao, Feng (4); Li, Guoqi (1,5)
Journal: NEUROCOMPUTING
ISSN: 0925-2312
2022-10-28
Volume: 511, Pages: 175-186
Corresponding author: Li, Guoqi (liguoqi@mail.tsinghua.edu.cn)
Abstract: The high computational cost of convolution and batch normalization (BN) poses a major challenge for the online training of deep neural networks (DNNs) and its applications, especially on resource-limited devices. Existing works address either convolution or BN acceleration, and none alleviates both problems with satisfactory performance. Online training is becoming a trend on resource-limited devices such as mobile phones, yet there is still no complete technical scheme with acceptable model performance, processing speed, and computational cost. In this research, an efficient online-training quantization framework, termed EOQ, is proposed by combining Fixup initialization with a novel quantization scheme for online training on resource-limited devices. Based on the proposed framework, we realize full 8-bit integer network training and remove BN in large-scale DNNs. In particular, weight updates are quantized to 8-bit integers for the first time. A theoretical analysis of how EOQ uses Fixup initialization to remove BN is given using a novel Block Dynamical Isometry theory with weaker assumptions. Benefiting from rational quantization strategies and the absence of BN, full 8-bit networks based on EOQ achieve state-of-the-art accuracy and large advantages in computational cost and processing speed. Experiments show that the 8-bit EOQ networks achieve accuracy improvements of 2.78%, 3.85%, and 4.31% over existing full 8-bit integer networks on ResNet-18/34/50, respectively. At the same time, the 8-bit EOQ networks greatly improve computing speed and reduce power consumption and circuit area by about an order of magnitude compared with 32-bit floating-point vanilla networks.
Beyond the large advantages that quantization brings to convolution operations, 8-bit networks based on EOQ without BN achieve more than 66x lower power and more than 13x faster processing in inference compared with traditional 32-bit floating-point BN. Moreover, the design of deep learning chips can be greatly simplified because the hardware-unfriendly square root operations in BN are no longer needed. In addition, EOQ is shown to be more advantageous in small-batch online training. In summary, the EOQ framework is specially designed to reduce the high cost of convolution and BN in network training, showing broad application prospects for online training on resource-limited devices. (C) 2022 Published by Elsevier B.V.
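As a rough illustration of what "8-bit integer" arithmetic means in the abstract, the sketch below shows generic symmetric per-tensor quantization in plain Python. It is not the EOQ quantization scheme, whose details are not given in this record; the function names `quantize_int8`, `dequantize_int8`, and `int8_dot` are hypothetical and chosen only for this example.

```python
def quantize_int8(values):
    """Symmetric per-tensor quantization of floats to integers in [-127, 127].

    Generic textbook scheme, assumed here for illustration only (not EOQ).
    Returns the integer codes and the float scale needed to map them back.
    """
    max_abs = max((abs(v) for v in values), default=0.0)
    # The largest magnitude maps to 127; use scale 1.0 for all-zero input.
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    codes = [max(-127, min(127, round(v / scale))) for v in values]
    return codes, scale


def dequantize_int8(codes, scale):
    """Map integer codes back to approximate float values."""
    return [c * scale for c in codes]


def int8_dot(codes_a, scale_a, codes_b, scale_b):
    """Dot product with integer multiply-accumulate and one float rescale."""
    acc = sum(a * b for a, b in zip(codes_a, codes_b))  # integer arithmetic only
    return acc * scale_a * scale_b
```

Under this scheme the rounding error of each value is at most half the scale, and the multiply-accumulate loop, which dominates the cost of convolution, runs entirely on integers.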
Keywords: Full 8-bit quantization; Network without batch normalization; Small batch; Online training; Resource-limited devices
DOI: 10.1016/j.neucom.2022.08.045
Keywords [WOS]: DEEP NEURAL-NETWORKS; MEMORY
Indexed by: SCI
Language: English
Funding projects: National Key R&D Program [2018YFE0200200]; National Key R&D Program [2018AAA0102604]; Beijing Natural Science Foundation [JQ21015]; Beijing Academy of Artificial Intelligence (BAAI); Science and Technology Major Project of Guangzhou [202007030006]; Pengcheng Lab
Funders: National Key R&D Program; Beijing Natural Science Foundation; Beijing Academy of Artificial Intelligence (BAAI); Science and Technology Major Project of Guangzhou; Pengcheng Lab
WOS research area: Computer Science
WOS category: Computer Science, Artificial Intelligence
WOS accession number: WOS:000871948700015
Publisher: ELSEVIER
Document type: Journal article
Item identifier: http://ir.ia.ac.cn/handle/173211/50506
Collection: Laboratory of Cognition and Decision Intelligence for Complex Systems_Auditory Models and Cognitive Computing
Corresponding author: Li, Guoqi
Author affiliations:
1. Tsinghua Univ, Ctr Brain Inspired Comp Res, Dept Precis Instrument, Beijing 100084, Peoples R China
2. Beijing Univ Posts & Telecommun, Int Sch, Beijing 100876, Peoples R China
3. Beijing Inst Technol, Sch Life Sci, Beijing 100081, Peoples R China
4. Capital Med Univ, Beijing Tiantan Hosp, Dept Intervent Neuroradiol, Beijing 100070, Peoples R China
5. Chinese Acad Sci, Inst Automat, Beijing 100190, Peoples R China
Corresponding author affiliation: Institute of Automation, Chinese Academy of Sciences
Recommended citation
GB/T 7714: Yang, Yukuan, Chi, Xiaowei, Deng, Lei, et al. Towards efficient full 8-bit integer DNN online training on resource-limited devices without batch normalization[J]. NEUROCOMPUTING, 2022, 511: 175-186.
APA: Yang, Yukuan, Chi, Xiaowei, Deng, Lei, Yan, Tianyi, Gao, Feng, & Li, Guoqi. (2022). Towards efficient full 8-bit integer DNN online training on resource-limited devices without batch normalization. NEUROCOMPUTING, 511, 175-186.
MLA: Yang, Yukuan, et al. "Towards efficient full 8-bit integer DNN online training on resource-limited devices without batch normalization". NEUROCOMPUTING 511 (2022): 175-186.