Research on Fixed-Point Quantization Acceleration Algorithms for Vision Transformers
李哲鑫
2022-05-20
Pages: 66
Degree type: Master's
Chinese Abstract (translated)
In recent years, attention-based deep neural networks have achieved great success in computer vision, natural language processing, speech recognition, multi-modal recognition, and other fields, even showing performance far beyond traditional neural networks on some tasks. In vision tasks, vision Transformers are replacing convolutional neural networks as the main research focus of academia and industry. However, compared with traditional convolutional neural networks, vision Transformers often cannot be practically applied to high-resolution or dense-prediction vision tasks, because their computational cost grows quadratically with the input scale; designing computationally efficient vision Transformers has therefore become an important research direction. This thesis explores fixed-point quantization acceleration of vision Transformers along three directions: post-training quantization, quantization-aware training, and mixed-precision quantization. The main contributions are as follows:
To meet the need for fast quantized deployment of vision Transformers, this thesis proposes a post-training quantization strategy based on minimizing the quantization error: the quantization error is modeled with the mean squared error, and an iterative method solves the resulting optimization problem. This quickly and accurately yields the optimal quantization step size, and the resulting 8-bit model outperforms previous post-training quantization work on vision Transformers on ImageNet.
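The iterative step-size search described above might be sketched as follows. This is a minimal NumPy illustration under my own assumptions (a symmetric uniform quantizer and an alternating, Lloyd-style update); the function names are hypothetical and not from the thesis:

```python
import numpy as np

def quantize(x, step, n_bits=8):
    # Symmetric uniform quantizer: round to the nearest level, then clamp.
    qmax = 2 ** (n_bits - 1) - 1
    q = np.clip(np.round(x / step), -qmax - 1, qmax)
    return q * step

def search_step_mse(x, n_bits=8, n_iter=20):
    # Iteratively refine the step size to minimize the mean squared
    # quantization error, alternating between (a) assigning codes with the
    # current step and (b) re-fitting the step to those codes in closed form.
    qmax = 2 ** (n_bits - 1) - 1
    step = np.abs(x).max() / qmax  # max-abs initialization
    for _ in range(n_iter):
        q = np.clip(np.round(x / step), -qmax - 1, qmax)
        denom = np.dot(q.ravel(), q.ravel())
        if denom == 0:
            break
        # Least-squares update: step = <x, q> / <q, q>
        step = np.dot(x.ravel(), q.ravel()) / denom
    return step
```

Because each half-step (code assignment, then least-squares step refit) is optimal given the other, the MSE is non-increasing over iterations, so the search never does worse than the max-abs initialization.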
Although the post-training quantization algorithm above deploys quickly, it introduces a large quantization error at low bit-widths. To address this, this thesis further proposes a quantization-aware training method with differentiable quantization parameters to improve the accuracy of vision Transformers at low bit-widths. On the ImageNet dataset, this method keeps the accuracy loss of 4-bit quantized vision Transformers within 0.5%.
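The differentiable quantizer underlying such training could look like the sketch below. It is my reading of the description, written in plain NumPy with hypothetical names: an LSQ-style straight-through gradient for the learnable scale, plus a learnable offset to shift the grid for asymmetric activations such as GELU outputs:

```python
import numpy as np

def fake_quant(x, scale, offset=0.0, n_bits=4):
    # Quantize-dequantize with a learnable scale and offset; the offset
    # shifts the grid to better cover asymmetric activation distributions.
    qmin, qmax = -(2 ** (n_bits - 1)), 2 ** (n_bits - 1) - 1
    q = np.clip(np.round((x - offset) / scale), qmin, qmax)
    return q * scale + offset

def scale_grad_ste(x, scale, offset=0.0, n_bits=4):
    # LSQ-style straight-through gradient of fake_quant w.r.t. the scale:
    # (round(v) - v) inside the clipping range, the clip level at the borders.
    qmin, qmax = -(2 ** (n_bits - 1)), 2 ** (n_bits - 1) - 1
    v = (x - offset) / scale
    return np.where(v <= qmin, qmin,
                    np.where(v >= qmax, qmax, np.round(v) - v))
```

In an actual training loop the scale and offset would be updated from these gradients; the MSE-based initialization from the post-training stage gives them a good starting point.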
The uniform-bit-width quantization methods above do not use the bit budget efficiently, so vision Transformers still face a performance bottleneck at extremely low bit-widths. To address this, this thesis proposes a mixed-precision quantization method based on differentiable bit-widths: the straight-through estimator (STE) enables joint training of bit-widths and quantization step sizes; switchable quantization step sizes are proposed to resolve the instability of this joint training; and a head-wise bit-width strategy is designed for the multi-head attention mechanism of vision Transformers, further improving accuracy at low bit-widths. On ImageNet, the method achieves 3-bit quantization with an acceptable accuracy loss, surpassing the previous best uniform quantization work.
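A head-wise, STE-rounded bit-width scheme might be sketched as follows. Names and shapes are illustrative only; in real training the bit parameters and the per-bit-width ("switchable") scales would be learned tensors, with gradients flowing through the rounding via the STE:

```python
import numpy as np

def quantize_sym(x, scale, n_bits):
    # Symmetric uniform quantize-dequantize at a given integer bit-width.
    qmax = 2 ** (n_bits - 1) - 1
    return np.clip(np.round(x / scale), -qmax - 1, qmax) * scale

def head_wise_quantize(heads, bit_params, scale_tables):
    # Each attention head carries a continuous, learnable bit-width parameter
    # that is rounded in the forward pass (backward would use the STE, i.e.
    # an identity gradient). "Switchable scales" keep one scale per candidate
    # bit-width, so switching bit-widths does not destabilize a shared scale.
    outs = []
    for h, b, scales in zip(heads, bit_params, scale_tables):
        n_bits = int(np.round(b))  # STE round: only the forward pass shown
        outs.append(quantize_sym(h, scales[n_bits], n_bits))
    return outs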
English Abstract

In recent years, Transformer-based neural networks have achieved great success in miscellaneous tasks, such as computer vision, natural language processing, speech recognition, and multi-modal machine learning, and have even far outperformed traditional neural networks on some tasks. In computer vision, the Vision Transformer (ViT) is replacing convolutional neural networks as the main research focus in academia. However, compared with the traditional convolutional neural network, ViT often cannot be practically used in some high-resolution or dense-prediction vision tasks due to the quadratic growth of its computation with the input scale. Designing computationally efficient ViTs has become an important research topic. This thesis explores acceleration for ViT via fixed-point quantization, including post-training quantization, quantization-aware training, and mixed-precision quantization. The main contributions are as follows:

  • Aiming at the requirement of fast quantized deployment of ViT, we propose a post-training quantization approach based on minimizing the quantization error. The mean squared error is used to model the quantization error, and an iterative method is used to solve the optimization problem, so the optimal quantization step size can be obtained quickly and accurately. The performance of the 8-bit quantized model surpasses previous ViT post-training quantization work on ImageNet.
  • Aiming at the problem of excessive accuracy loss when ViT is quantized to low bit-widths post-training, we propose a differentiable quantization-aware training approach, which uses learnable quantization scales to improve ViT's performance under low-bit quantization. Moreover, we introduce a learnable offset to reduce the quantization error of the GELU activation layer. Last but not least, an MSE-based algorithm is proposed to initialize the quantization scales and offsets. This method achieves 4-bit quantization of ViT with an accuracy drop of less than 0.5% on the ImageNet dataset.
  • Aiming at the problem that quantization-aware training with a uniform bit-width still suffers severe performance loss at extremely low bit-widths, we propose a mixed-precision quantization approach based on differentiable bit-widths, named Q-ViT. First, we utilize the straight-through estimator (STE) to solve the gradient-truncation problem of bit-widths, enabling the joint training of bit-widths and quantization scales. Next, we propose a novel technique named switchable scales to solve the instability of this joint training. Finally, we leverage head-wise bit-widths for the multi-head mechanism in self-attention to further squeeze the size of Q-ViT. Q-ViT achieves 3-bit quantization with a mild accuracy drop on ImageNet, surpassing the previous state-of-the-art uniform quantization work.
Keywords: fixed-point quantization; model compression; model acceleration; vision Transformer; mixed precision
Language: Chinese
Document type: Thesis
Identifier: http://ir.ia.ac.cn/handle/173211/48695
Collection: 复杂系统认知与决策实验室_高效智能计算与学习
Recommended citation (GB/T 7714):
李哲鑫. 面向视觉注意力模型的定点量化加速算法研究[D]. 中国科学院自动化研究所. 中国科学院自动化研究所,2022.
Files in this item:
论文终版-带签名.pdf (2261 KB) · thesis · open access · license: CC BY-NC-SA