Knowledge Commons of Institute of Automation, CAS
Block Convolution: Towards Memory-Efficient Inference of Large-Scale CNNs on FPGA
Authors | Li, Gang1,2; Liu, Zejian; Li, Fanrong; Cheng, Jian
Journal | IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems
Date | 2021-05-21
Issue | 2021.5
Pages | 1-1
Abstract | Deep convolutional neural networks have achieved remarkable progress in recent years. However, the large volume of intermediate results generated during inference poses a significant challenge to accelerator design for resource-constrained FPGAs. Due to the limited on-chip storage, partial results of intermediate layers are frequently transferred back and forth between on-chip memory and off-chip DRAM, leading to a non-negligible increase in latency and energy consumption. In this paper, we propose block convolution, a hardware-friendly, simple, yet efficient convolution operation that can completely avoid the off-chip transfer of intermediate feature maps at run time. The fundamental idea of block convolution is to eliminate the dependency between feature-map tiles in the spatial dimension when spatial tiling is used, which is realized by splitting a feature map into independent blocks so that convolution can be performed separately on individual blocks. We conduct extensive experiments to demonstrate the efficacy of the proposed block convolution on both the algorithm side and the hardware side. Specifically, we evaluate block convolution on 1) VGG-16, ResNet-18, ResNet-50, and MobileNet-V1 for the ImageNet classification task; 2) SSD and FPN for the COCO object detection task; and 3) VDSR for the Set5 single-image super-resolution task. Experimental results demonstrate that comparable or higher accuracy can be achieved with block convolution. We also showcase two CNN accelerators via algorithm/hardware co-design based on block convolution on memory-limited FPGAs, and evaluation shows that both accelerators substantially outperform the baseline without off-chip transfer of intermediate feature maps.
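The core idea described in the abstract, splitting a feature map into independent blocks so that each block is convolved separately with its own zero padding and no data crosses block boundaries, can be sketched in a few lines of NumPy. This is an illustrative sketch only, not the paper's implementation: the block size, the naive 3x3 "same" convolution, and the zero-padding scheme are assumptions made for the example.

```python
import numpy as np

def conv2d_same(x, k):
    """Naive 2-D 'same' convolution (cross-correlation) with zero padding."""
    kh, kw = k.shape
    ph, pw = kh // 2, kw // 2
    xp = np.pad(x, ((ph, ph), (pw, pw)))  # default mode pads with zeros
    out = np.zeros_like(x, dtype=float)
    H, W = x.shape
    for i in range(H):
        for j in range(W):
            out[i, j] = np.sum(xp[i:i + kh, j:j + kw] * k)
    return out

def block_conv2d(x, k, block=4):
    """Block convolution sketch: split x into independent block x block
    tiles, zero-pad and convolve each tile separately, then stitch the
    results. Because no data is shared across tile boundaries, each tile
    can remain on-chip for the whole layer."""
    H, W = x.shape
    assert H % block == 0 and W % block == 0
    out = np.zeros_like(x, dtype=float)
    for bi in range(0, H, block):
        for bj in range(0, W, block):
            tile = x[bi:bi + block, bj:bj + block]
            out[bi:bi + block, bj:bj + block] = conv2d_same(tile, k)
    return out

x = np.arange(64, dtype=float).reshape(8, 8)
k = np.ones((3, 3)) / 9.0  # 3x3 mean filter
y_std = conv2d_same(x, k)
y_blk = block_conv2d(x, k, block=4)
# Interior pixels of each tile match standard convolution; only pixels
# adjacent to a tile border differ (they see zeros instead of neighbors).
print(np.allclose(y_std[1:3, 1:3], y_blk[1:3, 1:3]))  # True
```

The difference at tile borders is exactly the approximation the paper studies: the network is trained with block convolution so that accuracy is preserved despite the per-block zero padding.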
Keywords | block convolution; memory-efficient; off-chip transfer; FPGA; CNN accelerator
Indexed By | SCI
Language | English
Sub-direction Classification (Seven Major Directions) | Other
State Key Laboratory Planning Direction | Other
Associated Dataset to Be Deposited | No
Document Type | Journal Article
Identifier | http://ir.ia.ac.cn/handle/173211/47034
Collection | Laboratory of Cognition and Decision Making in Complex Systems_Efficient Intelligent Computing and Learning
Corresponding Author | Cheng, Jian
Affiliations | 1. Institute of Automation, Chinese Academy of Sciences; 2. School of Artificial Intelligence, University of Chinese Academy of Sciences; 3. School of Future Technology, University of Chinese Academy of Sciences; 4. Center for Excellence in Brain Science and Intelligence Technology, Chinese Academy of Sciences
First Author Affiliation | Institute of Automation, Chinese Academy of Sciences
Corresponding Author Affiliation | Institute of Automation, Chinese Academy of Sciences
Recommended Citation (GB/T 7714) | Li, Gang, Liu, Zejian, Li, Fanrong, et al. Block Convolution: Towards Memory-Efficient Inference of Large-Scale CNNs on FPGA[J]. IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, 2021(2021.5): 1-1.
APA | Li, Gang, Liu, Zejian, Li, Fanrong, & Cheng, Jian. (2021). Block Convolution: Towards Memory-Efficient Inference of Large-Scale CNNs on FPGA. IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, (2021.5), 1-1.
MLA | Li, Gang, et al. "Block Convolution: Towards Memory-Efficient Inference of Large-Scale CNNs on FPGA". IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems 2021.5 (2021): 1-1.
Files in This Item |
File Name/Size | Document Type | Version | Access | License
Block Convolution.pd (6174KB) | Journal Article | Author's Accepted Manuscript | Open Access | CC BY-NC-SA
Unless otherwise specified, all content in this system is protected by copyright, with all rights reserved.