Knowledge Commons of Institute of Automation, CAS
Block Convolution: Towards Memory-Efficient Inference of Large-Scale CNNs on FPGA
Authors | Li, Gang1,2; Liu, Zejian; Li, Fanrong; Cheng, Jian
Journal | IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems
Date | 2021-05-21
Issue | 2021.5
Pages | 1-1
Abstract | Deep convolutional neural networks have achieved remarkable progress in recent years. However, the large volume of intermediate results generated during inference poses a significant challenge to accelerator design for resource-constrained FPGAs. Due to the limited on-chip storage, partial results of intermediate layers are frequently transferred back and forth between on-chip memory and off-chip DRAM, leading to a non-negligible increase in latency and energy consumption. In this paper, we propose block convolution, a hardware-friendly, simple, yet efficient convolution operation that can completely avoid the off-chip transfer of intermediate feature maps at run time. The fundamental idea of block convolution is to eliminate the dependency between feature-map tiles in the spatial dimension when spatial tiling is used, which is realized by splitting a feature map into independent blocks so that convolution can be performed separately on individual blocks. We conduct extensive experiments to demonstrate the efficacy of the proposed block convolution on both the algorithm side and the hardware side. Specifically, we evaluate block convolution on 1) VGG-16, ResNet-18, ResNet-50, and MobileNet-V1 for the ImageNet classification task; 2) SSD and FPN for the COCO object detection task; and 3) VDSR for the Set5 single-image super-resolution task. Experimental results demonstrate that comparable or higher accuracy can be achieved with block convolution. We also showcase two CNN accelerators via algorithm/hardware co-design based on block convolution on memory-limited FPGAs, and evaluation shows that both accelerators substantially outperform the baseline without off-chip transfer of intermediate feature maps.
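The core idea described in the abstract, splitting a feature map into independent blocks so that each block is convolved separately with its own zero padding and no data crosses block boundaries, can be sketched in a few lines of NumPy. This is an illustrative sketch only, not the paper's implementation: the block size, the naive 3x3 "same" convolution, and the zero-padding scheme are assumptions made for the example.

```python
import numpy as np

def conv2d_same(x, k):
    """Naive 2-D 'same' convolution (cross-correlation) with zero padding."""
    kh, kw = k.shape
    ph, pw = kh // 2, kw // 2
    xp = np.pad(x, ((ph, ph), (pw, pw)))  # default mode pads with zeros
    out = np.zeros_like(x, dtype=float)
    H, W = x.shape
    for i in range(H):
        for j in range(W):
            out[i, j] = np.sum(xp[i:i + kh, j:j + kw] * k)
    return out

def block_conv2d(x, k, block=4):
    """Block convolution sketch: split x into independent block x block
    tiles, zero-pad and convolve each tile separately, then stitch the
    results. Because no data is shared across tile boundaries, each tile
    can remain on-chip for the whole layer."""
    H, W = x.shape
    assert H % block == 0 and W % block == 0
    out = np.zeros_like(x, dtype=float)
    for bi in range(0, H, block):
        for bj in range(0, W, block):
            tile = x[bi:bi + block, bj:bj + block]
            out[bi:bi + block, bj:bj + block] = conv2d_same(tile, k)
    return out

x = np.arange(64, dtype=float).reshape(8, 8)
k = np.ones((3, 3)) / 9.0  # 3x3 mean filter
y_std = conv2d_same(x, k)
y_blk = block_conv2d(x, k, block=4)
# Interior pixels of each tile match standard convolution; only pixels
# adjacent to a tile border differ (they see zeros instead of neighbors).
print(np.allclose(y_std[1:3, 1:3], y_blk[1:3, 1:3]))  # True
```

The difference at tile borders is exactly the approximation the paper studies: the network is trained with block convolution so that accuracy is preserved despite the per-block zero padding.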
Keywords | block convolution; memory-efficient; off-chip transfer; FPGA; CNN accelerator
Indexed By | SCI
Language | English
Sub-direction Classification (Seven Major Directions) | Other
State Key Laboratory Planning Direction | Other
Associated Dataset to Be Deposited | No
Document Type | Journal Article
Identifier | http://ir.ia.ac.cn/handle/173211/47034
Collection | Laboratory of Cognition and Decision Making in Complex Systems_Efficient Intelligent Computing and Learning
Corresponding Author | Cheng, Jian
Affiliations | 1. Institute of Automation, Chinese Academy of Sciences; 2. School of Artificial Intelligence, University of Chinese Academy of Sciences; 3. School of Future Technology, University of Chinese Academy of Sciences; 4. Center for Excellence in Brain Science and Intelligence Technology, Chinese Academy of Sciences
First Author Affiliation | Institute of Automation, Chinese Academy of Sciences
Corresponding Author Affiliation | Institute of Automation, Chinese Academy of Sciences
Recommended Citation (GB/T 7714) | Li, Gang, Liu, Zejian, Li, Fanrong, et al. Block Convolution: Towards Memory-Efficient Inference of Large-Scale CNNs on FPGA[J]. IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, 2021(2021.5): 1-1.
APA | Li, Gang, Liu, Zejian, Li, Fanrong, & Cheng, Jian. (2021). Block Convolution: Towards Memory-Efficient Inference of Large-Scale CNNs on FPGA. IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, (2021.5), 1-1.
MLA | Li, Gang, et al. "Block Convolution: Towards Memory-Efficient Inference of Large-Scale CNNs on FPGA". IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems 2021.5 (2021): 1-1.
Files in This Item |
File Name/Size | Document Type | Version | Access | License
Block Convolution.pd (6174KB) | Journal Article | Author's Accepted Manuscript | Open Access | CC BY-NC-SA
Unless otherwise specified, all content in this system is protected by copyright, with all rights reserved.