Accelerate Convolutional Neural Network with a customized VLIW DSP
Guo Peng (1, 2); Ma Hong (1); Guo Ruoshan (1); Liu Zhuang (1); Li Pin (1); Wang Donglin (1)
2018-08
Conference name: 9th IEEE International Conference on Software Engineering and Service Science (ICSESS 2018)
Conference date: 2018-10
Conference location: Beijing
Abstract

Convolutional neural networks (CNNs) have achieved outstanding performance in many domains. However, state-of-the-art CNN models also incur massive computation and a huge memory footprint. To facilitate the deployment of CNNs on embedded platforms, many existing studies focus on designing dedicated hardware accelerators. However, there still exist many legacy DSP-based platforms that can also be exploited to accelerate CNN inference. In this work, we study the computation of CNNs on MaPU, a customized VLIW DSP. MaPU features a multi-granularity parallel memory system and a flexible programming model, which make it well suited to compute-intensive tasks. Through an in-depth analysis of CNN parallelism and the hardware architecture, we propose a kernel-expanded scheduling scheme that handles different kernel sizes uniformly. In our experiment on a face recognition network, MaPU achieves great performance and power efficiency.

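Indexed by: SCI
Language: English
Document type: Conference paper
Item identifier: http://ir.ia.ac.cn/handle/173211/23879
Collection: National ASIC Design Engineering Technology Research Center
Corresponding author: Guo Peng
Author affiliations: 1. Institute of Automation, Chinese Academy of Sciences
2. University of Chinese Academy of Sciences
First author affiliation: Institute of Automation, Chinese Academy of Sciences
Corresponding author affiliation: Institute of Automation, Chinese Academy of Sciences
Recommended citation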
GB/T 7714
Guo Peng,Ma Hong,Guo Ruoshan,et al. Accelerate Convolutional Neural Network with a customized VLIW DSP[C],2018.
Files included in this item
File name/size: ICSESS_2018_paper_29 (1173KB)
Document type: Conference paper
Access type: Open access
License: CC BY-NC-SA
File name: ICSESS_2018_paper_290.pdf
Format: Adobe PDF
Unless otherwise stated, all content in this system is protected by copyright, with all rights reserved.