Research on Key Technologies of Many-Core Brain-Inspired Processors (众核类脑处理器关键技术研究)
Li Qianpeng (李千鹏)
2024-05-13
Pages: 82
Degree type: Master's
Chinese abstract

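In recent years, advances in computer science and integrated circuit technology have steadily increased chip computing power, and artificial intelligence, typified by artificial neural networks, has developed rapidly and driven intelligent transformation across industries, with good results in computer vision, natural language processing, content generation, robot control, autonomous driving, and other fields and tasks. However, as algorithm models keep growing in scale, the high energy consumption of artificial intelligence can no longer be ignored. Brain-inspired intelligence/computing draws on the physiological structure and information-processing mechanisms of the biological brain, emulating the brain comprehensively from morphological structure to computational process in order to reach a higher level of intelligence at lower energy consumption. As the main implementation form of brain-inspired computing, the spiking neural network (SNN) is driven by binary spike events that are sparse in space and time, and it has attracted wide attention for its energy-efficient mode of computation. However, because an SNN must process temporal information and store the model's state variables, existing general-purpose computing platforms and accelerators suffer from high resource overhead, high computational latency, and poor model adaptability. To address this, this thesis analyzes the computational characteristics of brain-inspired models and designs a many-core brain-inspired processor that balances flexibility and low power consumption. The main work is as follows: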

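A hierarchical many-core brain-inspired processor based on an event-driven working mode is proposed, comprising neuron cores, scheduling cores, cortical column cores, and routers. Built on a brain-inspired instruction set, the processor can support a rich range of models and algorithms; the scheduling core provides the scheduling strategy for spike events; and the routers and inter-chip communication interfaces ensure the chip's scalability.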

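To address model mapping under chip resource constraints and parallel algorithm execution, the basic mapping flow for computational models, learning algorithms, and network topologies is studied in a preliminary way, and several schemes for expanding neuron fan-in and fan-out are proposed, along with a mapping scheme for distributed storage and computation of models. Based on the hardware architecture and the software mapping scheme, a network model mapper and behavioral-level instruction-set and chip-system simulators are designed to accelerate on-chip algorithm simulation and verification.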

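Finally, a brain-inspired software and hardware platform is built, comprising a brain-inspired software framework, a processor hardware simulator, and a hardware environment based on a Xilinx VU13P FPGA and a DVS camera, and several applications are verified on it. In inference and classification on multiple datasets, the processor achieves classification accuracy close to that of a GPU while reducing the energy-delay product by a factor of 7 to 143; when simulating complex neuron models, the relative computation error does not exceed 5%; on the MNIST dataset it achieves 96.7% on-chip learning classification accuracy; when decoding brain-computer interface signals, it achieves decoding with a correlation coefficient of 0.58; and gesture classification based on a DVS camera is demonstrated in a real-world scene. In experimental validation the designed processor shows good usability and scalability. Back-end evaluation in the SMIC 28 nm process gives an area of 431 mm², power consumption of 0.54 W, and 1.08 pJ per synaptic operation, and the design is compared with current state-of-the-art brain-inspired processors.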

English abstract

In recent years, with the continuous development of computer science and integrated circuit technology, chip computing power has grown steadily, and artificial intelligence represented by artificial neural networks has developed rapidly, driving intelligent transformation in many industries. It has achieved good results in computer vision, natural language processing, content generation, robot control, autonomous driving, and other fields and tasks. However, as algorithm models keep growing in scale, the high energy consumption of artificial intelligence has become difficult to ignore. Brain-inspired intelligence, or brain-inspired computing, draws on the physiological structure and information-processing mechanisms of the biological brain and emulates the brain from morphological structure to computational process, aiming for a higher level of intelligence and lower energy consumption. As the main form of brain-inspired computing, the spiking neural network (SNN) is driven by spatio-temporally sparse binary spike events and has attracted much attention for its energy-efficient computing paradigm. However, because an SNN must process temporal information and store model state variables, existing general-purpose computing platforms and accelerators suffer from high resource overhead, high computational latency, and poor model adaptability. To this end, this thesis analyzes the computational characteristics of brain-inspired models and designs a many-core brain-inspired processor that balances flexibility and low power consumption. The main contributions are as follows:

Drawing on the structural characteristics and information-processing mechanisms of the cerebral cortex network, a hierarchical many-core brain-inspired processor based on an event-driven working mode is proposed, comprising neuron cores, scheduling cores, cortical column cores, and routers. The processor supports a rich range of models and algorithms through a brain-inspired instruction set, the scheduling core provides a scheduling strategy for spike events, and the routers and inter-chip communication interfaces ensure the scalability of the chip.
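As an illustration of what an event-driven working mode means for a spiking neuron, the following minimal Python sketch updates a leaky integrate-and-fire (LIF) neuron only when an input spike arrives, applying the membrane decay lazily for the elapsed time. The LIF model, the class name `LIFNeuron`, and all parameter values are assumptions chosen for illustration; the abstract does not describe the processor's neuron model or instruction set at this level.

```python
import math
from dataclasses import dataclass


@dataclass
class LIFNeuron:
    """Hypothetical leaky integrate-and-fire neuron updated only on events."""
    tau: float = 20.0        # membrane time constant in ms (illustrative value)
    threshold: float = 1.0   # firing threshold (illustrative value)
    v: float = 0.0           # membrane potential
    last_t: float = 0.0      # time of the most recent input event in ms

    def on_spike(self, t: float, weight: float) -> bool:
        """Handle one input spike at time t; return True if the neuron fires."""
        # Apply the leak for the whole interval since the previous event,
        # so no work is done while the input is silent.
        self.v *= math.exp(-(t - self.last_t) / self.tau)
        self.last_t = t
        self.v += weight
        if self.v >= self.threshold:
            self.v = 0.0     # reset after firing
            return True
        return False


def run(events, neuron):
    """Feed time-sorted (time, weight) events; return the output spike times."""
    return [t for t, w in events if neuron.on_spike(t, w)]


# Two closely spaced inputs sum past the threshold; widely spaced ones do not.
close_inputs = run([(0.0, 0.6), (1.0, 0.6)], LIFNeuron())
sparse_inputs = run([(0.0, 0.6), (100.0, 0.6)], LIFNeuron())
```

The cost of this scheme scales with the number of spike events rather than the number of time steps, which is the property that makes sparse spike traffic cheap to process.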

To address the model mapping problem under chip resource constraints and parallel algorithm execution, we conduct a preliminary study of the basic mapping flow for computational models, learning algorithms, and network topologies, and propose several neuron fan-in and fan-out expansion schemes together with a mapping scheme for distributed storage and computation of models. Based on the hardware architecture and the software mapping scheme, we also design a network model mapping tool and behavioral-level instruction-set and chip-system simulators to accelerate on-chip algorithm simulation and verification.
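The abstract does not say how the fan-in expansion schemes work. One generic way to handle a neuron whose inputs exceed a per-core limit is to split the inputs into groups, accumulate a partial sum per group, and combine the partial sums in an aggregating step. The sketch below shows only that generic idea; the function names, the `max_fan_in` limit, and the use of graded partial sums are assumptions, not the thesis's method.

```python
def split_fan_in(input_ids, max_fan_in):
    """Partition input indices into groups that each fit a fan-in limit."""
    if max_fan_in < 1:
        raise ValueError("max_fan_in must be at least 1")
    return [input_ids[i:i + max_fan_in]
            for i in range(0, len(input_ids), max_fan_in)]


def partial_sums(weights, spikes, groups):
    """Weighted input accumulated per group (one group per hypothetical core)."""
    return [sum(weights[i] * spikes[i] for i in group) for group in groups]


# Hypothetical neuron with 10 inputs mapped onto cores limited to 4 inputs each.
weights = [3, -1, 2, 5, 0, 4, -2, 1, 6, -3]
spikes = [1, 0, 1, 1, 0, 1, 1, 0, 1, 1]   # binary spike inputs for one step
groups = split_fan_in(list(range(len(weights))), max_fan_in=4)

# The aggregating step sums the per-core results.
total = sum(partial_sums(weights, spikes, groups))
direct = sum(w * s for w, s in zip(weights, spikes))
```

Because addition is associative, the aggregated value equals the direct weighted sum, so the split changes where the work is done but not the neuron's input current.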

Finally, a brain-inspired hardware and software platform is built, comprising a brain-inspired software framework, a processor hardware simulator, and a hardware test platform based on a Xilinx VU13P FPGA and a DVS camera. Multiple applications are mapped and verified on the platform. In inference and classification on multiple datasets, the brain-inspired processor achieves classification accuracy similar to that of a GPU, with the energy-delay product reduced by a factor of 7 to 143; when simulating complex neuron models, the relative computation error is no more than 5%; it achieves an on-chip learning classification accuracy of 96.7% on the MNIST dataset; when decoding brain-computer interface signals, it achieves decoding with a correlation coefficient of 0.58; and gesture classification based on a DVS camera is demonstrated in a real-world scene. The designed brain-inspired processor shows good usability and scalability in experimental validation. Back-end evaluation in the SMIC 28 nm process gives an area of 431 mm², power consumption of 0.54 W, and 1.08 pJ per synaptic operation, and the design is compared with current state-of-the-art brain-inspired processors.
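For readers unfamiliar with the metrics above, the energy-delay product (EDP) is the energy consumed by a task multiplied by the time it takes, so a lower value is better on both axes. The Python sketch below shows how a reduction factor such as "7 to 143" would be computed. The GPU and processor figures in it are hypothetical placeholders, since the abstract does not give the underlying measurements. The last function divides the reported power by the reported energy per synaptic operation; the result is only a rough upper bound that assumes all power is spent on synaptic operations, which the abstract does not state.

```python
def energy_delay_product(energy_j: float, delay_s: float) -> float:
    """EDP in joule-seconds: energy for the task times its latency."""
    return energy_j * delay_s


def edp_reduction(baseline_edp: float, candidate_edp: float) -> float:
    """How many times smaller the candidate's EDP is than the baseline's."""
    if candidate_edp <= 0:
        raise ValueError("candidate_edp must be positive")
    return baseline_edp / candidate_edp


def implied_sops_per_second(power_w: float, energy_per_sop_j: float) -> float:
    """Synaptic operations per second if ALL power went to synaptic operations."""
    return power_w / energy_per_sop_j


# Hypothetical measurements, NOT taken from the thesis.
gpu_edp = energy_delay_product(energy_j=2.0, delay_s=10e-3)
chip_edp = energy_delay_product(energy_j=0.05, delay_s=20e-3)
reduction = edp_reduction(gpu_edp, chip_edp)

# Figures reported in the abstract: 0.54 W and 1.08 pJ per synaptic operation.
upper_bound_rate = implied_sops_per_second(0.54, 1.08e-12)
```

Note that in the hypothetical case the processor is slower than the GPU yet still has the lower EDP, because its energy saving outweighs the added latency.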

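Keywords: brain-inspired processor; architecture; brain-inspired computing; spiking neural network
Language: Chinese
Document type: Thesis
Item identifier: http://ir.ia.ac.cn/handle/173211/57299
Collection: Graduates_Master's Theses
Recommended citation
GB/T 7714
Li Qianpeng (李千鹏). Research on Key Technologies of Many-Core Brain-Inspired Processors (众核类脑处理器关键技术研究)[D], 2024.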