Research on Conversion Learning Algorithms for Deep Spiking Neural Networks (深度脉冲神经网络转换学习算法研究)

	陈睿智
	2019-06
Pages	122
Degree type	Doctoral
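Chinese abstract (translated)	Spiking neural networks (SNNs) offer superior performance and broad application prospects in low power consumption, biological interpretability, and real-time brain-computer interaction, and therefore occupy an important place in brain-inspired computing. However, the non-differentiability of spike trains and the complex dynamics within the network make SNNs very difficult to train. Current SNN learning algorithms still cannot match the performance of deep convolutional neural networks (CNNs), which severely limits the application of SNNs. One feasible solution is SNN model conversion. This approach first builds an SNN similar to a CNN in structure and scale; it then obtains a high-performance CNN model using mature CNN training techniques; finally, a dedicated conversion algorithm converts the CNN model parameters into SNN parameters, yielding an SNN model with performance similar to that of the CNN. Existing research on SNN model conversion has produced some very promising results. However, because of the intrinsic differences between the two kinds of networks, SNN models cannot correspond one-to-one with CNN models; converted SNNs still fall short of practical requirements in recognition accuracy and convergence time, and existing algorithms can only produce shallow SNNs. To address these problems, this dissertation studies in detail the intrinsic relationship between different conversion algorithms and the recognition accuracy, convergence time, and other properties of the resulting SNNs, and proposes three conversion algorithms that yield deep SNN structures, providing an important theoretical basis and reference design for building efficient, high-performance deep SNNs. The main work and contributions are as follows:
(1) Multi-strength deep SNN model conversion algorithm. To address the spike saturation problem, this dissertation proposes a multi-strength SNN and its dynamic pruning algorithm. Relaxing the restriction on the strength of a neuron's output spikes makes it possible to obtain large-scale, multi-strength deep SNNs and speeds up the convergence of the converted network. Specifically, a multi-strength spiking neuron model is first proposed to relax the restriction on output spike strength; next, a multi-strength SNN structure is proposed to support the conversion of SNNs with deep structures; finally, to address the large amount of redundant computation in deep SNNs, three SNN compression algorithms are proposed that remove 85% of the operations in the original multi-strength network while keeping the approximation accuracy unchanged. Experimental results show that the proposed algorithm achieves recognition accuracies of 99.57% and 94.01% on the MNIST and CIFAR10 datasets, improving on the best contemporaneous results by 0.13% and 3.16%, respectively; the network converges within 80 time steps, a 3.75-fold speedup over contemporaneous model conversion algorithms.
(2) Low-latency deep SNN model conversion algorithm. This dissertation proposes a restricted-output pre-training algorithm and a false-spike inhibition algorithm, which yield converted SNNs with deep structures while significantly improving the convergence speed of the converted network. The restricted-output pre-training algorithm performs dynamic parameter normalization during CNN training, solving the spike saturation problem that arises when the SNN approximates the CNN. The false-spike inhibition algorithm formulates the inhibition of false spikes as a linear programming problem, greatly reducing the false spikes in the converted network. Experimental results show that a converted SNN using these two algorithms converges within 30 time steps and reaches a best approximation accuracy of 94% on the CIFAR10 dataset.
(3) Back-propagation-based very-low-latency deep SNN conversion learning algorithm. This dissertation proposes a very-low-latency deep SNN conversion learning algorithm based on back-propagation. Exploiting the connection between the two kinds of networks in model conversion, it uses back-propagation to learn the useful features carried in the timing of the spike trains in the converted network, yielding SNNs with deep structures while further reducing the convergence time. Specifically, the three strict conditions that must be satisfied to train deep SNNs with back-propagation are first analyzed and summarized; next, it is shown that converted model parameters give the SNN the ability to process information in the spatial domain, and a parameter initialization algorithm is designed that allows back-propagation to train deeper SNNs while reducing the number of training iterations; finally, an error minimization algorithm and a modified loss function are proposed to further improve the performance of back-propagation. Experimental results show that the algorithm proposed in this part further reduces the network convergence time to 4 and 10 time steps on the MNIST and CIFAR10 datasets, 7.5 times and 3 times faster than algorithm (2), respectively, while maintaining high recognition accuracy (99.44% and 91.52%, respectively).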
English abstract	Spiking neural networks (SNNs) offer excellent performance and broad application prospects in low-power hardware, biological interpretability, and real-time brain-machine interface applications, and therefore play an important role in neuromorphic computing. However, because of the complex dynamics and non-differentiable spike events in these networks, SNN learning algorithms still cannot match the performance of deep convolutional neural networks (CNNs). One feasible solution is to convert CNNs into SNNs (CNN-SNN conversion). In this approach, an SNN similar to the CNN in structure and model size is first constructed; a high-performance CNN is then obtained with mature CNN training algorithms; finally, the CNN weight parameters are converted into the SNN architecture to obtain a high-performance converted SNN. Research on CNN-SNN conversion algorithms has produced some promising results. However, converted SNNs still fall short of practical requirements in accuracy and convergence time, and existing conversion algorithms can only produce shallow SNNs. To address these problems, we study the inherent relationship between conversion algorithms and the accuracy, convergence time, and other properties of the resulting SNNs, and propose three conversion algorithms that provide an important theoretical basis and reference design for constructing high-performance deep SNNs. The main contributions of this dissertation are:
(1) Deep multi-strength SNN conversion algorithm. This dissertation presents a deep multi-strength conversion algorithm with dynamic pruning. Relaxing the restriction on spike strength yields a deep multi-strength SNN (M-SNN) and decreases the convergence time of the converted SNN. Specifically, a multi-strength spiking neuron model is first proposed. Then, an M-SNN structure is introduced to support large-scale deep SNNs. Finally, three aggressive dynamic pruning techniques are applied to reduce the computational operations by 85% while maintaining the same accuracy. Experiments show that the algorithm achieves 99.57% and 94.01% accuracy on the MNIST and CIFAR10 datasets, outperforming the best contemporaneous results by 0.13% and 3.16%, respectively. The M-SNN converges within 80 time steps, a $3.75\times$ speedup over contemporaneous conversion algorithms.
(2) Low-latency deep SNN conversion algorithm. This dissertation presents a restricted-output training method and a false-spike inhibition method, which significantly decrease the convergence time. The restricted-output training method normalizes the converted weights dynamically during the CNN training phase, solving the firing-rate saturation problem. The false-spike inhibition method reduces false spikes by formulating the inhibition problem as a linear programming problem. Experiments show that the converted SNN converges within 30 time steps and reaches 94% accuracy on the CIFAR10 dataset.
(3) Very-low-latency deep SNN conversion learning algorithm with back-propagation. This dissertation proposes a very-low-latency deep SNN conversion learning algorithm based on back-propagation. Exploiting the connection between CNNs and SNNs, the algorithm uses back-propagation to learn the useful features in the spike trains of the converted SNN, further reducing its convergence time. More concretely, three strict conditions for training deep SNNs with back-propagation are first analyzed; the converted model parameters are then shown to give the SNN the ability to process spatial information, and a weight initialization algorithm is introduced that supports back-propagation training of deeper SNNs and reduces the number of training epochs; finally, an error minimization method and a modified loss function are presented to further improve training performance. In experiments, the proposed algorithm achieves accuracies of 99.44% and 91.52% on the MNIST and CIFAR10 datasets, with convergence times of 4 and 10 time steps, respectively.
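Illustrative note: the rate-coding idea behind CNN-SNN conversion, and the saturation problem that contributions (1) and (2) target, can be sketched with a single neuron. The sketch below is not the dissertation's algorithm. It assumes a plain integrate-and-fire neuron with reset by subtraction and a constant input, and the function name `if_neuron_rate` and the `max_strength` parameter are hypothetical, introduced here only as a simplified stand-in for a multi-strength neuron.

```python
def if_neuron_rate(activation, time_steps, threshold=1.0, max_strength=1):
    """Simulate one integrate-and-fire neuron driven by a constant input.

    Returns the average output per time step, which is the quantity that
    should approximate the ReLU activation of the source CNN unit.
    max_strength=1 gives ordinary binary spikes; a larger value lets one
    time step emit a spike of integer strength up to max_strength
    (an assumption made for illustration, not the thesis model).
    """
    membrane = 0.0
    total = 0
    for _ in range(time_steps):
        membrane += activation
        # How many thresholds the membrane potential covers, capped by the
        # allowed spike strength and floored at zero (no negative spikes).
        strength = min(max_strength, int(membrane // threshold))
        strength = max(strength, 0)
        membrane -= strength * threshold  # reset by subtraction
        total += strength
    return total * threshold / time_steps


if __name__ == "__main__":
    print(if_neuron_rate(0.25, 100))                 # rate tracks the activation
    print(if_neuron_rate(-0.5, 100))                 # negative input never fires
    print(if_neuron_rate(2.5, 100))                  # binary spikes saturate at 1 per step
    print(if_neuron_rate(2.5, 100, max_strength=4))  # stronger spikes lift the cap
    print(if_neuron_rate(0.25, 2))                   # too few steps: rate not yet converged
```

Under these assumptions, the last call shows why convergence time matters: with only a few time steps the observed rate can differ substantially from the activation it is meant to encode.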
Keywords	spiking neural networks; back-propagation algorithm; spiking neural network learning algorithms
Language	Chinese
Document type	Thesis
Item identifier	http://ir.ia.ac.cn/handle/173211/23930
Collection	国家专用集成电路设计工程技术研究中心 (National ASIC Design Engineering Center)
Recommended citation (GB/T 7714)	陈睿智. 深度脉冲神经网络转换学习算法研究[D]. 中国科学院自动化研究所. 中国科学院自动化研究所,2019.

Files in this item
File name/size	Document type	Version type	Access type	License
Thesis.pdf (4129KB)	Thesis		Open access	CC BY-NC-SA