Research on New Neural Network Speech Recognition Methods Based on Genetic Algorithms and Feature Parameter Curves

	Research on New Neural Network Speech Recognition Methods Based on Genetic Algorithms and Feature Parameter Curves
	朱山
	1995-06-01
Degree type	Master of Engineering
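Chinese abstract	Artificial neural networks have attracted wide interest because of their distinctive advantages, and have given rise to a large body of research and results. Classical neural networks, the multi-layer perceptron in particular, are among the more mature approaches. However, finding globally optimal network parameters for a multi-layer perceptron trained with the error back-propagation algorithm has long been a difficult problem. Static neural networks used for speech recognition have strong discriminative power and generalization ability, yet they handle time-varying speech signals poorly; the conflict between a fixed model and varying signal duration is a difficulty that researchers have tried hard to overcome with little success. This thesis studies these problems and proposes new solutions. First, a genetic algorithm is introduced into the neural network training process, using its capacity for global search to try to raise the recognition rate of a neural network speech recognizer. This part of the work is exploratory. Second, to address the conflict between the fixed multi-layer perceptron model and the varying duration of speech signals, the thesis proposes a neural network speech recognition method based on feature parameter curves, comprising a feature-frame parameter method that incorporates duration information, a polyline fitting method with rule-based segmentation, and a curve fitting method with dynamic-programming-based segmentation. This broadly applicable approach greatly lightens the neural network training burden and greatly reduces the complexity of the network required. It not only overcomes, at a fundamental level, the inherent conflict between a fixed model and varying duration, but also avoids multi-pattern matching methods that rely excessively on models to approximate signal features. A more notable feature is that its training and recognition times are more than an order of magnitude shorter than those of other neural network methods, its recognition time differs little from that of the HMM method, the current mainstream approach, and its recognition rate is close to that of other neural network methods. In addition, after analyzing the common ways of training multi-layer perceptrons with the BP algorithm, the thesis proposes an improved BP training algorithm. Experiments show that, while maintaining the recognition rate, the improved algorithm converges much faster than the commonly used batch-mode training algorithm, and is therefore an effective training algorithm.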
English abstract	Because of their unique advantages, artificial neural networks have attracted wide interest and produced a great deal of work and results. Classical neural networks, especially the multi-layer perceptron, are well researched. However, finding globally optimal parameters for a multi-layer perceptron trained by the back-propagation algorithm has always been difficult. Although a static neural network applied to speech recognition has strong discriminative power and generalization ability, it has difficulty dealing with speech signals whose duration varies. The contradiction between a fixed model and varying duration is the greatest difficulty; researchers have long tried to overcome it, but it remains unresolved. In view of these problems, this thesis puts forward new methods to solve them. First, we introduce a genetic algorithm into the training process of a neural network applied to speech recognition, and use the genetic algorithm's ability to search for a global optimum to try to raise speech recognition accuracy. Because of the limited performance of our computer, this work is only exploratory. Second, in view of the contradiction between the fixed structure of the multi-layer perceptron and the varying duration of the speech signal, we propose a neural network speech recognition method based on feature parameter curves, including a feature-frame parameter method with duration information, a polyline fitting method whose segmentation is based on rules, and a second-order curve fitting method whose segmentation is based on dynamic programming. This method greatly lightens the burden on the neural network and greatly reduces its complexity. It not only overcomes the intrinsic contradiction between the fixed model and the varying signal duration, but also avoids the multi-model matching method, which depends excessively on models to approximate signals. The most notable features of this method are that its training and testing times are shorter than those of other NN-based methods by about one order of magnitude, and that this speed is comparable with that of the HMM method, currently the prevalent method in the speech recognition field; at the same time, its recognition accuracy approaches that of other NN-based methods. Additionally, having analyzed the usual BP training modes for the multi-layer perceptron, we devise an adjusted BP training method. As the experiments show, this method converges much faster than the batch-processing method. Therefore, it is an effective method.
Language	Chinese
Document type	Thesis
Item identifier	http://ir.ia.ac.cn/handle/173211/7120
Collection	Graduates_Master's Theses
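Recommended citation (GB/T 7714)	朱山. Research on New Neural Network Speech Recognition Methods Based on Genetic Algorithms and Feature Parameter Curves[D]. Institute of Automation, Chinese Academy of Sciences. Institute of Automation, Chinese Academy of Sciences, 1995.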