Artificial neural networks have attracted wide interest because of their distinctive advantages, and research on them has been extensive and fruitful. Classical neural networks, especially the multi-layer perceptron (MLP), are well studied. However, finding globally optimal parameters for an MLP trained with the back-propagation (BP) algorithm has always been difficult. In addition, although a static neural network has strong discriminative power and generalization ability, when applied to speech recognition it has difficulty handling speech signals whose duration varies from utterance to utterance. This contradiction between a fixed model and a variable duration is the greatest difficulty; it has long motivated research but remains unresolved. This thesis proposes new methods to address these problems. First, we introduce a genetic algorithm into the training of neural networks for speech recognition, using its global search ability to try to raise recognition accuracy. Because of the limited performance of our computer, this part of the work is only exploratory. Second, to address the contradiction between the fixed structure of the MLP and the variable duration of the speech signal, we propose a neural network speech recognition method based on feature-parameter curves. It comprises three variants: a feature-frame parameter method that includes duration information, a broken-line (piecewise-linear) fitting method with rule-based segmentation, and a second-order curve fitting method with segmentation based on dynamic programming. This method greatly lightens the burden on the neural network and greatly reduces its complexity. It not only overcomes the intrinsic contradiction between the fixed model and the variable signal duration, but also avoids the multi-model matching approach, which depends excessively on models to approximate signals.
The most notable features of this method are that its training and testing times are about an order of magnitude shorter than those of other NN-based methods, which makes its speed comparable to that of the HMM method, currently the prevalent method in speech recognition, while its recognition accuracy approaches that of other NN-based methods. In addition, after analyzing the usual BP training modes of the multi-layer perceptron, we devise an adjusted BP training method. Experiments show that it converges much faster than the batch-processing method. This method is therefore an effective one.
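To illustrate the idea of feeding a fixed-size network from a variable-length utterance, the following is a minimal sketch of second-order curve fitting over segments of a single feature track. It is not the thesis's implementation: the function name `curve_features`, the parameters `n_segments` and `degree`, the normalized time axis, and the appended duration value are all assumptions made for illustration. In particular, the sketch uses equal-length segments, whereas the thesis describes rule-based and dynamic-programming segmentation.

```python
import numpy as np


def curve_features(track, n_segments=4, degree=2):
    """Map a variable-length 1-D feature track to a fixed-length vector.

    Hypothetical helper: segments are equal-length here (an assumption),
    not chosen by rules or dynamic programming as in the thesis.
    """
    track = np.asarray(track, dtype=float)
    n = len(track)
    if n < n_segments * (degree + 1):
        raise ValueError("track too short for the requested fit")

    # Integer segment boundaries; each segment has at least degree + 1 frames.
    bounds = [i * n // n_segments for i in range(n_segments + 1)]

    coeffs = []
    for start, end in zip(bounds[:-1], bounds[1:]):
        segment = track[start:end]
        # Normalized time in [0, 1], so coefficients are comparable
        # across segments of different lengths (an assumption).
        t = np.linspace(0.0, 1.0, len(segment))
        # Least-squares polynomial fit, highest-order coefficient first.
        coeffs.append(np.polyfit(t, segment, degree))

    # Append the frame count so duration information is kept (an assumption).
    return np.concatenate(coeffs + [np.array([float(n)])])
```

With the defaults, every utterance yields 4 x 3 coefficients plus one duration value, so tracks of 40 frames and 95 frames both map to a vector of length 13, which a fixed-input MLP could accept.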