Research on New Neural Network Speech Recognition Methods Based on Genetic Algorithms and on Feature Parameter Curves
朱山
Degree type: Master of Engineering
Advisor: 陈道文
1995-06-01
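Degree-granting institution: Institute of Automation, Chinese Academy of Sciences
Place of conferral: Institute of Automation, Chinese Academy of Sciences
Major: Pattern Recognition and Intelligent Systems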
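Abstract: Artificial neural networks have attracted wide interest because of their distinctive advantages, and have given rise to a large body of research and results. Classical neural networks, the multi-layer perceptron in particular, are among the more mature methods. However, finding globally optimal network parameters for a multi-layer perceptron trained with the error back-propagation algorithm has long been a difficult problem. Static neural networks used for speech recognition have strong discriminative power and generalization ability, yet they handle time-varying speech signals poorly; the conflict between a fixed model and varying signal duration is a difficulty that researchers have long tried, without much success, to overcome.
This thesis studies these problems and proposes new solutions. First, a genetic algorithm is introduced into neural network training, using the genetic algorithm's capacity to search for a global optimum to try to raise the recognition rate of a neural network speech recognizer. This part of the work is exploratory.
Second, to address the conflict between the fixed multi-layer perceptron model and the varying duration of speech signals, the thesis proposes a neural network speech recognition approach based on feature parameter curves. It comprises a feature-frame parameter method that incorporates duration information, a broken-line fitting method with rule-based segmentation, and a curve fitting method with segmentation based on dynamic programming. This approach, presented as having broad guiding significance, greatly lightens the burden of neural network training and greatly reduces the complexity of the network required. It resolves the inherent conflict between fixed model and varying duration at its root, and it avoids multi-model matching methods that depend excessively on models to approximate signal features. More notably, its training and recognition times are more than an order of magnitude shorter than those of other neural network methods, its recognition time differs little from that of HMM, the current mainstream method, and its recognition rate is close to that of other neural network methods.
In addition, after analyzing the common ways of training a multi-layer perceptron with the BP algorithm, the thesis proposes an improved BP training algorithm. Experiments show that, without loss of recognition rate, the improved algorithm converges much faster than the usual batch-mode training algorithm, which makes it an effective training algorithm.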
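As an illustration of the first idea in the abstract, training network weights with a genetic algorithm instead of gradient descent, here is a minimal sketch. It is not the thesis's implementation: the record does not describe the chromosome encoding, the genetic operators, or the network size. The flat real-valued chromosome, truncation selection, one-point crossover, Gaussian mutation, the XOR toy data, and all names (`forward`, `error`, `evolve`) are assumptions made for this example.

```python
import math
import random

# Toy data (XOR); a stand-in for real speech feature vectors (assumption).
DATA = [((0.0, 0.0), 0.0), ((0.0, 1.0), 1.0), ((1.0, 0.0), 1.0), ((1.0, 1.0), 0.0)]
HIDDEN = 3
# Each hidden unit has 2 input weights + 1 bias; the output unit has HIDDEN weights + 1 bias.
N_WEIGHTS = HIDDEN * 3 + HIDDEN + 1


def sigmoid(x):
    # Clamp the argument so math.exp cannot overflow.
    return 1.0 / (1.0 + math.exp(-max(-60.0, min(60.0, x))))


def forward(weights, inputs):
    """Run a 2-HIDDEN-1 perceptron whose parameters are one flat chromosome."""
    hidden = []
    for h in range(HIDDEN):
        w0, w1, b = weights[3 * h: 3 * h + 3]
        hidden.append(sigmoid(w0 * inputs[0] + w1 * inputs[1] + b))
    out = weights[3 * HIDDEN:]  # HIDDEN output weights followed by the output bias
    return sigmoid(sum(w * h for w, h in zip(out, hidden)) + out[-1])


def error(weights):
    """Sum of squared errors over the toy set (lower is fitter)."""
    return sum((forward(weights, x) - y) ** 2 for x, y in DATA)


def evolve(generations=200, pop_size=30, mutation_std=0.5, seed=0):
    """Evolve weight vectors; returns the best chromosome and the best error per generation."""
    rng = random.Random(seed)
    pop = [[rng.uniform(-1, 1) for _ in range(N_WEIGHTS)] for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        pop.sort(key=error)
        history.append(error(pop[0]))
        elite = pop[: pop_size // 2]           # truncation selection
        children = [elite[0][:]]               # elitism: the best survives unchanged
        while len(children) < pop_size:
            a, b = rng.sample(elite, 2)
            cut = rng.randrange(1, N_WEIGHTS)  # one-point crossover
            child = a[:cut] + b[cut:]
            # Mutate each gene with probability 0.1 by adding Gaussian noise.
            child = [w + rng.gauss(0, mutation_std) if rng.random() < 0.1 else w
                     for w in child]
            children.append(child)
        pop = children
    best = min(pop, key=error)
    history.append(error(best))
    return best, history
```

Because the best chromosome is copied unchanged into each new generation, the recorded best error never increases; whether it reaches a global optimum on a real task depends on population size, operators, and compute budget, none of which this sketch tries to tune.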
Other abstract (English): Artificial neural networks have attracted wide interest because of their unique advantages, and have produced a great deal of research and results. Classical neural networks, especially the multi-layer perceptron, are well researched. However, finding globally optimal parameters for a multi-layer perceptron trained by the back-propagation algorithm has always been difficult. A static neural network applied to speech recognition has strong discriminative power and generalization ability, but it has difficulty dealing with speech signals whose duration varies. This conflict between a fixed model and varying duration is the greatest difficulty; researchers have long tried to overcome it, and it remains unresolved. In view of these problems, this thesis puts forward new methods to solve them. First, we introduce a genetic algorithm into the training process of a neural network for speech recognition, and use the genetic algorithm's ability to search for a global optimum to try to raise speech recognition accuracy. Because of the limited performance of our computer, this work is only exploratory. Second, in view of the conflict between the fixed structure of the multi-layer perceptron and the varying duration of the speech signal, we propose a neural network speech recognition method based on feature parameter curves. It comprises a feature-frame parameter method with duration information, a broken-line fitting method with rule-based segmentation, and a second-order curve fitting method with segmentation based on dynamic programming. This method greatly lightens the burden on the neural network and greatly reduces its complexity. It not only overcomes the intrinsic conflict between the fixed model and the varying signal duration, but also avoids multi-model matching methods, which depend excessively on models to approximate signals.
The most notable features of this method are that its training and testing times are shorter than those of other NN-based methods by about one order of magnitude, and that its speed is comparable with that of the HMM method, currently the prevalent method in the speech recognition field. At the same time, its recognition accuracy approaches that of other NN-based methods. Additionally, having analyzed the usual BP training modes for the multi-layer perceptron, we devise an adjusted BP training method. Experiments show that this method converges much faster than the batch-processing method, so it is an effective training method.
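To make the feature-parameter-curve idea concrete, the sketch below maps an utterance of any length to a fixed-length vector that a static perceptron could take as input. It is only an illustration under assumptions, not the method as implemented in the thesis: the record gives neither the segmentation rules nor the exact parameters kept per segment. Here the "rule" is assumed to be an even split into `n_segments` pieces, each feature track in each piece is fitted with a least-squares line, and the fitted start and end values are kept together with the frame count as duration information. The names `fit_line` and `curve_features` are hypothetical.

```python
def fit_line(values):
    """Least-squares line through (0, v0), (1, v1), ...; returns (slope, intercept)."""
    n = len(values)
    if n == 1:
        return 0.0, values[0]
    mean_t = (n - 1) / 2.0
    mean_v = sum(values) / n
    num = sum((t - mean_t) * (v - mean_v) for t, v in enumerate(values))
    den = sum((t - mean_t) ** 2 for t in range(n))
    slope = num / den
    return slope, mean_v - slope * mean_t


def curve_features(frames, n_segments=4):
    """Map a variable-length list of frames to a fixed-length feature vector.

    frames: list of equal-length feature vectors, one per analysis frame.
    Output length is 1 + n_segments * dims * 2: the frame count, then for each
    segment and each dimension the fitted value at the segment's start and end.
    """
    n_frames = len(frames)
    if n_frames < n_segments:
        raise ValueError("need at least one frame per segment")
    dims = len(frames[0])
    features = [float(n_frames)]  # duration information
    for s in range(n_segments):
        # Even split (assumed rule); every segment gets at least one frame.
        start = s * n_frames // n_segments
        end = (s + 1) * n_frames // n_segments
        segment = frames[start:end]
        for d in range(dims):
            slope, intercept = fit_line([f[d] for f in segment])
            features.append(intercept)                                # fitted start value
            features.append(intercept + slope * (len(segment) - 1))  # fitted end value
    return features
```

With two feature dimensions and four segments, an 8-frame and a 23-frame utterance both yield 17 numbers, so the network's input layer can stay fixed while the utterance length varies. A dynamic-programming segmentation, as named in the abstract, would replace the even split with boundaries chosen to minimize the total fitting error.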
Collection number: XWLW346
Other identifier: 346
Language: Chinese
Document type: Thesis
Item identifier: http://ir.ia.ac.cn/handle/173211/7120
Collection: Graduates_Master's Theses
Recommended citation
GB/T 7714
朱山. 基于遗传算法和基于特征参数曲线法的神经网络语音识别新方法研究[D]. 中国科学院自动化研究所. 中国科学院自动化研究所,1995.