Research on GMM-Based Sound Signal Classifiers


Title	Research on GMM-Based Sound Signal Classifiers
Alternative title	Sound Classification Based on GMMs
Author	王志强 (Wang Zhiqiang)
Date	2003-06-02
Degree type	Master of Engineering
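Abstract (translated from Chinese)	Sound signal classification is an important application of pattern recognition. In recent years it has drawn extensively on specific pattern recognition applications such as speech recognition, and has produced research results that reflect its own characteristics. The field still faces many problems, chiefly how to extract relatively stable features that characterize the signal itself, and how to improve training so that, given existing features, the resulting models generalize better. This thesis describes in detail the author's work during master's study on sound signal classifiers based on Gaussian mixture models (GMMs). First, we studied recent voiceprint-based speaker recognition techniques, in particular text-independent speaker recognition. After analyzing several factors that affect speaker recognition performance, we built a text-independent speaker recognition system. To address the unreasonable distance measures used in existing modeling techniques and their neglect of information shared among data, we proposed a variance-sharing structural clustering method that mines the information shared by data in similar classes and characterizes the data distribution more effectively. This method alleviates the data sparseness problem to some extent. Second, building on the framework and techniques of the GMM-based speaker recognition system, we developed a sound signal monitoring system for oilfield pipelines, which performed well in practical use. During development, we carefully analyzed how the MFCC feature extraction method captures sound signal features and adapted it to the actual signals. To meet practical engineering requirements, we also implemented the sound classification system as a fixed-point program and modified the step in recognition that sums the probability scores of all Gaussian components, which increased recognition speed while leaving the recognition rate essentially unchanged. Third, during development of the oilfield pipeline sound classification system, we collected a large body of field signals, providing a data foundation for further analysis of sound signals. All the algorithms and improvements proposed in this thesis were tested extensively on real signals to verify their effectiveness, and the results were analyzed after the experiments.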
English abstract	Sound classification is an important application of pattern recognition. In recent years, sound classification has benefited greatly from research in pattern recognition fields such as automatic speech recognition, and it is becoming a viable independent research area. However, many problems remain to be solved, chiefly how to extract more robust features from sound signals and how to improve the training process to obtain robust models based on current feature extraction techniques. This thesis presents in detail the author's research on these problems during master's study. First, we studied speaker recognition based on speech signals, especially text-independent speaker recognition. After analyzing several factors in speaker recognition, we built a text-independent speaker recognition system. We found that existing distance measures are unreasonable and that the mutual information between data of different sound classes is neglected. We therefore proposed a novel method, the Covariance-tied Clustering Method, which mines the mutual information between data of different classes and incorporates more information about the data distribution into the distance measure. This method can alleviate data sparseness to some extent. Second, based on the GMM framework of the text-independent speaker recognition system, we developed a sound monitoring system for oil pipelines. This system performs well in practice. During its development, we analyzed the MFCC feature extraction method for speech signals and improved it for oil pipeline sounds. At the same time, to suit a hardware implementation, we implemented the sound classification system as a fixed-point (integer) program and modified the recognition process to save time while keeping the recognition rate almost unchanged. Third, while developing the oil pipeline sound monitor, we collected pipeline sounds and constructed a corpus of real recordings to support further analysis of pipeline sounds.
Keywords	Speech recognition technology; Pattern recognition; Gaussian mixture models (GMMs); MFCC; Sound Classification System; Automatic Speech Recognition; Text-independent Speaker Recognition; Covariance-t
Language	Chinese
Document type	Thesis
Item identifier	http://ir.ia.ac.cn/handle/173211/6845
Collection	Graduates_Master's theses
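Recommended citation (GB/T 7714)	Wang Zhiqiang. Research on GMM-Based Sound Signal Classifiers [D]. Institute of Automation, Chinese Academy of Sciences. Graduate School of the Chinese Academy of Sciences, 2003.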
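
Illustrative note: the abstract describes a classifier that scores feature frames against one GMM per sound class and mentions a modification to the sum over all Gaussian component scores that speeds up recognition. The record does not say what that modification was. The sketch below is therefore only an assumption-laden illustration, not the thesis's method: it shows standard diagonal-covariance GMM scoring with a log-sum-exp over components, next to one common shortcut (keeping only the best-scoring component per frame). The class names `class_a` and `class_b`, the model parameters, and the frames are hypothetical toy values.

```python
import numpy as np

def component_log_probs(frames, weights, means, variances):
    """Return log(w_k * N(x_t | mu_k, diag(var_k))) for every frame t and component k."""
    frames = np.asarray(frames, dtype=float)                # shape (T, D)
    diff = frames[:, None, :] - means[None, :, :]           # shape (T, K, D)
    log_norm = -0.5 * np.sum(np.log(2.0 * np.pi * variances), axis=1)  # shape (K,)
    quad = -0.5 * np.sum(diff ** 2 / variances[None, :, :], axis=2)    # shape (T, K)
    return np.log(weights)[None, :] + log_norm[None, :] + quad

def gmm_log_likelihood(frames, gmm, use_max=False):
    """Total log-likelihood of all frames under one GMM.

    use_max=False: exact sum over components (log-sum-exp).
    use_max=True:  assumed shortcut, keep only the best component per frame.
    """
    comp = component_log_probs(frames, *gmm)
    peak = comp.max(axis=1)
    if use_max:
        per_frame = peak
    else:
        per_frame = peak + np.log(np.exp(comp - peak[:, None]).sum(axis=1))
    return float(per_frame.sum())

def classify(frames, models, use_max=False):
    """Pick the class whose GMM gives the highest total log-likelihood."""
    scores = {name: gmm_log_likelihood(frames, gmm, use_max)
              for name, gmm in models.items()}
    return max(scores, key=scores.get), scores

# Hypothetical toy models: (weights, means, variances), two components in 2-D.
models = {
    "class_a": (np.array([0.5, 0.5]),
                np.array([[0.0, 0.0], [1.0, 1.0]]),
                np.ones((2, 2))),
    "class_b": (np.array([0.5, 0.5]),
                np.array([[5.0, 5.0], [6.0, 6.0]]),
                np.ones((2, 2))),
}
# Hypothetical feature frames (in a real system these would be MFCC vectors).
frames = np.array([[0.1, -0.2], [0.9, 1.1], [0.4, 0.6]])

label_exact, _ = classify(frames, models)
label_max, _ = classify(frames, models, use_max=True)
```

The max shortcut never exceeds the exact score, because the largest term of a sum of positive values is at most the sum itself. When one component dominates each frame, the two scores tend to rank classes the same way, which is why this kind of approximation can cut computation with little effect on the decision.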