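Alternative Title: Application of Chinese Intonation Classification in Speech Recognition
Thesis Advisor: 徐波
Degree Grantor: Graduate University of Chinese Academy of Sciences (中国科学院研究生院)
Place of Conferral: Institute of Automation, Chinese Academy of Sciences (中国科学院自动化研究所)
Degree Discipline: Pattern Recognition and Intelligent Systems
Keyword: Chinese Intonation (汉语语调); SVM; Speech Recognition (语音识别); Intonation Classification (语调识别)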
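Abstract: Speech recognition technology has made great progress in both research and application, and spoken dialogue systems are one of its popular applications. As the front-end module of a dialogue system, speech recognition directly affects the performance of the whole system. In a conventional speech recognition system, however, the output is only a plain string of text, and the acoustic information carried in the speech is discarded. Leading international institutions have already studied this area extensively, with most of the work concentrating on dialog acts (Dialog Act) in English. Given that Chinese is a tonal language, the aim of this thesis is to add intonation information to the output of a conventional speech recognition engine so that it reflects the speaker's emotional state. Since the common intonations of Chinese include declarative, interrogative, and exclamatory, this thesis studies only these three for now. Because the study of Chinese intonation is still an open problem, the thesis first investigates individual features, then selects the most discriminative ones, and finally proposes a robust intonation recognition method. A support vector machine is used as the classifier in the feature selection and fusion tasks. In our experiments, the test speech is first recognized with standard speech recognition methods, and the recognition results are then used in place of the annotated transcripts. The purpose is to simulate intonation classification in a real speech recognition environment. The experimental data were carefully designed and comprise about 4,700 sentences covering the three intonations. The results show that our system achieves a recognition rate of 84.13% on the three-way intonation classification task.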
Other Abstract: Recent years have seen great improvements in the performance of automatic speech recognition, and dialog systems are one application of these advances. In a conventional speech recognition system, however, only plain text is presented as the final result, and all acoustic information in the speech is discarded. The aim of this thesis is to add intonation information to the traditional output of a speech recognition engine, since intonation is believed to reflect the emotion and intention of the speaker. We propose a robust approach to classifying several kinds of intonation, e.g. declarative, interrogative, and exclamatory. Since how to describe intonation is still an open question, different kinds of features are investigated to choose the most effective ones for intonation classification. A Support Vector Machine (SVM) is used as the classifier for feature selection and combination. In our experiments, we adopt speech-recognition-based methods and use the recognized results in place of the transcribed text. Our goal is to simulate intonation classification in a real speech recognition setting. The speech materials used in the experiments were carefully designed to include three intonations, about 4,700 sentences in total. Experimental results show that our system achieves an accuracy of 84.13% on the task of classifying three types of Chinese intonation.
Other Identifier: 200818014628068
Document Type: Thesis
Recommended Citation
GB/T 7714
徐景阳. 语调识别在语音识别中的应用[D]. 中国科学院自动化研究所. 中国科学院研究生院,2011.
Files in This Item:
File Name/Size: CASIA_20081801462806 (1237KB)
Access: Not yet open (暂不开放)
License: CC BY-NC-SA
Items in the repository are protected by copyright, with all rights reserved, unless otherwise indicated.