Research on Key Technologies of Prosody Diagnosis for Chinese CALL Systems

	Research on Key Technologies of Prosody Diagnosis for Chinese CALL Systems (汉语CALL系统韵律诊断关键技术的研究)
Alternative Title	Research on the Key Technology of Automatic Prosody Diagnosis
	朱涛涛
	2011-06-03
Degree Type	Doctor of Engineering
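Chinese Abstract	Automatic prosody diagnosis for Mandarin is one of the core technologies in computer-assisted language learning (CALL) systems and automatic spoken-language assessment systems. Driven by the needs of practical systems and building on an in-depth analysis of current mainstream prosody assessment techniques, this thesis examines the key problems of prosody diagnosis in learning Mandarin (Putonghua), namely the diagnosis of stress, tone, and intonation. The main contributions and innovations are: 1. A stress diagnosis method based on the fusion of multiple suprasegmental features is proposed. The stress features comprise pitch, duration, short-time energy, sub-band energy based on the TEO operator, and PLP features based on state concatenation, and the relativity of stress across sentences is also incorporated. The results show that the acoustic correlates of stress rank in effectiveness, from most to least, as: duration, sub-band energy, pitch, short-time energy, PLP features. In addition, tone-dependent modeling is proposed to improve stress diagnosis performance, yielding an effective stress diagnosis scheme. 2. A diagnosis method based on dominant-set clustering of monosyllabic tones is proposed for the first time, for tone diagnosis under heavy accent. The method suits this specific application setting; on a real data set, the correlation of its tone diagnosis reaches the level of human-to-human correlation. The method also determines the number of clusters automatically and, for monosyllabic tone diagnosis, can report the main tone errors and provide tone contour curves as feedback. Compared with K-means-based tone clustering error detection, it effectively improves tone error detection performance. 3. For tone diagnosis of heavily accented continuous speech, a clustering-based continuous tone diagnosis framework is proposed for the first time, forming a complete tone diagnosis system and technical framework. Multi-level tone clustering diagnosis methods for continuous speech are studied, with methods built on Unitone, Bitone, and Tritone units and their fusion with word segmentation. To address Tritone data sparsity, a decision-tree-based tone clustering diagnosis method is proposed for the first time. Experiments show that decision-tree-based tone clustering diagnosis effectively improves tone error diagnosis performance while providing fine-grained tone feedback. 4. The recognition and diagnosis of four intonation types in Mandarin CALL systems (declarative, interrogative, exclamatory, and imperative) are studied comprehensively. An intonation recognition and diagnosis method based on suprasegmental feature fusion is adopted; the basic features of pitch, duration, and intensity are analyzed together with higher-level prosodic features including pauses, degree of fluctuation, stress, and tone contour type, and SFFS feature selection is used to improve system performance. Experiments verify the feasibility and effectiveness of the method and give good results. The results show that the features rank in importance for intonation recognition and diagnosis as: pitch, duration, energy, pause, stress, degree of fluctuation, tone contour type.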
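The abstract names sub-band energy based on the TEO (Teager energy operator) as one of the stress features. As a rough illustration only, the sketch below computes the standard discrete Teager energy operator on a sample sequence. The record does not describe how the thesis splits the signal into sub-bands or frames, so no band-pass stage is shown, and the function names `teager_energy` and `mean_teager_energy` are hypothetical.

```python
import math


def teager_energy(x):
    """Discrete Teager energy operator: psi[n] = x[n]^2 - x[n-1] * x[n+1].

    Returns len(x) - 2 values, because the first and last samples
    lack one of the two neighbours the operator needs.
    """
    return [x[n] * x[n] - x[n - 1] * x[n + 1] for n in range(1, len(x) - 1)]


def mean_teager_energy(x):
    """Average TEO output over a frame (a hypothetical per-frame feature)."""
    psi = teager_energy(x)
    return sum(psi) / len(psi) if psi else 0.0


# Sanity check on a pure tone: for x[n] = A * cos(omega * n) the operator
# yields the constant A^2 * sin(omega)^2, so it tracks both amplitude
# and frequency of the tone.
A, omega = 0.5, 0.3
tone = [A * math.cos(omega * n) for n in range(200)]
expected = A * A * math.sin(omega) ** 2
```

In a sub-band setting one would presumably apply this operator to each band-passed signal separately; that filtering step is an assumption and is left out here.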
English Abstract	Automatic prosody diagnosis is an important part of computer-aided language learning and computer-assisted language testing systems. Building on an analysis of current mainstream methods for prosody assessment, this thesis explores the key issues of prosody diagnosis in learning standard Mandarin, including stress, tone, and intonation. Its contributions and novelties are as follows: 1. A method for diagnosing stress in Mandarin is proposed, based on the fusion of suprasegmental features including pitch, duration, energy, sub-band energy based on the TEO operator, and PLP. The relativity of stress within an utterance is taken into account, and the effectiveness of the different features is discussed. Experimental results show that the stress features rank in importance as: duration, sub-band energy, pitch, energy, PLP. In addition, a tone-dependent stress model is proposed to improve the performance of the stress detection system; application to a real dataset validates its feasibility and effectiveness. 2. An approach to tone error detection and diagnosis is proposed, based on dominant-set clustering, for strongly accented monosyllabic words. For this particular application, the cross-correlation between the dominant-set method and humans reaches the level of that between humans on a real data corpus, and the method outperforms the traditional k-means-based method. Moreover, the clustering-based detection method can provide more informative feedback in the form of F0 contours. 3. For tone error detection in strongly accented continuous speech, a novel framework and complete system are built on clustering techniques. For the first time, the feasibility and effectiveness of tone clustering for diagnosis are investigated from several perspectives, using Unitone, Bitone, and Tritone units; word segmentation is also introduced. To deal with data sparsity for Tritone units, a decision-tree-based approach is used to gather contextual information, and experimental results show that it achieves better performance. Compared with traditional tone error detection methods, one distinct advantage is that more precise feedback can be provided in the form of F0 contours during language learning. Experimental results show that the proposed method is feasible and effective. 4. Build an integral subsystem to carry out int...
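The abstract credits dominant-set clustering with finding tone clusters without a preset cluster count. The sketch below shows the generic peel-off form of dominant-set clustering using replicator dynamics; it is not the thesis's implementation. The Gaussian similarity, the `sigma`, `cutoff`, and `min_size` parameters, and all function names are assumptions made for illustration, and plain numeric vectors stand in for whatever F0-contour features the thesis uses.

```python
import math


def similarity_matrix(points, sigma=1.0):
    """Symmetric Gaussian similarities with a zero diagonal (assumed kernel)."""
    n = len(points)
    sim = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            d2 = sum((a - b) ** 2 for a, b in zip(points[i], points[j]))
            sim[i][j] = sim[j][i] = math.exp(-d2 / (2.0 * sigma * sigma))
    return sim


def dominant_set(sim, idx, iters=1000, tol=1e-8, cutoff=1e-4):
    """Extract one dominant set among the indices idx.

    Runs the replicator dynamics x_i <- x_i * (Ax)_i / (x^T A x) from the
    uniform distribution and returns the indices whose weight stays above
    cutoff. Assumes strictly positive off-diagonal similarities.
    """
    m = len(idx)
    x = [1.0 / m] * m
    for _ in range(iters):
        ax = [sum(sim[idx[i]][idx[j]] * x[j] for j in range(m)) for i in range(m)]
        xax = sum(x[i] * ax[i] for i in range(m))
        if xax <= 0.0:
            break
        new_x = [x[i] * ax[i] / xax for i in range(m)]
        delta = sum(abs(a - b) for a, b in zip(new_x, x))
        x = new_x
        if delta < tol:
            break
    return [idx[i] for i in range(m) if x[i] > cutoff]


def dominant_set_clustering(sim, min_size=2):
    """Peel off dominant sets one by one; the cluster count is not preset."""
    remaining = list(range(len(sim)))
    clusters = []
    while len(remaining) >= min_size:
        members = dominant_set(sim, remaining)
        if len(members) < min_size:
            break
        clusters.append(members)
        remaining = [i for i in remaining if i not in members]
    return clusters, remaining


# Toy data: four points near 0 and three points near 5 (1-D for brevity).
points = [[0.0], [0.1], [0.2], [0.3], [5.0], [5.1], [5.2]]
clusters, leftover = dominant_set_clustering(similarity_matrix(points))
```

On this toy data the two groups come out as separate clusters. Points left in `leftover` belong to no cohesive group, which in a diagnosis setting could plausibly be treated as outliers, though the record does not say how the thesis handles them.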
Keywords	Stress Detection; Tone Error Detection; Tone Diagnosis; Tone Clustering; Intonation Recognition; Intonation Diagnosis; Dominant Set; Decision Tree
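Language	Chinese
Document Type	Thesis
Identifier	http://ir.ia.ac.cn/handle/173211/6390
Collection	Graduates_Doctoral Dissertations
Recommended Citation (GB/T 7714)	朱涛涛. Research on Key Technologies of Prosody Diagnosis for Chinese CALL Systems[D]. Institute of Automation, Chinese Academy of Sciences. Graduate University of Chinese Academy of Sciences, 2011.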

Files in This Item
File Name/Size	Document Type	Version Type	Access Type	License
CASIA_20081801462808 (1196KB)			Not yet open	CC BY-NC-SA