Research on Automatic Pronunciation Assessment and Diagnosis


	Research on Automatic Pronunciation Assessment and Diagnosis (original title: 自动发音评估与诊断技术研究)
Alternative Title	Research on Automatic Pronunciation Assessment and Diagnosis
	徐爽
	2009-06-29
Degree Type	Doctor of Engineering
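Abstract (translated from Chinese)	Automatic pronunciation assessment and diagnosis is a core technology of computer-assisted language learning systems and computer-assisted language testing systems. Driven by the needs of practical systems, and building on an in-depth analysis of the characteristics and main problems of current mainstream pronunciation assessment and diagnosis techniques, this thesis examines pronunciation content verification, automatic pronunciation error diagnosis, and automatic assessment of overall pronunciation quality. Its main contributions and innovations are: 1. A pronunciation content verification method based on phoneme confusion is proposed and applied to pronunciation error diagnosis for isolated words and phrases. The method performs phoneme recognition by forced alignment with pronunciation variants and computes the confidence of the pronunciation content from a phoneme confusion matrix. It serves effectively as a system “firewall”, which makes the error diagnosis system more practical, and it is faster and more flexible than the widely used posterior-probability-based content verification method. 2. To address the shortcomings of the mainstream posterior-probability-based approach, pronunciation error diagnosis based on multiple knowledge sources is proposed. First, a diagnosis method based on pronunciation space is introduced; it describes the pronunciation space with multi-dimensional posterior probability features and so makes full use of the confusion knowledge among phoneme acoustic models. Second, prior knowledge of pronunciation errors is used to correct the posterior probability computation, yielding a pronunciation space method constrained by prior knowledge of pronunciation errors. Finally, the CMLLR speaker adaptation algorithm is introduced to exploit prior knowledge about the speaker. Experimental results show that these multi-knowledge-source methods effectively improve the performance of the pronunciation error diagnosis system. 3. An automatic pronunciation quality assessment system based on multi-feature fusion is built, comprising three modules: speech-recognition-based preprocessing, assessment feature extraction, and assessment. In addition to the mainstream assessment features, two content-integrity features are introduced: the word match score and the segment match score. Experimental results show that, in the paragraph-reading assessment task, the proposed content-integrity features play a very important role, and the overall performance of the system approaches that of human experts. The system has already been applied in a spoken-language examination assessment project.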
English Abstract	Automatic pronunciation assessment and diagnosis is a core component of many computer-aided language learning systems and computer-assisted language testing systems. This thesis studies pronunciation content verification, automatic pronunciation error diagnosis, and automatic assessment of overall pronunciation quality. Its main contributions are as follows: 1. A pronunciation content verification method based on phoneme confusion is proposed and applied to pronunciation error diagnosis for isolated words and phrases. The method uses forced alignment with pronunciation variants to perform phoneme recognition and computes a confidence measure from a phoneme confusion matrix. Experimental results show that the method not only acts effectively as a system “firewall” but is also superior to the mainstream posterior-probability-based method in speed and flexibility. 2. To address the disadvantages of the mainstream posterior-probability-based method, methods based on multiple knowledge sources are proposed. First, a method based on pronunciation space is introduced, which uses multi-dimensional posterior probability features to describe the pronunciation space of users and makes full use of the confusion knowledge among phoneme models. Second, prior knowledge of pronunciation errors is introduced to obtain modified multi-dimensional posterior probability features, leading to a new error diagnosis method based on a restricted pronunciation space. Finally, CMLLR speaker adaptation is introduced to make full use of prior knowledge about the speaker. Experimental results show that these approaches effectively improve the performance of pronunciation error diagnosis. 3. A system for automatic pronunciation quality assessment of paragraph reading is built, consisting of a speech-recognition-based pre-processing module, a feature extraction module, and an assessment module. In the feature extraction module, the word match score (WMS) and the segment match score (SMS) are used to assess the integrity of the content. Experimental results show that these two kinds of features play very important roles in assessing paragraph reading. The overall performance of the system is comparable with that of human raters, and the system has been used in actual spoken-language evaluation projects.
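As a rough illustration of the confusion-matrix-based confidence idea in contribution 1, the following is a minimal sketch and not the thesis's actual method. It assumes that the recognized phoneme sequence is already aligned one-to-one with the expected sequence, that the confusion matrix stores P(recognized | expected), and that the confidence is the geometric mean of those probabilities. The names `content_confidence`, `CONFUSION`, and `FLOOR`, and all probability values, are hypothetical.

```python
import math

# Hypothetical floor for phoneme pairs missing from the confusion matrix,
# so that one unseen pair does not produce log(0).
FLOOR = 1e-6

# Hypothetical confusion matrix: CONFUSION[expected][recognized] is an
# assumed probability that `expected` is recognized as `recognized`.
CONFUSION = {
    "s": {"s": 0.8, "th": 0.15, "z": 0.05},
    "ih": {"ih": 0.7, "iy": 0.3},
    "t": {"t": 0.9, "d": 0.1},
}


def content_confidence(expected, recognized, confusion):
    """Geometric mean of P(recognized | expected) over aligned phoneme pairs.

    Assumes both sequences are non-empty and already aligned one-to-one.
    """
    if not expected or len(expected) != len(recognized):
        raise ValueError("sequences must be non-empty and of equal length")
    log_sum = 0.0
    for exp_ph, rec_ph in zip(expected, recognized):
        prob = confusion.get(exp_ph, {}).get(rec_ph, 0.0)
        log_sum += math.log(max(prob, FLOOR))
    return math.exp(log_sum / len(expected))


expected = ["s", "ih", "t"]
exact = content_confidence(expected, ["s", "ih", "t"], CONFUSION)
confusable = content_confidence(expected, ["th", "iy", "t"], CONFUSION)
unrelated = content_confidence(expected, ["k", "ae", "m"], CONFUSION)

# An exact match scores highest, a plausible mispronunciation scores lower,
# and unrelated content falls to the floor.
print(exact, confusable, unrelated)
```

Under these assumptions, an utterance whose score falls below some tuned threshold could be rejected as off-target content before error diagnosis runs, which is the “firewall” role the abstract describes.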
Keywords	Pronunciation Error Diagnosis; Pronunciation Quality Assessment; Pronunciation Content Verification; Posterior Probability; Pronunciation Space; Prior Knowledge of Pronunciation Errors; Speaker Adaptation
Language	Chinese
Document Type	Thesis
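Item Identifier	http://ir.ia.ac.cn/handle/173211/6222
Collection	Graduates_Doctoral Dissertations
Recommended Citation (GB/T 7714)	徐爽. Research on Automatic Pronunciation Assessment and Diagnosis[D]. Institute of Automation, Chinese Academy of Sciences. Graduate University of Chinese Academy of Sciences, 2009.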

Files in This Item
File Name/Size	Document Type	Version Type	Access Type	License
CASIA_20041801462807 (887KB)			Not currently open	CC BY-NC-SA