Computer-Aided Spoken Language Assessment and Diagnostic Report

	Computer-Aided Spoken Language Assessment and Diagnostic Report
Alternative title	Computer-Aided Assessment of Spoken Language and the Generation of Diagnostic Report
	阳曦
	2008-05-20
Degree type	Master of Engineering
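Abstract (translated from Chinese)	As international exchange grows more frequent, the demand for multilingual talent is becoming more pressing, and the reach and fairness of spoken language tests have drawn wide attention. Traditional spoken language tests are constrained in time and place, costly to organize, subjective in their assessment, and limited in the feedback they give. In response, more and more computer technologies have been applied to education, opening up the new discipline of Computer-Aided Language Learning (CALL). As the discipline has developed, CALL systems integrating multiple Artificial Intelligence (AI) techniques have emerged, offering a new mode of learning and testing for spoken language education. Addressing the practical needs of spoken language testing, this thesis aims to use the objectivity and the fast, complex computing power of computers to assess spoken language automatically and to provide a scientific diagnostic report. After introducing the main modules of a CALL system and their key technologies, it investigates two aspects in depth: (1) Because no existing automatic pronunciation-quality scoring algorithm can provide reasonably reliable assessment results on its own, this thesis proposes a new method that fuses the scores of multiple automatic scoring algorithms to improve the agreement between automatic and human scores, and implements it with data fusion algorithms including Multiple Linear Regression (MLR) and Back Propagation (BP) neural networks. Experiments on a standard spoken language test data set show that, compared with any single machine score before fusion, the fused machine score has a higher correlation with the human score and a smaller error, which verifies the effectiveness and feasibility of the data fusion method. (2) Because existing spoken language testing methods cannot provide a systematic, timely, and informative personalized diagnostic report, this thesis builds, on top of a CALL system and using the information obtained during the system's scoring process, a personalized diagnostic report generation system tailored to each test taker's spoken language characteristics. The system not only gives a comprehensive assessment of the user's overall spoken language level, but also locates and diagnoses errors at the phoneme level and the word level in aspects such as pronunciation and prosody, and offers suggestions for correcting typical errors. This feedback mode greatly enriches the content of spoken language diagnostic information, distinguishes differences in level between test takers fairly precisely, and gives test takers strong guidance for further improving their spoken language.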
English abstract	As international communication becomes more frequent, the need for multilingual people is growing more urgent, so the popularization and fairness of spoken language tests attract wide attention. However, traditional spoken language tests have many problems, such as time and space limitations, high organizational cost, subjective assessment, and poor feedback. To deal with these problems, more and more computer technologies are being applied to education, and Computer-Aided Language Learning (CALL) has consequently come into being. Along with the development of this new discipline, many kinds of CALL systems that integrate artificial intelligence technologies have been put forward. These systems provide a whole new pattern of learning and testing for spoken language education. Facing the practical demands of spoken language testing, this dissertation focuses on assessing spoken language automatically and providing a scientific diagnostic report by exploiting the objectivity and the fast, complex computing capability of computers. Following a general introduction to the primary modules of CALL systems and their key technologies, the work of this dissertation comprises two contributions: (1) Because state-of-the-art automatic scoring algorithms for pronunciation quality cannot offer reasonably reliable assessment results on their own, this dissertation presents a new method that fuses the scores of several automatic scoring algorithms to increase the correlation between machine scores and human scores, and implements it with Multiple Linear Regression (MLR) and a Back Propagation (BP) neural network. Experimental results on a standard spoken language test database show that, compared with any single machine score before fusion, the fused scores have a higher correlation with the human scores and a smaller error. This demonstrates the feasibility and validity of the fusion approach.
(2) Current spoken language tests cannot provide a systematic, timely, and informative individual diagnostic report. To address this shortcoming, this dissertation designs a system that generates an individual diagnostic report according to each test taker's spoken language characteristics. The system is built on a CALL system and uses the information produced during the system's scoring process. It not only provides a summative assessment of the test taker's overall spoken language level, but also locates and diagnoses pronunciation and prosody errors at the phone level and the word level. In addition, it offers suggestions for improvement based on typical errors. This kind of feedback greatly enriches the content of spoken language diagnostic information. It can distinguish differences in level between test takers fairly precisely, and it offers strong guidance for the further improvement of test takers' spoken language.
Keywords	Computer-Aided Language Learning; Spoken Language Testing; Data Fusion; Diagnostic Report
Language	Chinese
Document type	Thesis
Item identifier	http://ir.ia.ac.cn/handle/173211/7425
Collection	Graduates_Master's Theses
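Recommended citation (GB/T 7714)	阳曦. Computer-Aided Spoken Language Assessment and Diagnostic Report [D]. Institute of Automation, Chinese Academy of Sciences. Graduate University of Chinese Academy of Sciences, 2008.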
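
The score fusion described in contribution (1) of the abstract can be illustrated with a minimal sketch of the multiple linear regression case. This record does not state which component scoring algorithms, features, or data the thesis uses, so everything below is an assumption made for illustration: the function names (`fit_mlr`, `fuse`, `pearson`), the two hypothetical machine scorers, and the toy scores are invented, and the BP neural network variant is not shown. The sketch fits regression weights to human scores by ordinary least squares and compares the correlation of the fused score with that of each individual machine score.

```python
from statistics import mean


def solve_linear_system(a, b):
    """Solve a @ x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular system: machine scores are collinear")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                for c in range(col, n + 1):
                    m[r][c] -= factor * m[col][c]
    return [m[i][n] / m[i][i] for i in range(n)]


def fit_mlr(machine_scores, human_scores):
    """Return least-squares weights [bias, w1, ..., wk] via the normal equations."""
    x = [[1.0] + list(row) for row in machine_scores]  # prepend bias term
    k = len(x[0])
    xtx = [[sum(r[i] * r[j] for r in x) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(x, human_scores)) for i in range(k)]
    return solve_linear_system(xtx, xty)


def fuse(weights, scores):
    """Combine one speaker's machine scores into a single fused score."""
    return weights[0] + sum(w * s for w, s in zip(weights[1:], scores))


def pearson(u, v):
    """Pearson correlation coefficient between two equal-length sequences."""
    mu, mv = mean(u), mean(v)
    cov = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    var_u = sum((a - mu) ** 2 for a in u)
    var_v = sum((b - mv) ** 2 for b in v)
    return cov / (var_u * var_v) ** 0.5


# Toy data, invented for illustration only (not from the thesis):
# each row holds the scores of two hypothetical automatic scorers for one speaker.
machine_scores = [(60, 55), (70, 80), (85, 75), (90, 95), (50, 65), (75, 70)]
human_scores = [58, 76, 80, 93, 57, 73]

weights = fit_mlr(machine_scores, human_scores)
fused = [fuse(weights, row) for row in machine_scores]

print("fused vs. human:", pearson(fused, human_scores))
for j in range(2):
    single = [row[j] for row in machine_scores]
    print(f"scorer {j + 1} vs. human:", pearson(single, human_scores))
```

On the data used to fit the weights, the fused score can never correlate with the human score less well than any single input score, so a comparison like the one above is only meaningful on held-out speakers. The record does not say how the thesis split its test data.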

Files in this item
File name/size	Document type	Version type	Access type	License
CASIA_20052801462803 (846KB)			Not yet open	CC BY-NC-SA