English abstract | Automatic mispronunciation detection and diagnosis are key techniques in computer-assisted language learning and testing. Building on an in-depth analysis of the technical issues and existing work, and motivated by large-scale spoken English testing, this thesis systematically studies mispronunciation detection and diagnosis technology. Its contributions and innovations are summarized as follows: 1) Data resources are the basis of the study. To mine and exploit massive data, the thesis constructs several dedicated corpora, including a mispronunciation corpus and a stress mispronunciation corpus. To provide a common evaluation platform, it also develops a performance evaluation system. 2) For substitution mispronunciation detection, the thesis goes beyond the traditional HMM-based GOP approach: it treats the task as hypothesis testing and classification, introduces several classifiers into mispronunciation detection, and proposes a series of new detection methods, including a UBM-based GMM method (GMM-UBM), a GLDS-kernel-based SVM method (GLDS-SVM), and a TRAP-feature-based neural network method (TRAP-NN). For GLDS-SVM, the thesis proposes a new multi-model fusion strategy for model training, which makes full use of the samples and addresses the data imbalance problem. Introducing TRAP features improves the description of pronunciation quality, and TRAP-NN achieves the best performance among existing universal single mispronunciation detection systems, with EERs of 8.73%, 14.17% and 28.44% on the simulation set, intended set and natural set, respectively. 
3) For the detection of substitution, deletion and insertion mispronunciations, the thesis proposes the concept of a generalized pronunciation space, which brings the various mispronunciation cases into a unified framework. In addition, by summarizing the mispronunciation patterns found in a high-volume corpus, it applies word-dependent rules in the mispronunciation detection network to avoid the shortcomings of conventional universal rules. The rule-based method lends itself to the automatic generation of feedback, but its limitations are also obvious. The experimental results show, with the support of massive data from special districts, the ... |