Targeting large-scale spoken English testing, this thesis presents a systematic study of assessment, diagnosis, and feedback techniques for spoken language. Its contributions and innovations are summarized as follows: 1) For ``repetition'' detection in constrained-type tasks, the main difficulties are matching and the presence of noise. To address them, a second-pass recognition grammar built from a series of fine-grained re-matching models is presented, with the models fused at the frame level. After candidate reparanda and repairs are generated, repair filtering is performed at different segmental levels and under different order assumptions, i.e., a K-difference filter under the regular-order assumption and an N-gram filter under the random-order assumption. The results indicate that introducing re-matching and using small modeling units are the two keys to coping with speech containing miscues, and that repair filtering under the random-order assumption outperforms filtering under the regular-order assumption, which suggests that a reparandum is better regarded as a~``rough copy'' formed by pseudo-random occurrences of sub-words; 2) For ``repetition'' detection in open-ended tasks, the reference text is recovered under a Bayesian noisy-channel model from the recognition result of a sense garbage grammar. The results demonstrate its effectiveness: compared with baseline approaches, the proposed method improves performance and provides feedback using the reconstructed sense groups; 3) Moreover, for other disfluencies in speech, under the theme of feature exploration and classifier design, two further methods are proposed: detecting ``error pauses'' via prosodic features around pitch breaks, and detecting ``filled pauses'' via the invariant properties of formants. All of these can be combined in a concrete framework.
The results indicate that repetition remains the main factor affecting speech disfluency, and that the concrete framework better approximates human experts; 4) For fluency evaluation, generalized fluency is considered in contrast to traditional smoothness-based fluency: this study makes a pilot exploration of advanced skills in spoken English and applies the results to diagnosis. Correlation experiments suggest that the two representations of fluency complement each other; 5) For prosody evaluation, the cognitive significance of prosodic properties is systematically analyzed from different perspectives. The results suggest that adaptation to prosodic diversity and variance modeling of pitch, duration, and energy are the two keys to prosody evaluation. Moreover, rhythm and detailed prosodic unit are more i...