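英语口语超音段层次自动检错与评估技术的研究 (Research on Automatic Error Detection and Evaluation Techniques at the Supra-segmental Level of Spoken English)
Alternative title: Research on Automatic Supra-segmental Evaluation and Diagnosis for Spoken English
Author: 黄申 (Huang Shen)
Degree type: Doctor of Engineering
Advisor: 徐波 (Xu Bo)
Date: 2011-06-03
Degree-granting institution: Graduate University of Chinese Academy of Sciences (中国科学院研究生院)
Place of conferral: Institute of Automation, Chinese Academy of Sciences (中国科学院自动化研究所)
Major: Pattern Recognition and Intelligent Systems
Keywords: Speech Recognition; Computer Aided Language Learning; Supra-segmental Feature; Fluency; Prosody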
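Abstract:
1) For detecting "repetition and repair" errors in closed (constrained) question types, the two difficulties are error-tolerant alignment and noise filtering. A series of second-pass error-tolerant alignment models, built from different knowledge perspectives, is proposed, together with noise-reducing repair search-and-filter algorithms. Experiments show that refining the modeling units and building a garbage network are effective ways to solve the error-tolerant alignment problem, and that repair filtering under a random-order assumption outperforms filtering under a sequential-order assumption, handling word fragments, inversions and ungrammatical word order in repairs more effectively;
2) For open-ended question types, where no sense-group script is available for detecting repetition and repair, an optimal sense-group reconstruction algorithm based on a Bayesian noise recovery model is proposed. Compared with the existing baseline methods, it improves detection performance and also provides accurate sense-group script feedback;
3) By studying the supra-segmental characteristics of each error type, signal-level detection methods are proposed for two errors, "erroneous pauses" and "inserted fillers", and compared in depth with baseline methods. Under a unified data and metric framework, experiments show that repetition and repair is the factor with the greatest influence on perceived fluency. Using fuzzified fluency types and a unified detection framework brings machine and human detection results into closer agreement;
4) For fluency evaluation, a feature extraction and score-fitting method is proposed that combines fluency in the narrow sense (smoothness) with fluency in the broad sense, and detection results are fed back to guide the evaluation. Results show that the two kinds of fluency are complementary, and that a nonlinear score-fitting model built on their fusion improves the accuracy of machine evaluation so that it reaches or exceeds the level of human raters;
5) On the knowledge side of prosody evaluation, the perceptual factors that influence prosody, and their significance, are studied and analyzed in depth. Results show that adapting to the diversity of prosodic styles and modeling the variation of fundamental frequency, duration and energy are the keys to prosody evaluation. In addition, unlike the perception of musical melody, the perception of English prosody results from intonation and rhythm acting together, and fine-grained rhythm information is the more important of the two;
6) On the method side of prosody evaluation, research is carried out at three levels: supra-segmental, segmental and rule-based. At the supra-segmental level, with existing methods as the baseline system, a prosody model score based on prosody generation and its effect on naturalness is proposed. At the segmental level, a segmental prosody score is proposed. At the rule level, following the principle of query-by-humming recognition, a multi-style prosody template matching algorithm is proposed that models prosody as intonation and rhythm separately. Systems built from each of the three perspectives achieve some effect, and fusing them yields a practical prosody evaluation method that improves evaluation accuracy in the high-score range and reaches or exceeds the level of human raters;
7) For massive data, a lightly supervised learning algorithm for mining multiple prosodic styles, and a knowledge-set splitting scheme for prosody evaluation, are proposed in order to improve performance over multiple Co-Training iterations. The algorithm can semi-automatically extend annotations across a massive dataset, which makes mining massive data for prosody evaluation possible.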
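The contrast in item 1 between filtering under a sequential-order assumption and under a random-order assumption can be illustrated with a small sketch. This is not the thesis's algorithm: it assumes a word-level edit-distance check as a stand-in for an order-sensitive filter and a character n-gram overlap check as a stand-in for an order-free one. The function names, the threshold values and the example tokens are all hypothetical.

```python
from collections import Counter


def edit_distance(a, b):
    """Word-level Levenshtein distance between two token lists."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        curr = [i]
        for j, y in enumerate(b, start=1):
            cost = 0 if x == y else 1
            curr.append(min(prev[j] + 1,          # delete x
                            curr[j - 1] + 1,      # insert y
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def ordered_match(reparandum, repair, k=1):
    """Order-sensitive check (assumed): accept if at most k word edits apart."""
    return edit_distance(reparandum, repair) <= k


def char_ngrams(tokens, n=2):
    """Bag of character n-grams over all tokens, with '#' as a word boundary."""
    bag = Counter()
    for tok in tokens:
        padded = f"#{tok}#"
        for i in range(len(padded) - n + 1):
            bag[padded[i:i + n]] += 1
    return bag


def unordered_match(reparandum, repair, n=2, threshold=0.6):
    """Order-free check (assumed): fraction of the reparandum's n-grams
    that also occur in the repair, ignoring word order."""
    a, b = char_ngrams(reparandum, n), char_ngrams(repair, n)
    total = sum(a.values())
    shared = sum((a & b).values())  # multiset intersection
    return total > 0 and shared / total >= threshold


# Hypothetical example: the speaker starts with an inverted word order,
# then restarts with the intended phrase.
inverted = ["store", "to", "the"]
repair = ["to", "the", "store"]
print(ordered_match(inverted, repair))    # False: two word edits, k = 1
print(unordered_match(inverted, repair))  # True: identical n-gram bags

# Hypothetical example: a word fragment followed by the full word.
print(unordered_match(["sto"], ["store"]))  # True: 3 of 4 bigrams shared
```

In this toy setting the order-free check still pairs an inverted restart with its repair, while the order-sensitive check rejects it. That matches the direction of the reported finding, but only as an illustration on invented data.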
Alternative abstract: For large-scale spoken English testing, this thesis carries out systematic research on technology for the assessment and diagnosis of, and feedback on, spoken language. The contributions and innovations are summarized as follows:
1) For "repetition" detection in constrained question types, the major problems are matching and the presence of noise. To deal with them, a second-pass recognition grammar is presented, built from a series of fine-grained re-matching models that are fused at the frame level. After candidate reparanda and repairs are generated, repair filtering is performed at different segmental levels and under different order assumptions, i.e. a K-difference filter under regular order and an N-gram filter under random order. Results indicate that introducing re-matching and using small modeling units are the two keys to coping with speech that contains miscues, and that repair filtering under random order is superior to filtering under regular order, which suggests that the reparandum is better regarded, in detail, as a "rough copy" formed by pseudo-random occurrences of sub-words;
2) For "repetition" detection in open-ended question types, the reference text is recovered under a Bayesian noisy-channel model from the recognition result of a sense garbage grammar. Results demonstrate its effectiveness: compared with baseline approaches, the proposed method improves performance and provides feedback with the reconstructed sense groups;
3) For other disfluency errors in speech, under the theme of feature exploration and classification, detecting "error pauses" via prosodic features around pitch breaks and detecting "filled pauses" via the invariance of formants are also proposed, all of which can be combined in a unified framework. Results indicate that repetition remains the main factor affecting speech disfluency, and that the unified framework better approximates human experts;
4) For fluency evaluation, generalized fluency is considered alongside traditional smoothness-based fluency. This study makes a pilot exploration of advanced skills in spoken English and applies the results to diagnosis. Correlation experiments suggest that the two representations of fluency complement each other;
5) For the properties of prosody evaluation, cognitive significance is systematically analyzed from different views. Results suggest that adaptation to prosodic diversity and variance modeling of pitch, duration and energy are the two keys to prosody evaluation. Moreover, rhythm and detailed prosodic unit are more i...
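Call number: XWLW1605
Other identifier: 200818014628039
Language: Chinese
Document type: Dissertation
Item identifier: http://ir.ia.ac.cn/handle/173211/6388
Collection: Graduates_Doctoral dissertations
Recommended citation (GB/T 7714):
黄申. 英语口语超音段层次自动检错与评估技术的研究[D]. 中国科学院自动化研究所. 中国科学院研究生院,2011.
Files in this item
File name/size: CASIA_20081801462803 (3989KB)
Access type: Not yet open
License: CC BY-NC-SA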
Unless otherwise stated, all content in this system is protected by copyright, with all rights reserved.