Research on Hierarchical Analysis and Prediction of Stress in Chinese
Original title: 汉语层级重音分析与预测方法研究
Author: Li Ya (李雅)
Degree type: Doctor of Engineering
Advisor: Tao Jianhua (陶建华)
2012-05-31
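Degree-granting institution: Graduate University of the Chinese Academy of Sciences (中国科学院研究生院)
Place of conferral: Institute of Automation, Chinese Academy of Sciences (中国科学院自动化研究所)
Degree discipline: Pattern Recognition and Intelligent Systems
Keywords: Text-to-Speech; Speech Synthesis; Prosody; Stress; Prosody Prediction; Stress Generation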
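Abstract: Prosody is a suprasegmental feature of speech. It facilitates and supplements the expression of semantic, pragmatic, and other information, and therefore plays a very important role in spoken communication; it has naturally become an important part of speech and language science and of speech engineering. Previous prosody research has mostly focused on the division of rhythmic hierarchy, and systematic, comprehensive studies of Chinese stress are rare. Taking Mandarin stress as its subject, this dissertation describes the construction of a stress corpus and the hierarchical prosodic analysis and modeling of Chinese stress, then examines the text-level characteristics of stress and builds several text-based stress prediction models. Finally, using speech synthesis as an example application, it shows how stress research benefits speech engineering.

Overall, the contributions and innovations of this dissertation are:

1. A large-scale stress-annotated corpus was built, and the prosodic realization of Chinese word stress and sentence stress was analyzed in detail. The study finds that fundamental frequency (F0) has a strong influence on stress perception, and that different prosodic levels and tone-pattern combinations affect the perception of word stress; this perceptual difference shows up differently in duration and in F0. In continuous speech, the stress of disyllabic prosodic words is relatively unstable. Finally, regression analysis and decision-tree classification were used to detect stress automatically in continuous speech. This work helps bring stress research into speech recognition and also supports the rapid construction of stress-annotated corpora.

2. The text-level characteristics of stress and the relationship between stress and syntax were examined; the mapping from syntax to stress was summarized, and a syntax-to-stress mapping model was proposed. In addition, several stress prediction models were built from textual features using classification and regression trees and maximum entropy models. Furthermore, with the maximum entropy model as the baseline, a wrapper-style feature template selection method was designed, which improved the maximum entropy model's stress prediction performance.

3. Based on the characteristics of Chinese stress, this dissertation argues that the study of unstressed syllables should be strengthened, and accordingly proposes a novel hierarchical stress modeling and prediction method that characterizes stress at two levels: the sentence and the prosodic word. According to the role of each level, the sentence-stress level focuses on modeling stressed syllables, while the prosodic-word stress level focuses on modeling unstressed syllables. Hierarchical stress modeling covers both global and local prosodic features while ensuring that the model at each level has high precision and recall, so that fairly fine-grained stress-level annotations can be obtained reliably from arbitrary input text.

4. Using speech synthesis as an example, the application of Chinese stress in speech engineering is described. Stress generation was implemented in a unit selection system and in a hidden Markov model based statistical parametric speech synthesis system. Hierarchical stress generation was also implemented, following the hierarchical F0 modeling approach of the Fujisaki model. Objective evaluation and listening tests on these three kinds of synthetic speech show that incorporating stress clearly improves the expressiveness and naturalness of synthetic speech, with the gain in expressiveness being more pronounced.

Research on Chinese prosody has long been one of the key issues and bottlenecks in speech and language science and in speech engineering. The series of in-depth studies on Chinese stress described above is of some significance for deepening the understanding of Chinese stress and for improving the performance of speech recognition, speech synthesis, and spoken dialogue systems. Some of the methods used here are helpful not only for Chinese stress research but can also be extended to other areas of natural language processing, and are therefore of some general significance.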
Alternative abstract: Prosody is a suprasegmental feature of speech. It facilitates and supplements the expression of semantic and pragmatic information, and therefore plays an important role in spoken communication, which has made it a research focus of speech and language science and technology. Traditional prosody studies emphasize rhythm, whereas this dissertation focuses on Mandarin stress. First, we built a large-scale stress-annotated corpus and carried out a comprehensive analysis and modeling of Mandarin word stress and sentence stress. Second, we constructed several text-based stress prediction models and proposed a novel hierarchical approach to Mandarin stress modeling and prediction. Finally, the stress study was applied to Text-to-Speech as an example of how stress research benefits speech and language engineering.

In detail, the main contributions of this dissertation include:

1. A large-scale stress-annotated corpus was built and a detailed analysis of Mandarin stress was carried out. Six thousand utterances were annotated with word stress and sentence stress. Statistical analysis indicates that pitch is the primary cue in Mandarin stress perception in continuous speech. For word stress, perceptual differences across rhythm levels and tone patterns show clear regularities. Disyllabic stress patterns show no significant stability. This dissertation also uses pitch, duration, and their statistical parameters to detect stress automatically in continuous speech with Multiple Linear Regression Analysis and Decision Trees.

2. The prosody-syntax interface was investigated and a series of syntax-to-stress mapping rules were summarized. Classification and Regression Tree (CART) and Maximum Entropy (ME) models, which use only textual features, were also employed to predict word stress and sentence stress. Model optimization was conducted through feature selection within the ME framework. Experiments show that the optimized model outperforms the baseline with fewer features, which reduces training time, running time, and model size.

3. To strengthen the study of unstressed syllables in Mandarin, this dissertation proposes a novel hierarchical Mandarin stress modeling and prediction method. The top level emphasizes stressed syllables, while the bottom level focuses, for the first time, on unstressed syllables because of their importance to both the naturalness and the expressiveness of synthetic speech. Prediction experiments confirmed the modeling method could capture t...
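The abstract mentions wrapper-style feature selection around a Maximum Entropy model but gives no algorithmic detail. The following is a minimal sketch of one common wrapper strategy, greedy forward selection, and is not claimed to be the dissertation's actual procedure. The feature template names and the `toy_evaluate` scorer are hypothetical; in practice the scorer would train an ME model on the chosen templates and return a held-out score such as F1.

```python
from typing import Callable, FrozenSet, List, Sequence, Tuple


def wrapper_forward_selection(
    candidates: Sequence[str],
    evaluate: Callable[[FrozenSet[str]], float],
    min_gain: float = 1e-9,
) -> Tuple[List[str], float]:
    """Greedy forward wrapper selection.

    At each round, try adding every remaining template, re-score the
    model with `evaluate`, and keep the single best addition. Stop when
    no addition improves the score by more than `min_gain`.
    """
    selected: List[str] = []
    remaining = list(candidates)
    best_score = evaluate(frozenset())  # score of the empty feature set

    while remaining:
        scored = [(evaluate(frozenset(selected + [t])), t) for t in remaining]
        top_score, top_template = max(scored, key=lambda pair: pair[0])
        if top_score - best_score <= min_gain:
            break  # no candidate helps enough; stop searching
        selected.append(top_template)
        remaining.remove(top_template)
        best_score = top_score

    return selected, best_score


# Hypothetical per-template usefulness, used only by the toy scorer below.
TOY_GAIN = {
    "pos_tag": 0.15,
    "word_length": 0.10,
    "tone": 0.06,
    "position_in_phrase": 0.01,
    "random_id": 0.0,
}


def toy_evaluate(templates: FrozenSet[str]) -> float:
    """Stand-in for 'train an ME model and score it on held-out data'.

    Each template adds its toy gain; a small per-template penalty mimics
    the cost of carrying uninformative features.
    """
    return 0.5 + sum(TOY_GAIN[t] for t in templates) - 0.02 * len(templates)


if __name__ == "__main__":
    chosen, score = wrapper_forward_selection(list(TOY_GAIN), toy_evaluate)
    print(chosen, round(score, 2))
```

Because the wrapper treats the model as a black box, the same loop works with any scorer; the trade-off is that each round retrains the model once per remaining candidate.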
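Holding number: XWLW1736
Other identifier: 200918014628033
Language: Chinese
Document type: Dissertation
Item identifier: http://ir.ia.ac.cn/handle/173211/6464
Collection: Graduates_Doctoral Dissertations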
Recommended citation (GB/T 7714):
李雅. 汉语层级重音分析与预测方法研究[D]. 中国科学院自动化研究所. 中国科学院研究生院,2012.
Files in this item:
File name/size: CASIA_20091801462803 (1859KB)
Access type: Not currently open
License: CC BY-NC-SA