Speech-Driven Tongue Motion Synthesis Based on Medical Imaging
Zhang Dawei (张大伟)1; Yang Minghao (杨明浩)1,3; Tao Jianhua (陶建华)1,2,3
2017-10
Conference Name: The 14th National Conference on Man-Machine Speech Communication (NCMMSC 2017)
Conference Date: 2017-10-11~13
Conference Place: Lianyungang, China
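Abstract: Visualizing the motion of the articulators is important for research on human speech production mechanisms, language teaching, and pathological speech analysis. This paper proposes a text-independent method for speech-driven tongue motion synthesis. Building on medical imaging and an automatic tongue contour extraction method, it uses a combined deep neural network model to synthesize tongue motion in real time, and it compares and analyzes tongue contour denoising, acoustic feature selection, and mapping model structures. Experiments show that, with a limited number of noisy data samples, the proposed method effectively balances overfitting and underfitting and clearly improves accuracy over the baseline methods; for a few key points, its predictions are even better than the automatically extracted tongue contour edge points.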
Other Abstract: Articulatory motion visualization is important for research on the human pronunciation mechanism, language teaching, and pathological speech analysis. A text-independent, speech-driven tongue motion synthesis method is proposed. Based on medical image data and an automatic contour extraction method, tongue motion is synthesized using combined deep neural network models. In addition, tongue contour denoising, acoustic feature selection, and different mapping model structures are compared and analyzed. Experiments show that the proposed method effectively addresses overfitting and underfitting with limited noisy samples, and that the accuracy of the predicted tongue contours is clearly higher than that of the baseline methods and, at some key points, even higher than that of the extracted contours.
Keywords: tongue motion synthesis; speech-driven; medical imaging; combined deep neural network
Document Type: Conference paper
Identifier: http://ir.ia.ac.cn/handle/173211/19834
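Collection: National Laboratory of Pattern Recognition_Speech Interaction
Affiliation: 1. National Laboratory of Pattern Recognition, Institute of Automation, Chinese Academy of Sciences
2. School of Artificial Intelligence, University of Chinese Academy of Sciences
3. CAS Center for Excellence in Brain Science and Intelligence Technology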
First Author Affiliation: Chinese Acad Sci, Inst Automat, Natl Lab Pattern Recognit, Beijing 100190, Peoples R China
Recommended Citation (GB/T 7714):
Zhang Dawei, Yang Minghao, Tao Jianhua. Speech-Driven Tongue Motion Synthesis Based on Medical Imaging [C], 2017.
Files in This Item:
File Name/Size: NCMMSC2017_基于医学影像的语音 (497KB)
DocType: Conference paper
Access: Open access
License: CC BY-NC-SA
File name: NCMMSC2017_基于医学影像的语音驱动舌位运动合成 - final2.pdf
Format: Adobe PDF
Items in the repository are protected by copyright, with all rights reserved, unless otherwise indicated.