Realistic Visual Speech Synthesis Based on Hybrid Concatenation Method
Tao, Jianhua; Xin, Le; Yin, Panrong
Journal: IEEE TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING
2009-03-01
Volume: 17, Issue: 3, Pages: 469-477
Article type: Article
Abstract: This paper presents a realistic visual speech synthesis method based on hybrid concatenation. Unlike previous methods based on phoneme-level unit selection or hidden Markov models (HMMs), the hybrid concatenation method combines frame-level unit selection with a fused HMM, and is able to generate more expressive and stable facial animations. The fused HMM can explicitly model the loose synchronization of tightly coupled streams, with much better results than a normal HMM for audiovisual mapping. After the fused HMM is created, facial animation is generated via unit selection at the frame level, using the fused HMM output probabilities. To speed up unit selection on a large corpus, this paper also proposes a two-layer Viterbi search method in which only the subsets that have been selected in the first layer are further checked in the second layer. Using this idea, the system has been successfully integrated into real-time applications. Furthermore, the paper proposes a mapping method, based on Gaussian mixture models (GMMs), to generate emotional facial expressions from neutral facial expressions. Final experiments show that the method described can output synthesized facial parameters with high quality. Compared with other audiovisual mapping methods, our method has better performance with respect to expressiveness, stability, and system running speed.
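The two-layer search described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the cost values, the Euclidean concatenation cost, the `keep` shortlist size, and all function and parameter names below are assumptions made for illustration. The first layer keeps only the cheapest candidates per frame; the second layer runs a Viterbi search over those shortlists alone.

```python
from math import dist


def two_layer_viterbi(target_costs, units, keep=3, concat_weight=1.0):
    """Pick one corpus unit per frame, minimizing target + concatenation cost.

    target_costs[t][u]: hypothetical cost of using corpus unit u at frame t
        (for example, a negative log-probability from an audiovisual model).
    units[u]: visual parameter vector of corpus unit u.
    keep: number of candidates retained per frame by the first layer.
    Assumes at least one frame and at least one unit.
    """
    # Layer 1: prune each frame to its `keep` cheapest candidates.
    shortlists = [
        sorted(range(len(costs)), key=costs.__getitem__)[:keep]
        for costs in target_costs
    ]

    # Layer 2: Viterbi search restricted to the shortlisted candidates.
    best = {u: target_costs[0][u] for u in shortlists[0]}
    backpointers = []
    for t in range(1, len(target_costs)):
        new_best, pointers = {}, {}
        for u in shortlists[t]:
            # Cheapest way to reach unit u from any surviving predecessor.
            def reach(p, u=u):
                return best[p] + concat_weight * dist(units[p], units[u])

            prev = min(best, key=reach)
            new_best[u] = reach(prev) + target_costs[t][u]
            pointers[u] = prev
        best = new_best
        backpointers.append(pointers)

    # Trace back the cheapest path from the last frame to the first.
    last = min(best, key=best.get)
    path = [last]
    for pointers in reversed(backpointers):
        path.append(pointers[path[-1]])
    path.reverse()
    return path, best[last]


# Toy data: three one-dimensional corpus units and two target frames.
units = [(0.0,), (1.0,), (5.0,)]
target_costs = [[0.0, 2.0, 9.0], [9.0, 0.5, 0.0]]
print(two_layer_viterbi(target_costs, units, keep=2))
```

With `keep=1` the same toy data yields a different, more expensive path, which shows the trade-off: a smaller shortlist makes the second layer faster but can prune away the globally cheapest sequence.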
Keywords: Fused Hidden Markov Model (HMM) Inversion; Speech-driven Facial Animation; Unit Concatenation; Visual Speech Synthesis
WOS title words: Science & Technology ; Technology
Keywords [WOS]: HIDDEN MARKOV-MODELS ; ANIMATION ; CONVERSION ; FACE
Indexed by: SCI
Language: English
WOS research area: Acoustics ; Engineering
WOS category: Acoustics ; Engineering, Electrical & Electronic
WOS accession number: WOS:000263639400007
Times cited: 11 [WOS]
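Document type: Journal article
Identifier: http://ir.ia.ac.cn/handle/173211/40965
Collection: State Key Laboratory of Multimodal Artificial Intelligence Systems, Intelligent Interaction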
Recommended citation
GB/T 7714: Tao, Jianhua, Xin, Le, Yin, Panrong. Realistic Visual Speech Synthesis Based on Hybrid Concatenation Method[J]. IEEE TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2009, 17(3): 469-477.
APA: Tao, Jianhua, Xin, Le, & Yin, Panrong. (2009). Realistic Visual Speech Synthesis Based on Hybrid Concatenation Method. IEEE TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 17(3), 469-477.
MLA: Tao, Jianhua, et al. "Realistic Visual Speech Synthesis Based on Hybrid Concatenation Method." IEEE TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING 17.3 (2009): 469-477.