Deep Metric Learning for the Target Cost in Unit-Selection Speech Synthesizer
Ruibo Fu (1,2); Jianhua Tao (1,2,3); Yibin Zheng (1,2); Zhengqi Wen (1)
2018-09
Conference Name: INTERSPEECH 2018
Conference Date: 2018-09
Conference Place: Hyderabad, India
Abstract

This paper describes a unified Deep Metric Learning (DML) framework that predicts the target cost directly through supervised learning. Conventional methods for calculating the target cost involve two separate steps: feature extraction and standard distance measurement. The proposed DML framework aims to measure the similarity between candidate units and target units more reasonably and directly. First, a symmetrical DML framework is pre-trained to learn the metric between pairs of candidate units and target units. A relabeling procedure is added to correct the initially designed labels of the target cost. Second, the acoustic features of the target units are removed, which matches the conditions at runtime of the unit-selection synthesizer, and an asymmetrical DML is fine-tuned to learn the metric between candidate units and target units. Compared with the conventional methods, the proposed unified DML framework avoids the accumulation of errors across separate steps and improves the accuracy of labeling and predicting the target cost. The evaluation results demonstrate that the naturalness of synthetic speech is improved by adopting the DML framework to predict the target cost.
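
The contrast drawn in the abstract between a fixed distance and a learned metric can be illustrated with a minimal sketch. It is not the authors' implementation: the encoders are random and untrained, and the feature dimensions, the single tanh layer, the Euclidean distance in the embedding space, and the assumption that target units keep only linguistic features at runtime are all hypothetical choices made for illustration.

```python
import math
import random


def euclidean(a, b):
    """Standard distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def conventional_target_cost(cand_feats, target_feats):
    """Two-step baseline: features are extracted first, then compared
    with a fixed, hand-chosen distance."""
    return euclidean(cand_feats, target_feats)


def make_linear(in_dim, out_dim, rng):
    """Random (untrained) weights standing in for a learned encoder."""
    return [[rng.uniform(-1.0, 1.0) for _ in range(in_dim)]
            for _ in range(out_dim)]


def embed(weights, feats):
    """One tanh layer mapping input features into a shared metric space."""
    return [math.tanh(sum(w * x for w, x in zip(row, feats)))
            for row in weights]


def learned_target_cost(cand_enc, target_enc, cand_feats, target_feats):
    """Distance between embeddings. In a DML setup the encoders would be
    trained so that this distance matches the labelled target cost."""
    return euclidean(embed(cand_enc, cand_feats),
                     embed(target_enc, target_feats))


rng = random.Random(0)
LING_DIM, ACOUSTIC_DIM, EMB_DIM = 4, 3, 2  # hypothetical sizes

# Symmetrical setup: both sides see linguistic + acoustic features
# and share one encoder.
shared_enc = make_linear(LING_DIM + ACOUSTIC_DIM, EMB_DIM, rng)

# Asymmetrical setup: the target side has no acoustic features
# (assumed here to keep linguistic features only), so it needs its own encoder.
target_enc = make_linear(LING_DIM, EMB_DIM, rng)

candidate = [0.2, -0.1, 0.5, 0.0, 1.0, 0.3, -0.4]
target_full = [0.1, -0.2, 0.4, 0.1, 0.9, 0.2, -0.5]
target_ling = target_full[:LING_DIM]

baseline_cost = conventional_target_cost(candidate, target_full)
sym_cost = learned_target_cost(shared_enc, shared_enc, candidate, target_full)
asym_cost = learned_target_cost(shared_enc, target_enc, candidate, target_ling)
```

In this sketch the symmetrical cost of a unit against itself is exactly zero because both sides pass through the same encoder, whereas the asymmetrical variant has no such guarantee until its two encoders are trained.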

Keywords: speech synthesis; unit-selection; target cost; deep metric learning
Indexed By: EI
Language: English
Document Type: Conference paper
Identifier: http://ir.ia.ac.cn/handle/173211/39597
Collection: Intelligent Interaction
Corresponding Author: Ruibo Fu
Affiliation:
1. National Laboratory of Pattern Recognition, Institute of Automation, Chinese Academy of Sciences
2. CAS Center for Excellence in Brain Science and Intelligence Technology
3. School of Artificial Intelligence, University of Chinese Academy of Sciences
First Author Affiliation: Chinese Acad Sci, Inst Automat, Natl Lab Pattern Recognit, Beijing 100190, Peoples R China
Corresponding Author Affiliation: Chinese Acad Sci, Inst Automat, Natl Lab Pattern Recognit, Beijing 100190, Peoples R China
Recommended Citation
GB/T 7714
Ruibo Fu, Jianhua Tao, Yibin Zheng, et al. Deep Metric Learning for the Target Cost in Unit-Selection Speech Synthesizer[C], 2018.
Files in This Item:
File Name/Size: INTERSPEECH2018_1305 (323KB)
DocType: Conference paper
Access: Open access
License: CC BY-NC-SA