BLSTM-CRF Based End-to-End Prosodic Boundary Prediction with Context Sensitive Embeddings in A Text-to-Speech Front-End
Yibin Zheng (1,2); Jianhua Tao (1,2); Zhengqi Wen (1); Ya Li (1)
2018-09
Conference Name: Annual Conference of the International Speech Communication Association (Interspeech)
Conference Date: 2-6 September 2018
Conference Place: Hyderabad
Abstract

In this paper, we propose a language-independent, end-to-end architecture for prosodic boundary prediction based on BLSTM-CRF. The proposed architecture has three components: a word embedding layer, a BLSTM layer, and a CRF layer. The word embedding layer learns task-specific embeddings for prosodic boundary prediction. The BLSTM layer can efficiently use both past and future input features, while the CRF layer can efficiently use sentence-level information. We integrate these three components and train the whole model end-to-end. In addition, we investigate adding both character-level embeddings and context-sensitive embeddings to this model, and employ an attention mechanism to combine the alternative word-level embeddings. With the attention mechanism, the model can decide how much information to use from each level of embeddings. Objective evaluation results show that the proposed BLSTM-CRF architecture achieves the best results on both Mandarin and English datasets, with absolute improvements of 3.21% and 3.74% in F1 score, respectively, for intonational phrase prediction, compared to the previous state-of-the-art method (BLSTM). The subjective evaluation results further indicate the effectiveness of the proposed methods.
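
The two mechanisms named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the attention formula, so the element-wise sigmoid gate below is only one common way to blend two embeddings, and `gate_combine`, `viterbi_decode`, and all numeric values are hypothetical. The second function is standard Viterbi decoding for a linear-chain CRF, which shows how transition scores let the CRF layer choose labels jointly over the sentence rather than position by position.

```python
import math


def gate_combine(word_vec, char_vec, gate_logits):
    """Blend two same-sized embeddings with an element-wise sigmoid gate.

    gate_logits is a hypothetical stand-in for the output of a learned
    attention layer; z near 1 favours word_vec, z near 0 favours char_vec.
    """
    z = [1.0 / (1.0 + math.exp(-g)) for g in gate_logits]
    return [zi * w + (1.0 - zi) * c for zi, w, c in zip(z, word_vec, char_vec)]


def viterbi_decode(emissions, transitions):
    """Return the highest-scoring label sequence of a linear-chain CRF.

    emissions[t][j]   : score of label j at position t (e.g. from a BLSTM)
    transitions[i][j] : score of moving from label i to label j
    """
    n_labels = len(emissions[0])
    scores = list(emissions[0])   # best score of any path ending in each label
    backptrs = []                 # backptrs[t-1][j] = best previous label for j
    for emit in emissions[1:]:
        new_scores, ptrs = [], []
        for j in range(n_labels):
            best_i = max(range(n_labels),
                         key=lambda i: scores[i] + transitions[i][j])
            ptrs.append(best_i)
            new_scores.append(scores[best_i] + transitions[best_i][j] + emit[j])
        scores = new_scores
        backptrs.append(ptrs)
    # Trace back from the best final label to recover the full path.
    last = max(range(n_labels), key=lambda j: scores[j])
    path = [last]
    for ptrs in reversed(backptrs):
        path.append(ptrs[path[-1]])
    path.reverse()
    return path, scores[last]


# Toy example with two labels: 0 = no boundary, 1 = boundary.
emissions = [[2.0, 0.0], [0.5, 1.0], [0.0, 2.0]]
transitions = [[0.0, 0.0], [0.0, -2.0]]   # penalise two boundaries in a row
path, score = viterbi_decode(emissions, transitions)
blended = gate_combine([1.0, 1.0], [0.0, 0.0], [0.0, 0.0])
```

In this toy input, picking the best label at each position independently would give `[0, 1, 1]`, while the decoder returns `[0, 0, 1]` because the transition penalty makes two consecutive boundaries unattractive. A gate logit of 0 gives equal weight to both embeddings.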

Keyword: Prosodic Boundary Prediction; BLSTM-CRF; Attention; Context Sensitive Embeddings; End-to-End
Indexed By: EI
Document Type: Conference paper
Identifier: http://ir.ia.ac.cn/handle/173211/23582
Collection: National Laboratory of Pattern Recognition_Speech Interaction
Affiliation: 1. National Laboratory of Pattern Recognition, Institute of Automation, Chinese Academy of Sciences
2. School of Computer and Control Engineering, University of Chinese Academy of Sciences
First Author Affiliation: Institute of Automation, Chinese Academy of Sciences
Recommended Citation
GB/T 7714
Yibin Zheng, Jianhua Tao, Zhengqi Wen, et al. BLSTM-CRF Based End-to-End Prosodic Boundary Prediction with Context Sensitive Embeddings in A Text-to-Speech Front-End[C], 2018.
Files in This Item:
File Name/Size: BLSTM-CRF Based End- (642KB)
DocType: Conference paper
Access: Open access
License: CC BY-NC-SA
