Knowledge Commons of Institute of Automation, CAS
Pitch-Scaled Analysis based Residual Reconstruction for Speech Analysis and Synthesis | |
Wen, Zhengqi; Kawahara, Hideki; Tao, Jianhua; Zhengqi Wen | |
2012 | |
Conference Name | 13th Annual Conference of the International Speech Communication Association 2012, INTERSPEECH 2012 |
Proceedings Title | Annual Conference of the International Speech Communication Association (INTERSPEECH) |
Pages | 374-377 |
Conference Date | 2012 |
Conference Location | United States |
Abstract | A typical problem in LPC-like vocoders is a buzzing sound, caused mainly by the simple pulse-train or noise excitation model. One way to reduce it is to reconstruct the residual obtained from inverse filtering. This paper therefore proposes a new parametric representation of speech based on pitch-scaled analysis. Pitch-scaled analysis is used to extract the periodic spectrum of the residual with half pitch period length. These periodic spectra are then de-correlated by principal component analysis (PCA) to reduce their dimension. The aperiodic measure is defined as the harmonic-to-noise ratio in the frequency domain, where the voicing cut-off frequency (VCO) is used to control the smoothness of the aperiodicity. The periodic spectrum and the aperiodic measure, together with F0, serve as the excitation parameters of the proposed LPC vocoder. Experimental results show that the proposed vocoder achieves a mean opinion score (MOS) of 4.1 for a female voice before dimensionality reduction and retains its high quality after parameter compression. |
Keywords | Speech Parametric Representation; Pitch-scaled Analysis; Voicing Cut-off Frequency; Principal Component Analysis |
Indexed By | EI |
Document Type | Conference paper |
Item Identifier | http://ir.ia.ac.cn/handle/173211/41278 |
Collection | State Key Laboratory of Multimodal Artificial Intelligence Systems_Intelligent Interaction |
Corresponding Author | Zhengqi Wen |
Recommended Citation (GB/T 7714) | Wen, Zhengqi, Kawahara, Hideki, Tao, Jianhua, et al. Pitch-Scaled Analysis based Residual Reconstruction for Speech Analysis and Synthesis[C], 2012: 374-377. |
Files in This Item | There are no files associated with this item. |
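The abstract's starting point, the residual obtained from inverse filtering, can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes plain autocorrelation-method LPC with the Levinson-Durbin recursion, and the function names (`autocorrelation`, `levinson_durbin`, `inverse_filter`) and the synthetic test frame are hypothetical choices made for illustration. Pitch-scaled analysis, PCA compression, and the aperiodic measure are not covered.

```python
def autocorrelation(frame, max_lag):
    """Biased autocorrelation r[0..max_lag] of one analysis frame."""
    n = len(frame)
    return [sum(frame[i] * frame[i + lag] for i in range(n - lag))
            for lag in range(max_lag + 1)]


def levinson_durbin(r, order):
    """Solve the LPC normal equations.

    Returns (a, err) where a[0] == 1 and A(z) = sum_k a[k] z^-k is the
    inverse (whitening) filter, and err is the final prediction error energy.
    """
    a = [1.0] + [0.0] * order
    err = r[0]
    for i in range(1, order + 1):
        acc = r[i] + sum(a[j] * r[i - j] for j in range(1, i))
        k = -acc / err                      # reflection coefficient
        new_a = a[:]
        for j in range(1, i):
            new_a[j] = a[j] + k * a[i - j]
        new_a[i] = k
        a = new_a
        err *= (1.0 - k * k)
    return a, err


def inverse_filter(frame, a):
    """Residual e[n] = sum_k a[k] * x[n-k]; samples before the frame are zero."""
    return [sum(a[k] * frame[n - k] for k in range(len(a)) if n - k >= 0)
            for n in range(len(frame))]


# Hypothetical test signal: a pulse train (period 40 samples) passed through
# a stable 2-pole filter, standing in for a voiced speech frame.
excitation = [1.0 if n % 40 == 0 else 0.0 for n in range(400)]
frame = []
for n, e in enumerate(excitation):
    y1 = frame[n - 1] if n >= 1 else 0.0
    y2 = frame[n - 2] if n >= 2 else 0.0
    frame.append(e + 1.3 * y1 - 0.6 * y2)

r = autocorrelation(frame, 2)
a, err = levinson_durbin(r, 2)
residual = inverse_filter(frame, a)
print("LPC coefficients:", a)
```

In this toy case the residual is close to the original pulse train, which is the signal a simple pulse excitation model would substitute; the paper's proposal concerns representing the residual of real speech more faithfully than that.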
Unless otherwise stated, all content in this system is protected by copyright, with all rights reserved.