CASIA OpenIR

Browse/search results: 10 items in total, showing items 1-10

Adversarial Multi-Task Learning for Mandarin Prosodic Boundary Prediction With Multi-Modal Embeddings (Journal article)
IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2023, Volume: 31, Pages: 2963-2973
Authors:  Yi, Jiangyan;  Tao, Jianhua;  Fu, Ruibo;  Wang, Tao;  Zhang, Chu Yuan;  Wang, Chenglong
Views/Downloads: 52/0  |  Submitted: 2023/11/17
Adversarial training  multi-task learning  prosodic boundaries  speech synthesis  multi-modal embeddings  
NeuralDPS: Neural Deterministic Plus Stochastic Model With Multiband Excitation for Noise-Controllable Waveform Generation (Journal article)
IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2022, Volume: 30, Pages: 865-878
Authors:  Wang, Tao;  Fu, Ruibo;  Yi, Jiangyan;  Tao, Jianhua;  Wen, Zhengqi
Views/Downloads: 262/0  |  Submitted: 2022/06/06
Vocoders  Stochastic processes  Neural networks  Speech processing  Signal to noise ratio  Acoustics  Speech enhancement  Vocoder  speech synthesis  deterministic plus stochastic  multiband excitation  noise control  
Unconstrained end-to-end text reading with feature rectification (Journal article)
PATTERN RECOGNITION LETTERS, 2021, Volume: 149, Pages: 1-8
Authors:  Du, Chen;  Wang, Yanna;  Wang, Chunheng;  Xiao, Baihua;  Shi, Cunzhao
Adobe PDF (1133 KB)  |  Views/Downloads: 300/63  |  Submitted: 2021/11/02
Text recognition  Text detection  Position-sensitive network  Features incompatibility  End-to-end  
Integrating Knowledge Into End-to-End Speech Recognition From External Text-Only Data (Journal article)
IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2021, Volume: 29, Pages: 1340-1351
Authors:  Bai, Ye;  Yi, Jiangyan;  Tao, Jianhua;  Wen, Zhengqi;  Tian, Zhengkun;  Zhang, Shuai
Views/Downloads: 169/0  |  Submitted: 2021/06/07
End-to-End  language modeling  speech recognition  teacher-student learning  transfer learning  
Gated Recurrent Fusion With Joint Training Framework for Robust End-to-End Speech Recognition (Journal article)
IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2021, Volume: 29, Pages: 198-209
Authors:  Fan, Cunhang;  Yi, Jiangyan;  Tao, Jianhua;  Tian, Zhengkun;  Liu, Bin;  Wen, Zhengqi
Adobe PDF (2534 KB)  |  Views/Downloads: 396/50  |  Submitted: 2021/03/08
Speech enhancement  Speech recognition  Training  Noise measurement  Logic gates  Acoustic distortion  Task analysis  Gated recurrent fusion  robust end-to-end speech recognition  speech distortion  speech enhancement  speech transformer  
Adversarial learning based attentional scene text recognizer (Journal article)
PATTERN RECOGNITION LETTERS, 2020, Volume: 138, Issue: 1, Pages: 217-222
Authors:  Zhao, Jinyuan;  Wang, Yanna;  Xiao, Baihua;  Shi, Cunzhao;  Jiang, Jingzhong;  Wang, Chunheng
Adobe PDF (1152 KB)  |  Views/Downloads: 354/84  |  Submitted: 2021/01/07
Scene text recognition  Generative adversarial network  Image rectification  
DetectGAN: GAN-based text detector for camera-captured document images (Journal article)
INTERNATIONAL JOURNAL ON DOCUMENT ANALYSIS AND RECOGNITION, 2020, Volume: 23, Issue: 4, Pages: 267-277
Authors:  Zhao, Jinyuan;  Wang, Yanna;  Xiao, Baihua;  Shi, Cunzhao;  Jia, Fuxi;  Wang, Chunheng
Adobe PDF (3817 KB)  |  Views/Downloads: 316/55  |  Submitted: 2020/09/21
Text detection  Camera-captured document images  Multi-scale context features  Generative adversarial networks  
End-to-End Post-Filter for Speech Separation With Deep Attention Fusion Features (Journal article)
IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2020, Volume: 28, Issue: 28, Pages: 1303-1314
Authors:  Fan, Cunhang;  Tao, Jianhua;  Liu, Bin;  Yi, Jiangyan;  Wen, Zhengqi;  Liu, Xuefei
Adobe PDF (1344 KB)  |  Views/Downloads: 301/65  |  Submitted: 2020/06/22
Feature extraction  Training  Interference  Speech enhancement  Clustering algorithms  Spectrogram  Speech separation  end-to-end post-filter  deep attention fusion features  deep clustering  permutation invariant training  
Selective feature connection mechanism: Concatenating multi-layer CNN features with a feature selector (Journal article)
PATTERN RECOGNITION LETTERS, 2020, Volume: 129, Pages: 108-114
Authors:  Du, Chen;  Wang, Chunheng;  Wang, Yanna;  Shi, Cunzhao;  Xiao, Baihua
Adobe PDF (2583 KB)  |  Views/Downloads: 352/51  |  Submitted: 2020/03/30
Feature combination  Network architecture  Selective feature connection mechanism  Convolutional neural network  
Forward-Backward Decoding Sequence for Regularizing End-to-End TTS (Journal article)
IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2019, Volume: 27, Issue: 12, Pages: 2067-2079
Authors:  Zheng, Yibin;  Tao, Jianhua;  Wen, Zhengqi;  Yi, Jiangyan
Views/Downloads: 350/0  |  Submitted: 2020/03/30
Decoding  Training  Speech processing  Linguistics  Acoustics  Speech recognition  Forward-backward  regularization  encoder-decoder with attention  end-to-end  joint-training  TTS