Multilingual language identification (LID) is the task of identifying the language spoken in a given speech segment; it plays an increasingly important role in speech information services, multilingual speech translation, and security surveillance. This paper presents recent progress in multilingual LID research, covering acoustic modeling, language modeling, and system combination. First, we build a baseline PPRLM system and study the effects of language model smoothing, speech channel, and speaking style; the baseline achieves a recognition accuracy of 77.81%. Second, we incorporate NN-HMM based acoustic modeling into LID, which yields an improvement of about 10%; we further study several clustering algorithms and propose an algorithm for building a multilingual acoustic model, which achieves accuracy comparable to that of the PPRLM system. Combining this model with the PPRLM system improves performance by about 2%. Third, we propose binary decision tree language modeling, together with a random forest based binary decision tree language model, for PPRLM, which yields an improvement of about 6%; we then propose lattice-based SVM language modeling, which yields an improvement of about 8%. Finally, we integrate the acoustic language identification algorithms with an LDA-Gaussian based system combination algorithm. After combination, our system achieves a recognition accuracy of 98.75% on the NIST 2003 LRE 30-second data, a final performance comparable to that of recent LID systems worldwide.