This paper studies and resolves three key issues encountered when applying Chinese speech recognition systems on intelligent terminal devices: how to reduce the consumption of computing and storage resources, how to improve the robustness of the recognition system, and how to handle the modeling and search problems posed by Chinese-accented English speech and by mixed Chinese-English bilingual speech.

To reduce the computing and storage resources consumed by the speech recognition system:
1. For acoustic model parameter sharing, we proposed the TM-SDCHMM model, based on continuous probability distribution functions, and the SDC-DHMM model, based on discrete probability distribution functions. These models reduce model complexity with little or no loss of accuracy.
2. By simplifying the acoustic score computation and applying high-precision path pruning based on an online confidence measure, we reduced the size of the search space and improved the efficiency of the decoder.
3. For fixed-point processors, we built a speech recognition system based on fixed-point data types and computation, with model parameters pre-computed.

To improve the robustness of the speech recognition system:
4. In the signal space, we proposed an integrated, signal-processing-oriented speech preprocessing method suited to the complex environments of embedded speech communication applications. It includes abnormal signal detection and filtering, TMWF-based speech enhancement, and a voice activity detection algorithm based on subspace energy and edge detection filters.
5. In the feature space and the acoustic model space, we studied feature normalization, feature smoothing, and multi-condition training of the acoustic model.
6. At the application level, we studied a multi-candidate mechanism, confidence measures based on posterior probability and on phone confusion, background noise suppression based on adaptive gain control, and out-of-vocabulary (OOV) rejection based on guide phrases. These improved the robustness of the speech recognition system in practical application environments.

To handle English spoken by Chinese speakers and mixed Chinese-English bilingual speech:
7. Based on an analysis of a Chinese-accented English corpus, we proposed extending the set of English acoustic modeling units so that Chinese-accented English can also reach a high recognition rate.
8. Based on an analysis of the acoustic model mismatch problem in recognizing mixed bilingual speech, we proposed manual adjustment and a mixed-model method to balance model precision across the two languages.

The research results have been successfully applied in voice dialing software running on a variety of embedded devices, embedded operating systems, and embedded microprocessors.
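
To make the idea of a posterior-probability-based confidence measure over multiple candidates (item 6) more concrete, the following is a minimal sketch, not the method used in this work. It assumes the decoder returns log scores for an N-best list and normalizes them with a softmax to approximate the posterior of each candidate. The function names `nbest_posteriors` and `accept_top`, the `scale` factor, and the threshold of 0.6 are all hypothetical choices made for illustration.

```python
import math


def nbest_posteriors(log_scores, scale=1.0):
    """Approximate candidate posteriors by a softmax over N-best log scores.

    `scale` is an assumed score-scaling factor; it is not taken from the paper.
    """
    if not log_scores:
        raise ValueError("need at least one hypothesis")
    scaled = [scale * s for s in log_scores]
    top = max(scaled)  # subtract the maximum for numerical stability
    exps = [math.exp(s - top) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def accept_top(log_scores, threshold=0.6, scale=1.0):
    """Return (best index, its confidence, accept flag).

    The best candidate is accepted only if its approximate posterior
    reaches `threshold` (an illustrative value).
    """
    post = nbest_posteriors(log_scores, scale)
    best = max(range(len(post)), key=post.__getitem__)
    return best, post[best], post[best] >= threshold


if __name__ == "__main__":
    # One clearly dominant candidate: high confidence.
    print(accept_top([-10.0, -14.0, -15.0]))
    # Three nearly tied candidates: low confidence, so the result is rejected.
    print(accept_top([-10.0, -10.1, -10.2]))
```

When the candidates are nearly tied, the confidence of the top one is low, and an application could then reject the result or present several candidates to the user instead of acting on a single uncertain hypothesis.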