This work gives a comprehensive and systematic review of the history and current state of speech recognition research. We examine acoustic features, phonetic features, phonological features, and auditory perception in light of the related theory and disciplines. We then summarize and analyze the current theoretical and technical focal points of speech recognition and discuss the challenges the field faces now and is likely to face for a long time to come.

During my Ph.D. studies, my research covered several important subjects in speech recognition: auditory models, speech feature analysis and selection, acoustic-phonetic modeling, paradigms for HMM training and the corresponding algorithms, and new approaches to speech recognition. To address the problems identified in these subjects, we present original approaches that are effective in improving the recognition rate, together with theoretical work on each subject.

On acoustic-phonetic modeling, a key link in speech recognition, we:

1. Systematically analyze and summarize two kinds of approaches, acoustic-phonetic modeling based on the structure of unit HMMs and modeling based on phonetic units, and then propose new terms such as the acoustic-phonetic symbol system to break out of the conventional framework. A specific conception built on this system illustrates why the proposed acoustic-phonetic symbol system is important and necessary in speech recognition. The analysis further shows that the system makes it possible to integrate organically the symbol-based transforms among the intermediate sets of the system, acoustic events, and linguistic units, and thereby to further improve the accuracy of acoustic-phonetic modeling.

2. Point out that in conventional acoustic-phonetic modeling approaches a word template is obtained by simply cascading unit templates. In the Viterbi decoding process, consonant units contribute less to the overall matching score than vowel units because of their short duration. In transition segments with strong co-articulation, the phonetic features of consonants and vowels extend and overlap. To address these problems, we propose a new model, the unit-template-coupled model. In a speech transition segment, every observation is generated by coupled unit templates. The model lets all unit templates share the transition segment, so the duration of consonants is relatively lengthened. The model also describes the co-articulation effect more accurately. No major revision of the Baum-Welch algorithm is required to estimate the model parameters. Experimental results show that the error rate decreases markedly in coupled-model-based speech recognition.

On HMM training and parameter estimation, a core problem in speech recognition, we:

1. Analyze: because the speech materials used for training differ in duration, the Baum auxiliary function results