English abstract | The direction of arrival (DOA) of a sound source is essential to many intelligent speech signal processing systems. For instance, a video camera can automatically steer toward the active speaker when the speaker's direction is known; a beamformer, which is a spatial filter, can be designed to enhance the signal from the speech direction while suppressing signals from other directions; and an intelligent vehicle can determine who issued a speech command from the estimated speech DOA, or perceive its surroundings by exploiting the DOA information of sound sources outside the vehicle. A microphone array, which is composed of a set of microphones arranged in a specific geometry, is usually used for DOA estimation. However, noise, which appears almost everywhere in the real world, poses great challenges to high-performance DOA estimation. In this thesis, building on a comprehensive investigation of state-of-the-art DOA estimation methods and an analysis of the properties of noise in real-world conditions, we study the noise-robust DOA estimation problem in different transform domains. We propose a series of algorithms that exploit the advantages of representing the speech signal in different transform domains to improve robustness in noisy environments. Moreover, with respect to the application of DOA estimation to intelligent vehicles, we have designed and built a prototype microphone-array DOA estimation system and tested its DOA estimation performance on real recordings. The main contributions and novelties of this thesis include: (1) For nondirectional noise conditions with low signal-to-noise ratio (SNR), we propose a DOA estimation method based on sub-band weighting in the auditory spectrum domain. As the target speech and the noise have different frequency distributions, different sub-bands of speech are not equally affected by the noise. 
Assuming that the noise has a flatter energy distribution across sub-bands than the speech, the sub-bands with high energy can be considered to contain more speech components. Therefore, performing DOA estimation in each sub-band and emphasizing the estimates from the speech-dominated bands can be expected to improve the robustness of the algorithm against noise. Experiments in different nondirectional noise environments show that the proposed method achieves better performance than the conventional method... |
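
The sub-band weighting idea in contribution (1) can be illustrated with a minimal sketch. It is not the thesis's algorithm: the abstract does not give the auditory filterbank, the per-band estimator, or the weighting rule. The sketch assumes two microphones, uniform FFT bands in place of auditory sub-bands, a PHAT-normalized cross-correlation as the per-band estimator, and band energy as the weight. The names `subband_weighted_delay`, `delay_to_angle`, `n_bands`, `max_lag`, and `mic_distance` are hypothetical.

```python
import numpy as np


def subband_weighted_delay(x1, x2, n_bands=8, max_lag=16):
    """Estimate the delay (in samples) of x2 relative to x1.

    Assumption: uniform FFT bands stand in for auditory sub-bands, and each
    band's correlation is weighted by its energy, so that high-energy
    (presumably speech-dominated) bands count more in the final decision.
    """
    n = len(x1)
    X1 = np.fft.rfft(x1)
    X2 = np.fft.rfft(x2)

    # PHAT-normalized cross-spectrum: keep only the phase difference per bin.
    cross = X2 * np.conj(X1)
    phat = cross / (np.abs(cross) + 1e-12)

    # Candidate lags and the matching phase ramps exp(+j*2*pi*k*lag/n).
    lags = np.arange(-max_lag, max_lag + 1)
    bins = np.arange(len(X1))
    steer = np.exp(2j * np.pi * np.outer(lags, bins) / n)

    # Uniform band edges, skipping the DC bin.
    edges = np.linspace(1, len(X1), n_bands + 1).astype(int)

    combined = np.zeros(len(lags))
    for lo, hi in zip(edges[:-1], edges[1:]):
        # Per-band correlation over the candidate lags, averaged over bins.
        band_corr = np.real(steer[:, lo:hi] @ phat[lo:hi]) / (hi - lo)
        # Band weight: total energy of both channels in this band.
        weight = np.sum(np.abs(X1[lo:hi]) ** 2 + np.abs(X2[lo:hi]) ** 2)
        combined += weight * band_corr

    return int(lags[np.argmax(combined)])


def delay_to_angle(delay_samples, fs, mic_distance, c=343.0):
    """Convert a delay to a far-field arrival angle in degrees.

    Assumes a two-microphone array, a plane wave, and speed of sound c (m/s).
    """
    sin_theta = c * delay_samples / fs / mic_distance
    return float(np.degrees(np.arcsin(np.clip(sin_theta, -1.0, 1.0))))
```

Calling `subband_weighted_delay(x1, x2)` on two equal-length channels returns an integer delay, and `delay_to_angle` maps it to an angle for a given sampling rate and microphone spacing. A closer match to the abstract would replace the uniform bands with an auditory filterbank.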