With the development of computers, the Internet, and multimedia information technology, multimedia information retrieval has become a pressing requirement of web applications. Focusing on audio/video (AV) stream information processing, this dissertation presents research on content-based audio classification and retrieval. Its main contributions are as follows. First, to address AV stream segmentation and classification, a background-sound-based audio stream segmentation algorithm is proposed to partially avoid the detection errors caused by the complexity of audio content. Because the histogram intersection method has a low computational load, the segmentation algorithm is faster and produces fewer errors than traditional algorithms. A hierarchical classification algorithm based on a modified Gaussian model (MGM) is then employed for audio classification. Compared with the traditional Gaussian model (GM), MGM addresses the drawback that all features carry the same weight. Second, a histogram audio search algorithm based on multiple sub-band energy features is presented for detecting and locating a target audio clip in a continuous AV stream. The algorithm is currently deployed in a TV/radio program monitoring system. Third, an improved audio search method based on multiple critical-band modules is introduced to enhance the robustness of the audio search algorithm in complex environments. The method improves performance by using an audio distortion elimination algorithm and a second search algorithm. Fourth, a query-by-humming (QBH) music retrieval method based on the pitch envelope is studied. The pitch envelope is extracted using spectrum autocorrelation, and similarity is computed by dynamic time warping (DTW) with a constrained search path. Experimental results suggest that the algorithm improves the robustness of QBH to some extent.
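As an illustration of why the histogram intersection method mentioned above is computationally cheap, the following is a minimal sketch of the standard histogram intersection similarity between two feature histograms. It is not the dissertation's implementation: the function names `normalize` and `histogram_intersection`, the normalization to unit mass, and the use of plain Python lists are all assumptions made for this example.

```python
from typing import List, Sequence


def normalize(hist: Sequence[float]) -> List[float]:
    """Scale a histogram so that its bins sum to 1 (assumed preprocessing step)."""
    total = float(sum(hist))
    if total <= 0:
        raise ValueError("histogram must have positive total mass")
    return [v / total for v in hist]


def histogram_intersection(h1: Sequence[float], h2: Sequence[float]) -> float:
    """Return the histogram intersection of two equal-length histograms.

    Both inputs are normalized first, so the result lies in [0, 1]:
    1 means identical distributions, 0 means no overlapping bins.
    The cost is one pass over the bins (a minimum and an addition per bin).
    """
    if len(h1) != len(h2):
        raise ValueError("histograms must have the same number of bins")
    p = normalize(h1)
    q = normalize(h2)
    return sum(min(a, b) for a, b in zip(p, q))
```

In a segmentation or search setting, such a score would typically be computed between histograms of adjacent or sliding analysis windows and compared against a threshold; the choice of features, window length, and threshold is left open here because the abstract does not specify them.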