English Abstract | After years of development, the internet has made life more convenient for people at all levels of society. Internet services such as e-mail, instant messaging, video conferencing, blogs, and online shopping have become a way of life for more and more people. The internet brings opportunity, development, and progress, but it also enables the rampant spread of harmful information involving pornography, gambling, drugs, violence, and terrorism. Such information harms internet users, especially minors. Filtering harmful information has therefore become an urgent problem. This thesis focuses on four problems: image spam detection, human skin tone detection, pornographic image detection, and detection of pornographic films and streaming media. The main contributions of this thesis are summarized as follows: 1) An algorithm for filtering image spam using global invariant features is proposed. The Fourier-Mellin transform is invariant to translation, scaling, and rotation, and is effective against most image spam variants. First, we extract the Fourier-Mellin invariant features from the input images. The resulting Fourier-Mellin feature matrix is then flattened into a 1D vector by row concatenation. Principal component analysis (PCA) is employed to map the high-dimensional feature space into a low-dimensional one. Finally, a one-class classifier, the support vector data description (SVDD), is trained to model the boundary of the image spam class in the feature space and is used to classify new e-mail samples. 2) An algorithm for filtering image spam using local invariant features is proposed. The MSER and SURF detectors are used to find interest points in each image, and SIFT, shape context, and SURF descriptors are used to compose the feature sets. 
Then, a vocabulary-guided pyramid match kernel is employed to measure the similarity of two images and serves as the kernel function of the SVDD. Finally, the one-class SVDD classifier is trained to model the boundary of the image spam class in the feature space and is used to classify new e-mail samples. 3) A patch-based skin color detection algorithm is proposed. To balance the speed and accuracy of skin detection, we propose two methods for generating small patches. One is a regular patch generation method based on non-overlapping sliding rectangular windows; the other is an irregular patch generation method based on com... |
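
The feature-extraction step of contribution 1 can be illustrated with a minimal sketch. It assumes one common discrete approximation of the Fourier-Mellin transform (DFT magnitude, log-polar resampling, second DFT magnitude), since the abstract does not give the thesis's exact formulation. The function names, the 8x8 log-polar grid, and the nearest-neighbour sampling are all hypothetical choices made for illustration, and the PCA and SVDD stages are left out. A naive DFT is used only to keep the sketch dependency-free.

```python
import cmath
import math


def dft2_magnitude(grid):
    """Magnitude of the naive 2D DFT of a list-of-lists grid (O(h^2 * w^2))."""
    h, w = len(grid), len(grid[0])
    out = [[0.0] * w for _ in range(h)]
    for u in range(h):
        for v in range(w):
            acc = 0j
            for y in range(h):
                for x in range(w):
                    phase = -2j * math.pi * (u * y / h + v * x / w)
                    acc += grid[y][x] * cmath.exp(phase)
            out[u][v] = abs(acc)  # dropping the phase removes translation
    return out


def log_polar(mag, n_r=8, n_theta=8):
    """Resample a DFT magnitude on a log-polar grid (nearest neighbour).

    Grid size and sampling scheme are illustrative assumptions.
    """
    h, w = len(mag), len(mag[0])
    r_max = min(h, w) / 2
    out = []
    for i in range(n_r):
        # Radii evenly spaced in log(r), running from 1 to r_max.
        r = r_max ** (i / (n_r - 1))
        row = []
        for j in range(n_theta):
            theta = 2 * math.pi * j / n_theta
            fy = round(r * math.sin(theta))
            fx = round(r * math.cos(theta))
            # Negative frequencies wrap around via the modulo.
            row.append(mag[fy % h][fx % w])
        out.append(row)
    return out


def fourier_mellin_vector(img):
    """Feature matrix flattened into a 1D vector by row concatenation."""
    spectrum = dft2_magnitude(img)
    # In log-polar coordinates, scaling and rotation become (approximate)
    # shifts, which the second magnitude spectrum discards.
    fm = dft2_magnitude(log_polar(spectrum))
    return [value for row in fm for value in row]
```

Calling `fourier_mellin_vector` on an image and on a circularly shifted copy of it should give the same vector up to floating-point error, which is the translation-invariance property the abstract relies on. Invariance to scaling and rotation is only approximate on a grid this coarse.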
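
The regular patch generation named in contribution 3 (non-overlapping sliding rectangular windows) can be sketched as follows. The function name, the `(top, left, bottom, right)` tuple layout, and the clipping of edge patches to the image bounds are assumptions made for illustration. The abstract does not say how the thesis handles image borders or how each patch is then classified.

```python
def regular_patches(height, width, patch_h, patch_w):
    """Yield (top, left, bottom, right) for non-overlapping rectangular windows.

    Windows tile the image in row-major order. Patches on the bottom and
    right edges are clipped to the image bounds (an assumption of this sketch).
    """
    for top in range(0, height, patch_h):
        for left in range(0, width, patch_w):
            bottom = min(top + patch_h, height)
            right = min(left + patch_w, width)
            yield (top, left, bottom, right)


# Usage: tile a hypothetical 5x7 image with 2x3 patches.
patches = list(regular_patches(5, 7, 2, 3))
```

Because the windows do not overlap, every pixel belongs to exactly one patch, so the patch areas sum to the image area.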