Feature extraction and classifier design are the most important components of a pattern recognition system. When dealing with high-dimensional data, one faces the "curse of dimensionality," under which classifiers become unreliable. At the same time, high-dimensional data cannot be perceived or understood intuitively. For classification, one therefore needs to analyze the discriminative structure of the high-dimensional data and extract low-dimensional features suited to the classification task. Traditional research on feature extraction has mainly focused on linear dimensionality reduction. In recent years, nonlinear dimensionality reduction methods based on manifold learning have received a great deal of attention; however, many theoretical and technical challenges remain in this area.

In this thesis, we study methods of dimensionality reduction and classifier design, addressing several basic problems in nonlinear dimensionality reduction and sparse-representation-based classification, such as the selection of the neighborhood parameter, the estimation of the "intrinsic" dimensionality, the reduction of time cost, locally sparse representation, and discrimination analysis. The main contributions of this thesis are as follows.

First, to address the high time cost of locally linear embedding (LLE) in finding neighbors and computing eigenvectors, a new method called clustering-based locally linear embedding (CLLE) is proposed. By combining clustering with LLE, the proposed method not only reduces the time cost of LLE substantially, but also preserves the "intrinsic" structure of the high-dimensional data when it is embedded into a low-dimensional space. Moreover, when the embedded data are classified, CLLE achieves comparable or even better results than LLE.

Second, LLE-like methods that reduce dimensionality by neighborhood preservation lack an efficient way to select the neighborhood parameter.
To address this problem, a method called self-regulation of the neighborhood parameter for locally linear embedding (Self-regulated LLE) is introduced. It tackles the difficulty LLE encounters by finding the local patch that is closest to being linear. Experimental results on several data sets show that Self-regulated LLE performs better than LLE in most cases under different evaluation criteria, while requiring less time. Third, methods such as LLE, which preserve the neighborhood structure of high-dimensional data, may confront tw...
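For reference, the standard LLE procedure that the contributions above build on has three stages: neighbor search, computation of local reconstruction weights, and a bottom-eigenvector problem. The neighbor search and eigendecomposition are exactly the costs CLLE targets, and the neighborhood size `n_neighbors` is the parameter Self-regulated LLE selects automatically. A minimal NumPy sketch of plain LLE (for illustration only; it does not reproduce the thesis' CLLE or Self-regulated LLE variants) is:

```python
import numpy as np

def lle(X, n_neighbors=6, n_components=2, reg=1e-3):
    """Minimal locally linear embedding of the rows of X."""
    N = X.shape[0]
    # 1. Neighbor search: O(N^2) pairwise distances -- one cost CLLE reduces.
    D = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    nbrs = np.argsort(D, axis=1)[:, 1:n_neighbors + 1]  # skip self (column 0)
    # 2. Reconstruction weights: express each point as an affine
    #    combination of its neighbors (weights sum to one).
    W = np.zeros((N, N))
    for i in range(N):
        Z = X[nbrs[i]] - X[i]                          # neighbors centered on x_i
        C = Z @ Z.T                                    # local covariance
        C += reg * np.trace(C) * np.eye(n_neighbors)   # regularize for stability
        w = np.linalg.solve(C, np.ones(n_neighbors))
        W[i, nbrs[i]] = w / w.sum()
    # 3. Embedding: bottom eigenvectors of (I - W)^T (I - W) -- the other
    #    cost CLLE reduces.
    I = np.eye(N)
    M = (I - W).T @ (I - W)
    _, vecs = np.linalg.eigh(M)                        # eigenvalues ascending
    return vecs[:, 1:n_components + 1]                 # drop the constant eigenvector
```

Partitioning the data with a clustering step before running LLE shrinks both the pairwise-distance matrix and the eigenproblem from size N to the cluster sizes, which is one plausible source of the speedup the abstract reports for CLLE.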