In recent years, research on image-based object classification has made rapid progress. With the sharp increase in training data and computing resources, methods such as deep learning have achieved breakthroughs in large-scale object classification. These methods usually assume that the data is sufficient and that the distribution across classes is relatively balanced. In most cases, however, object classification faces an imbalanced data distribution and small amounts of data in a class. Under these circumstances, machine learning approaches that rely on large-scale, balanced data lose their power. This thesis focuses on the problems of imbalanced data and small data, and explores effective solutions from three perspectives: feature representation learning, sample reconstruction for the small-data class, and classifier design. The main work and contributions are summarized as follows:

1. Representation learning using cluster-based linear discriminant analysis

An object classification system usually consists of three steps: training sample collection, feature representation, and classifier design. Existing solutions to the small data problem focus mostly on training data collection and classifier design, while applying classical feature representation approaches. Feature design, however, is a key factor in classification. To learn an appropriate feature representation for the small data problem, this thesis proposes learning discriminative feature representations that adapt to the data distribution. The method applies cluster-based linear discriminant analysis, and its optimization model takes the sample distribution between classes into account, so that it can learn discriminative features for the small-data class even when that class contains only a few samples.
Extensive experiments demonstrate that the proposed approach can significantly improve the performance of object classification in the case of imbalanced data and small data.

2. Subspace-based sample reconstruction method

Sample reconstruction is a common way to address the data imbalance problem. Studies show that both over-sampling and under-sampling can effectively improve the performance of imbalanced data classification. However, over-sampling and under-sampling merely replicate some samples of the small-data class or remove some samples of the other classes. These kinds of methods have not made good use of prior knowledge t...
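
For illustration only, the two baseline strategies mentioned above, random over-sampling and random under-sampling, can be sketched as follows. This is a minimal sketch of the generic baselines, not the subspace-based reconstruction method proposed in the thesis. The function names (`group_by_label`, `random_over_sample`, `random_under_sample`), the toy data, and the choice to balance every class to the largest or smallest class size are assumptions made for the example.

```python
import random
from collections import defaultdict


def group_by_label(samples, labels):
    """Map each class label to the list of samples carrying it (hypothetical helper)."""
    groups = defaultdict(list)
    for sample, label in zip(samples, labels):
        groups[label].append(sample)
    return groups


def random_over_sample(samples, labels, seed=0):
    """Replicate randomly chosen samples until every class matches the largest class."""
    rng = random.Random(seed)
    groups = group_by_label(samples, labels)
    target = max(len(members) for members in groups.values())
    out_samples, out_labels = [], []
    for label, members in groups.items():
        # Duplicates are drawn with replacement from the class's own samples.
        extra = [rng.choice(members) for _ in range(target - len(members))]
        for sample in members + extra:
            out_samples.append(sample)
            out_labels.append(label)
    return out_samples, out_labels


def random_under_sample(samples, labels, seed=0):
    """Discard randomly chosen samples until every class matches the smallest class."""
    rng = random.Random(seed)
    groups = group_by_label(samples, labels)
    target = min(len(members) for members in groups.values())
    out_samples, out_labels = [], []
    for label, members in groups.items():
        # Keep a random subset of size `target`, drawn without replacement.
        for sample in rng.sample(members, target):
            out_samples.append(sample)
            out_labels.append(label)
    return out_samples, out_labels


# Toy data (assumed): four samples of a common class, one of a rare class.
features = [[0.1], [0.2], [0.3], [0.4], [0.9]]
labels = ["common", "common", "common", "common", "rare"]

over_x, over_y = random_over_sample(features, labels)
under_x, under_y = random_under_sample(features, labels)

print(over_y.count("common"), over_y.count("rare"))    # 4 4
print(under_y.count("common"), under_y.count("rare"))  # 1 1
```

In this toy run, every over-sampled "rare" sample is a copy of the single original `[0.9]`, and under-sampling discards three of the four "common" samples. Neither strategy adds new information about the small-data class, which is the limitation noted above.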