Statistical learning theory offers theoretically guaranteed performance and is widely used on low-dimensional data sets. As application demands grow, structured data and complex data sets with new characteristics keep emerging, and applying statistical learning theory in these settings becomes a serious challenge.
Multiple instance learning (MIL) is a recently emerged area of machine learning. It has attracted increasing research interest in recent years because of its wide applications in image classification, text categorization, computer security, and other fields. Unlike in supervised learning, in MIL only the labels of bags are known; the labels of the instances in positive bags are not available. Many algorithms assume that the instances in a bag are i.i.d. samples, but this may not hold in practical applications. In this paper, we treat the negative instances in a positive bag as pairwise partners of the positive instances; using this correlation information, we build an efficient feature mapping to re-describe the bag. Experimental results show that this description is efficient in real-world applications.
The standard support vector machine (SVM) is celebrated for its theoretically guaranteed generalization performance. However, it lacks sparsity and thus cannot be used for feature extraction. The zero-norm SVM is ideal in terms of sparsity, but its optimization is prohibitive because of the combinatorial nature of the zero norm. In this paper, 1-norm and infinity-norm constraints are employed simultaneously to relax the zero norm while keeping its sparsity. The resulting constraint regions possess many more sparse vertices than those of the 1-norm alone. Generally, the more sparse vertices a constraint region has, the sparser the solution will be. Therefore, a more parsimonious model can be obtained by combining the 1-norm and the infinity norm.
Interestingly, although the infinity norm alone does not lead to sparse results, it helps to enhance the sparsity of 1-norm regularization. The optimal solution has a favorable piecewise linearity property, based on which the whole solution path can be obtained; this greatly facilitates model selection. A rigorous proof of the piecewise linearity is given in this paper. Experimental results demonstrate that our approach offers comparable prediction accuracy with significantly higher sparsity.
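The claim about sparse vertices can be illustrated with a small sketch. It assumes, since the abstract does not give the exact formulation, that the relaxed constraint region is the intersection of a 1-norm ball of radius `t` and an infinity-norm ball of radius `s`; the function name `region_vertices` and the radii used are hypothetical choices for illustration only.

```python
from fractions import Fraction
from itertools import permutations, product


def region_vertices(d, t, s):
    """Enumerate vertices of {w in R^d : ||w||_1 <= t, ||w||_inf <= s}.

    Assumption (not from the source): the region is the intersection of a
    1-norm ball of radius t > 0 and an infinity-norm ball of radius s > 0.
    Exact rationals avoid floating-point trouble in the floor division.
    """
    t, s = Fraction(t), Fraction(s)
    k = min(d, int(t // s))  # number of coordinates pinned at magnitude s
    if k < d:
        r = t - k * s  # leftover 1-norm budget placed on one more coordinate
        magnitudes = [s] * k + [r] + [Fraction(0)] * (d - k - 1)
    else:
        magnitudes = [s] * d  # 1-norm constraint is inactive: hypercube corners
    vertices = set()
    for perm in set(permutations(magnitudes)):
        nonzero = [i for i, m in enumerate(perm) if m != 0]
        # Attach every sign pattern to the nonzero coordinates.
        for signs in product((1, -1), repeat=len(nonzero)):
            w = list(perm)
            for i, sign in zip(nonzero, signs):
                w[i] = sign * perm[i]
            vertices.add(tuple(w))
    return vertices


# s = t makes the infinity-norm constraint redundant, leaving the 1-norm ball.
l1_only = region_vertices(5, 1, 1)
combined = region_vertices(5, 1, Fraction(3, 5))
print(len(l1_only), len(combined))  # prints: 10 80
```

In five dimensions the 1-norm ball alone has 10 vertices, each with one nonzero coordinate, while the assumed combined region has 80 vertices, each with only two nonzero coordinates out of five. This is a geometric illustration under the stated assumption, not the optimization procedure of the paper.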