English abstract | Infrared spectroscopy is a non-destructive, non-polluting and fast detection method that is widely used in offline and online industrial production. However, many practical spectral datasets have more features than samples, so traditional machine learning algorithms run into the small-sample-size problem. In addition, an online detection system is built from existing examples, but in some special applications these are hard to obtain. In liquor analysis, for example, the cycle from growing the grain through fermentation to sample collection is long even when infrared spectroscopy is used, so collecting enough samples is difficult. We therefore need to establish a model with the best possible prediction and generalization ability when only a few samples are labeled and the rest are unlabeled. In this context, this thesis studies machine learning algorithms for infrared spectra. Building on the partial least squares algorithm, we propose several algorithms that have practical value for dimensionality reduction and data analysis. This work makes three main contributions. First, from the basic principles of linear discriminant analysis (LDA) and partial least squares (PLS), we find that the projection direction of LDA is indeed optimal under the assumption of a Gaussian distribution. Once the center of each class of samples is determined, the projection direction of LDA is also determined (this follows from the basic principle of LDA). In that case, if we add samples that do not affect the class centers but do affect the shape and distribution of the dataset, the projection direction of LDA is no longer optimal. We therefore combine the PLS method with the LDA algorithm and propose two improved methods, named LDA-PLS and ex-LDA-PLS. 
LDA-PLS corrects the projection direction of LDA using information from PLS, while ex-LDA-PLS extends LDA-PLS by combining the results of LDA-PLS and LDA, using an adjusting parameter to bring the result closer to the optimal direction. The proposed methods are compared with traditional dimensionality reduction methods such as principal component analysis (PCA), LDA and PLS-LDA. Experimental results show that the proposed methods achieve better classification performance. For the problem of multiple correlations between features, we applie... |
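The idea of adjusting an LDA projection direction with PLS information through a tuning parameter can be illustrated with a small two-class sketch. This is not the LDA-PLS or ex-LDA-PLS algorithm of the thesis, whose details the abstract does not give. The function names, the convex blending rule, the parameter `alpha` and the `ridge` term are all assumptions made for illustration only.

```python
import numpy as np


def lda_direction(X, y, ridge=1e-6):
    """Two-class Fisher LDA direction, proportional to Sw^{-1} (m1 - m0).

    The small ridge term (an assumption, not from the source) keeps the
    within-class scatter matrix invertible when features outnumber samples.
    """
    X0, X1 = X[y == 0], X[y == 1]
    m0, m1 = X0.mean(axis=0), X1.mean(axis=0)
    Sw = (X0 - m0).T @ (X0 - m0) + (X1 - m1).T @ (X1 - m1)
    Sw = Sw + ridge * np.eye(X.shape[1])
    w = np.linalg.solve(Sw, m1 - m0)
    return w / np.linalg.norm(w)


def pls_direction(X, y):
    """First PLS1 weight vector: normalized covariance between X and y."""
    Xc = X - X.mean(axis=0)
    yc = y - y.mean()
    w = Xc.T @ yc
    return w / np.linalg.norm(w)


def blended_direction(X, y, alpha=0.5):
    """Hypothetical blend: alpha=0 gives pure LDA, alpha=1 gives pure PLS."""
    w = (1.0 - alpha) * lda_direction(X, y) + alpha * pls_direction(X, y)
    return w / np.linalg.norm(w)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Synthetic stand-in for spectra: 2 classes, 20 samples each, 5 features.
    X = np.vstack([rng.normal(0.0, 1.0, (20, 5)),
                   rng.normal(1.0, 1.0, (20, 5))])
    y = np.array([0] * 20 + [1] * 20)
    w = blended_direction(X, y, alpha=0.3)
    scores = X @ w  # one-dimensional projection used for classification
    print(w.shape, scores.shape)
```

With labels coded as 0 and 1, the first PLS weight vector is parallel to the difference of the class means, so the blend pulls the LDA direction toward the mean-difference direction as `alpha` grows. In practice `alpha` would be chosen by cross-validation.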