Because it allows fast, holistic, and nondestructive identification of complex mixtures, Fourier transform infrared (FTIR) spectroscopy is widely used in many fields. However, owing to instrument effects and changing environmental conditions, measured FTIR spectra often contain undesirable baseline artifacts, including baseline offset and baseline slope. These baselines hamper the interpretation of spectra and cause intensity deviations that break strict adherence to Beer's law. It is therefore necessary to remove the drifting baseline and to develop robust calibration models for spectroscopic analysis. Accordingly, this thesis focuses on improving the quality of the spectra and the robustness of the quantitative analysis model. The thesis covers the following aspects:

Baseline correction algorithms are explored. Based on asymmetric least squares smoothing, a new algorithm for multiple spectra baseline correction (MSBC) is proposed. Exploiting the similarity among the multiple spectra, the algorithm estimates the baselines by penalizing the differences between the baseline-corrected signals, which also allows it to eliminate scatter effects in the spectra. In addition, a relaxation factor that measures the similarity of the baseline-corrected spectra is incorporated into the optimization model, and an alternating iteration strategy is used to solve the optimization problem. Experimental results on both simulated and real data demonstrate the effectiveness and efficiency of the proposed algorithm.

Based on random sample consensus (RANSAC), a novel outlier detection method for partial least squares (PLS) is proposed. It models the inlier error as an unbiased (zero-mean) Gaussian distribution and the outlier error as a uniform distribution.
A point is diagnosed as an outlier when its inlier probability is smaller than its outlier probability, or when its PLS residual is larger than a given threshold based on the estimated standard deviation. The proposed algorithm repeatedly generates partial least squares solutions estimated from random samples and then tests each solution for support from the complete dataset. A comparative study of the proposed method and leave-one-out cross-validation on simulated data and on real data from pharmaceutical tablets is presented. The proposed method is more sensitive to outliers and proves to be highly efficient. ...
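
The random-sample-consensus procedure described above can be illustrated with a minimal Python sketch. It is not the thesis implementation. It assumes a one-component PLS1 model and a user-supplied noise estimate `sigma`, it flags points only by the residual cutoff `k * sigma`, and it omits the Gaussian/uniform probability test. All function and parameter names (`pls1_fit`, `pls1_predict`, `ransac_pls_outliers`, `n_trials`, `sample_size`, `k`, `sigma`) are hypothetical.

```python
import random


def pls1_fit(X, y):
    """Fit a one-component PLS1 model on mean-centered data.

    Returns (x_mean, y_mean, coef). Assumption: one latent component only.
    """
    n, p = len(X), len(X[0])
    x_mean = [sum(row[j] for row in X) / n for j in range(p)]
    y_mean = sum(y) / n
    Xc = [[row[j] - x_mean[j] for j in range(p)] for row in X]
    yc = [v - y_mean for v in y]
    # Weight vector w is proportional to X^T y.
    w = [sum(Xc[i][j] * yc[i] for i in range(n)) for j in range(p)]
    norm = sum(v * v for v in w) ** 0.5
    if norm == 0.0:
        return x_mean, y_mean, [0.0] * p
    w = [v / norm for v in w]
    # Scores t = X w, then regress y on t.
    t = [sum(Xc[i][j] * w[j] for j in range(p)) for i in range(n)]
    q = sum(t[i] * yc[i] for i in range(n)) / sum(v * v for v in t)
    return x_mean, y_mean, [q * wj for wj in w]


def pls1_predict(model, x):
    """Predict the response for one sample x."""
    x_mean, y_mean, coef = model
    return y_mean + sum(c * (xj - m) for c, xj, m in zip(coef, x, x_mean))


def ransac_pls_outliers(X, y, sigma, k=3.0, n_trials=200, sample_size=5, seed=0):
    """Return sorted indices whose residual exceeds k * sigma.

    sigma is an assumed, externally supplied noise standard deviation.
    """
    rng = random.Random(seed)
    n = len(X)
    threshold = k * sigma
    best_inliers = []
    for _ in range(n_trials):
        # Fit a candidate model on a small random subset.
        idx = rng.sample(range(n), sample_size)
        model = pls1_fit([X[i] for i in idx], [y[i] for i in idx])
        # Test the candidate against the complete dataset.
        inliers = [
            i for i in range(n)
            if abs(y[i] - pls1_predict(model, X[i])) <= threshold
        ]
        if len(inliers) > len(best_inliers):
            best_inliers = inliers
    if not best_inliers:
        # No candidate found any support; fall back to using all samples.
        best_inliers = list(range(n))
    # Refit on the largest consensus set and flag the remaining points.
    model = pls1_fit([X[i] for i in best_inliers], [y[i] for i in best_inliers])
    return sorted(
        i for i in range(n)
        if abs(y[i] - pls1_predict(model, X[i])) > threshold
    )
```

Calling `ransac_pls_outliers(X, y, sigma=0.5)` on a list of feature rows `X` and responses `y` returns the indices of the flagged samples. Fixing the seed keeps the random subsets reproducible. In practice `sigma` would have to be estimated from the data, which this sketch leaves to the caller.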