Research on Preprocessing and Quantitative Algorithms for Infrared Spectroscopy
吴义凡
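Subtype: Doctoral
Thesis Advisor: 彭思龙
2019-06-03
Degree Grantor: Institute of Automation, Chinese Academy of Sciences
Place of Conferral: Institute of Automation, Chinese Academy of Sciences
Degree Discipline: Pattern Recognition and Intelligent Systems
Keyword: infrared spectroscopy; quantitative analysis; scatter correction; variable selection; nonlinear least squares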
Abstract

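Infrared spectroscopy, covering both the mid-infrared and near-infrared ranges, is a fast, non-destructive, and non-polluting detection technique that is well suited to real-time online applications. Because infrared spectra reflect the chemical properties of a mixture, quantitative analysis based on them is widely used in the food, pharmaceutical, and petrochemical industries. Infrared quantitative analysis is now moving from simple mixtures to complex mixtures, from single-component analysis to simultaneous multi-component analysis, and from laboratory settings to outdoor environments. As a result, quantitative analysis is becoming harder and the problems to be handled more complex. Against this background, this thesis addresses the nonlinear phenomena and signal interference of complex mixture systems and studies spectral signal improvement algorithms and quantitative modeling methods for these scenarios, which is of scientific value and practical significance for the application and wider adoption of infrared spectroscopy.
(1) The quantitative modeling of nonlinear complex mixture systems is studied, and a nonlinear least squares algorithm based on local polynomial interpolation is proposed. For a mixture whose chemical composition is fully known, the infrared spectrum of each component can be measured and treated as known information. The proposed algorithm assumes that the infrared spectrum of each pure component is a nonlinear function of concentration and uses least squares as the objective function to solve for the unknown concentrations of the mixture spectrum. Because the component spectra are known, local polynomial interpolation is used to compute the gradient information and other quantities needed by the solver. Experiments on hydrocarbon gas mixtures show that the proposed algorithm predicts markedly more accurately than traditional linear models.
(2) The removal of noise and interference introduced while infrared spectra are acquired is studied, and a weighted scatter correction algorithm based on variable selection is proposed. Variable selection automatically computes the optimal weights for the weighted scatter correction, which keeps variables carrying strong chemical information from affecting the estimation of the scattering parameters. Experiments on two public datasets show that the proposed algorithm gives better quantitative predictions than classical scatter correction and that the selected variables are readily interpretable.
(3) Multi-model fusion strategies are studied, and a method based on the slice transformation (the SLT method) is proposed for computing fusion weights in stacked moving window partial least squares (SMWPLS). SMWPLS is a multi-model ensemble method that works well for variable selection in quantitative analysis. Its key problem is how to compute the fusion weights, for which the common approaches are the cross-validation (CV) method and the non-negative least squares (NNLS) method. The proposed SLT method uses the slice transformation to realize a piecewise linear mapping, takes the fusion weights obtained by the CV method as input, and estimates the parameters by least squares, which in effect combines the advantages of the CV and NNLS methods. Experiments on two public datasets show that, across different ensemble sizes, the SLT method predicts as well as or better than the CV and NNLS methods.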

Other Abstract

Infrared techniques, including mid-infrared and near-infrared spectroscopy, are fast, non-destructive, and non-polluting detection methods, which makes them well suited to real-time online applications. Because infrared spectra reflect the chemical properties of a mixture, quantitative analysis based on infrared spectroscopy is widely used in the food, pharmaceutical, petrochemical, and other industries. The focus of infrared techniques is now shifting from simple mixtures to complex mixtures, from single-component analysis to simultaneous multi-component analysis, and from laboratory environments to outdoor environments. As a result, quantitative analysis is becoming more difficult and the associated problems more complicated. This thesis therefore focuses on the nonlinear phenomena and signal interference of complex mixtures and studies signal improvement algorithms and quantitative modeling methods for them, which is of scientific value and practical significance for the application and wider adoption of infrared spectroscopy.
The quantitative modeling of complex mixture systems with nonlinear components is studied, and a nonlinear least squares algorithm based on local polynomial interpolation is proposed. For a mixture whose chemical composition is fully known, the infrared spectrum of each component can be measured and used as known information. The proposed algorithm assumes that the infrared spectrum of each pure component is a nonlinear function of its concentration and uses least squares as the objective function to solve for the concentrations in the mixture spectrum. Since the spectra of the pure components are known, local polynomial interpolation is used to calculate the gradient information needed by the solver. Experiments on hydrocarbon gas mixtures showed that the prediction accuracy of the proposed algorithm was significantly better than that of the traditional linear model.
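The idea in this paragraph can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions, not the thesis implementation: the function names `local_poly` and `solve_concentrations`, the Gauss-Newton solver, the quadratic interpolation order, and the synthetic two-component data are all hypothetical choices made here. Each pure component is assumed to have been measured on a grid of known concentrations, and the mixture spectrum is assumed to be the sum of the pure-component spectra.

```python
import numpy as np

def local_poly(c, grid, spectra, order=2):
    """Value and concentration-derivative of a pure-component spectrum at c.

    grid    : (m,) concentrations at which the pure component was measured
    spectra : (m, p) spectra measured at those concentrations
    A polynomial of the given order is passed through the order+1 grid
    points nearest to c, independently for every wavelength.
    """
    idx = np.argsort(np.abs(grid - c))[: order + 1]
    # Powers of (grid - c): coefficient 0 is the value at c, coefficient 1 the slope.
    V = np.vander(grid[idx] - c, order + 1, increasing=True)
    coef = np.linalg.solve(V, spectra[idx])
    return coef[0], coef[1]

def solve_concentrations(x, grids, libraries, c0, n_iter=50, tol=1e-10):
    """Gauss-Newton minimisation of ||x - sum_j s_j(c_j)||^2 (assumed solver)."""
    c = np.asarray(c0, dtype=float).copy()
    lo = np.array([g.min() for g in grids])
    hi = np.array([g.max() for g in grids])
    for _ in range(n_iter):
        vals, grads = zip(*(local_poly(cj, g, S)
                            for cj, g, S in zip(c, grids, libraries)))
        residual = x - np.sum(vals, axis=0)
        J = np.stack(grads, axis=1)              # (p, n_components)
        step, *_ = np.linalg.lstsq(J, residual, rcond=None)
        c = np.clip(c + step, lo, hi)            # stay inside the measured range
        if np.linalg.norm(step) < tol:
            break
    return c

# Synthetic data (hypothetical): each band changes shape with concentration.
wav = np.linspace(0.0, 1.0, 60)
band = lambda mu: np.exp(-((wav - mu) / 0.08) ** 2)

def pure(c, mu_lin, mu_quad):
    return c * band(mu_lin) - 0.3 * c**2 * band(mu_quad)

grid = np.linspace(0.0, 1.0, 6)
lib1 = np.array([pure(g, 0.30, 0.35) for g in grid])
lib2 = np.array([pure(g, 0.60, 0.55) for g in grid])

c_true = np.array([0.42, 0.73])
x = pure(c_true[0], 0.30, 0.35) + pure(c_true[1], 0.60, 0.55)
c_hat = solve_concentrations(x, [grid, grid], [lib1, lib2], c0=[0.5, 0.5])
```

Centering the polynomial at the current concentration means the interpolated value and its derivative can be read directly from the first two coefficients, which is what the solver needs at each iteration.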
The removal of noise and interference introduced during the acquisition of infrared spectra is studied, and a weighted scatter correction algorithm based on variable selection is proposed. Setting the weights of variables with strong chemical information to zero keeps them from influencing the estimation of the scattering parameters. With variable selection, the optimal weights for the weighted scatter correction are obtained automatically. Experiments on two public datasets showed that the quantitative prediction performance of the proposed algorithm was better than that of classical scatter correction methods and that the selected variables had good interpretability.
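The weighting step described above can be illustrated with a weighted variant of multiplicative scatter correction. The sketch below is hedged: it assumes the classical additive-plus-multiplicative scatter model, takes the weights as given rather than deriving them by variable selection as the thesis does, and uses hypothetical names (`weighted_msc`) and synthetic data.

```python
import numpy as np

def weighted_msc(X, weights, reference=None):
    """Weighted scatter correction (sketch).

    Each spectrum x_i is modelled as a_i + b_i * reference, with (a_i, b_i)
    estimated by weighted least squares. Variables with weight 0 do not
    influence the estimate. The corrected spectrum is (x_i - a_i) / b_i.
    """
    X = np.asarray(X, dtype=float)
    ref = X.mean(axis=0) if reference is None else np.asarray(reference, float)
    sw = np.sqrt(np.asarray(weights, dtype=float))
    A = np.column_stack([np.ones_like(ref), ref]) * sw[:, None]   # (p, 2)
    coef, *_ = np.linalg.lstsq(A, (X * sw).T, rcond=None)         # (2, n)
    a, b = coef
    return (X - a[:, None]) / b[:, None], a, b

# Synthetic spectra (hypothetical): a shared background plus one analyte band,
# distorted by a random offset and a random multiplicative factor per sample.
rng = np.random.default_rng(0)
wav = np.linspace(0.0, 1.0, 80)
base = 1.0 + 0.5 * np.sin(3.0 * wav)
peak = (np.abs(wav - 0.5) < 0.1).astype(float)
conc = rng.uniform(0.0, 1.0, 10)
a_true = rng.uniform(-0.2, 0.2, 10)
b_true = rng.uniform(0.8, 1.2, 10)
X = a_true[:, None] + b_true[:, None] * (base + np.outer(conc, peak))

weights = 1.0 - peak                      # zero weight on the analyte band
X_w, _, _ = weighted_msc(X, weights)
X_u, _, _ = weighted_msc(X, np.ones_like(wav))   # classical, unweighted

off = peak == 0
spread_w = X_w[:, off].std(axis=0).max()  # residual scatter outside the band
spread_u = X_u[:, off].std(axis=0).max()
```

In this toy setting the off-band variation between samples comes only from scatter, so a smaller `spread_w` than `spread_u` indicates that excluding the analyte band from the fit gives cleaner scatter estimates.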
The fusion strategy for multi-model analysis is studied, and a novel method based on the slice transformation (SLT) is proposed for calculating combination weights. The SLT method is applied to stacked moving window partial least squares (SMWPLS), an ensemble learning method that performs well on the variable selection problem in quantitative analysis. The key issue in SMWPLS is how to calculate the combination weights; common choices are the cross-validation (CV) method and the non-negative least squares (NNLS) method. The proposed SLT method uses the slice transformation to perform a piecewise linear mapping in which the combination weights obtained by the CV method are taken as input and least squares is used for parameter estimation. In this sense, the SLT method combines the advantages of the CV and NNLS methods. Experiments on two public datasets showed that the quantitative prediction results of the SLT method were as good as or better than those of the CV and NNLS methods across different ensemble sizes.
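One possible reading of the SLT weighting step is sketched below. It is an assumption-laden illustration rather than the thesis algorithm: the CV weights are taken as normalised inverse squared cross-validation errors, the knots are evenly spaced, and the names `slice_matrix`, `cv_weights`, and `slt_weights` are hypothetical. The slice transform writes each CV weight as an interpolation between two neighbouring knots, and least squares then estimates the mapped knot values so that the combined prediction fits the reference values.

```python
import numpy as np

def slice_matrix(w, knots):
    """Slice transform of w: each row holds the two interpolation weights of
    w[i] between its neighbouring knots, so that S @ knots reproduces w."""
    w = np.clip(w, knots[0], knots[-1])
    j = np.clip(np.searchsorted(knots, w, side="right") - 1, 0, len(knots) - 2)
    t = (w - knots[j]) / (knots[j + 1] - knots[j])
    S = np.zeros((len(w), len(knots)))
    rows = np.arange(len(w))
    S[rows, j] = 1.0 - t
    S[rows, j + 1] = t
    return S

def cv_weights(rmsecv):
    """CV-style weights: inverse squared CV error, normalised (assumed form)."""
    w = 1.0 / np.asarray(rmsecv, dtype=float) ** 2
    return w / w.sum()

def slt_weights(Y_cv, y, w_cv, n_knots=4):
    """Map the CV weights through a piecewise linear function whose knot
    values p are fitted by least squares on the sub-model CV predictions."""
    knots = np.linspace(0.0, w_cv.max(), n_knots)
    S = slice_matrix(w_cv, knots)                       # (K, n_knots)
    p, *_ = np.linalg.lstsq(Y_cv @ S, y, rcond=None)    # combined fit to y
    return S @ p                                        # mapped weights, (K,)

# Synthetic sub-model CV predictions (hypothetical) with different noise levels.
rng = np.random.default_rng(1)
y = rng.normal(size=40)
noise = np.array([0.1, 0.2, 0.4, 0.8, 1.0])
Y_cv = y[:, None] + rng.normal(size=(40, 5)) * noise

rmsecv = np.sqrt(np.mean((Y_cv - y[:, None]) ** 2, axis=0))
w_cv = cv_weights(rmsecv)
w_slt = slt_weights(Y_cv, y, w_cv)

rmse = lambda w: np.sqrt(np.mean((Y_cv @ w - y) ** 2))
knots_demo = np.linspace(0.0, w_cv.max(), 4)
identity_check = slice_matrix(w_cv, knots_demo) @ knots_demo
```

Because the identity mapping (knot values equal to the knots themselves) reproduces the CV weights, the least squares fit cannot do worse than the CV weights on the data it is fitted to; whether it generalises better has to be checked on held-out samples.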

Pages: 86
Language: Chinese
Document Type: Thesis
Identifier: http://ir.ia.ac.cn/handle/173211/23814
Collection: Research Center for Intelligent Manufacturing Technology and Systems_Multidimensional Data Analysis
Recommended Citation
GB/T 7714
吴义凡. Research on Preprocessing and Quantitative Algorithms for Infrared Spectroscopy [D]. Institute of Automation, Chinese Academy of Sciences, 2019.
Files in This Item:
File Name/Size: 吴义凡博士论文.pdf (6244KB)
DocType: Thesis
Access: Open Access
License: CC BY-NC-SA
