Research on Infrared Spectroscopy Quantitative Analysis Algorithms (红外光谱定量分析算法研究)

	红外光谱定量分析算法研究
Alternative title	Research on Infrared Spectroscopy Quantitative Analysis Algorithms
	彭江涛
	2011-05-31
Degree type	Doctor of Engineering
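Chinese abstract (translated)	Fourier transform infrared (FTIR) spectroscopy offers fast, holistic, and nondestructive identification of complex mixture systems, and has therefore been widely applied in petrochemicals, the food industry, the pharmaceutical industry, biology, and related fields. However, acquired infrared spectra are often affected by external conditions such as instrument drift and baseline interference. These disturbances cause complex changes in the spectra and can even produce anomalies. Two questions are key to improving the sensitivity of infrared quantitative analysis: how to remove such unwanted interference during data preprocessing, and how to build accurate, stable multivariate calibration models that reflect the true relationship between the constituents of the analyte and the observed spectra. Against this background, the research in this thesis on algorithms for improving spectral signal quality and on quantitative analysis models has scientific value and practical significance for the application and wider adoption of FTIR spectroscopy. The main contents of the thesis are:
1. The baseline interference problem is studied and a new multi-spectrum joint baseline correction algorithm is proposed. For the first time, several spectra are corrected jointly; by exploiting the similarity among the spectra, the drifting baselines attached to them are removed effectively. Because joint regularization is used, the algorithm can also correct scatter effects between spectra. The relaxation factor in the algorithm reflects the differences among the spectra. Experimental results show that quantitative models built on spectra corrected this way have stronger predictive ability.
2. The random sample consensus (RANSAC) method is studied and an outlier detection algorithm combining partial least squares (PLS) regression with random sample consensus is proposed. The method uses different distributions for inliers and outliers: inlier error is modeled as an unbiased Gaussian distribution and outlier error as a uniform distribution. A sample is judged to be an outlier when its probability of being an inlier is smaller than its probability of being an outlier, or when its regression residual exceeds a given threshold. A series of PLS solutions is generated by random sampling, and these solutions are then evaluated by their PLS regression residuals to obtain the consensus sample set. When the data set contains enough samples, the method can always, through random sampling, reject the samples that strongly influence the model and thereby correct the regression model. Compared with leave-one-out cross validation, the method is more sensitive to outliers and identifies them more reliably.
3. The influence of baselines, polynomial terms, and similar interference on PLS regression models is studied and an algorithm combining baseline correction with partial least squares (BCCPLS) is proposed. PLS is in essence still a linear method; a nonlinear baseline affects the computed PLS component vectors, making the quantitative analysis model more complex and less robust. BCCPLS embeds baseline correction into the weight computation of PLS, so that the resulting weight vectors resist low-order polynomial interference. This is a first attempt to integrate baseline correction and PLS into a single procedure, removing the baseline effect directly inside the quantitative analysis model and evaluating the algorithm's performance there. Both simulated and real-data experiments show that BCCPLS effectively removes interfering baselines superimposed on the spectra and improves the predictive ability of the PLS model.
4. The spectral regression subspace learning method is studied and a model transfer algorithm based on spectral regression is proposed. The goal of model transfer is to eliminate differences between spectra acquired under different conditions or on different instruments. Changes in the instrument itself or in the acquisition conditions cause complex variations in the acquired spectra. In addition, spectral data are very high-dimensional and usually contain collinearity and noise. Spectral regression maps the data onto a low-dimensional manifold that preserves the optimal local structure of the data, which effectively suppresses these interfering factors and reveals the intrinsic structure of the data distribution. Experimental results show that, compared with piecewise direct standardization, spectral regression improves the performance of multivariate calibration model transfer.
5. Systematically introduces Fourier...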
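As a rough illustration of the baseline-correction idea behind the first contribution, the sketch below implements ordinary single-spectrum asymmetric least squares smoothing, which the English abstract names as the basis of the multi-spectrum method. It is not the thesis's multi-spectrum algorithm: there is no joint regularization across spectra and no relaxation factor. The function name `als_baseline`, the parameters `lam`, `p`, and `n_iter`, and the synthetic spectrum are all assumptions made for this example.

```python
import numpy as np


def als_baseline(y, lam=1e4, p=0.01, n_iter=10):
    """Estimate a smooth baseline under y by asymmetric least squares.

    lam   : smoothness penalty on second differences (assumed value)
    p     : weight given to points lying above the current baseline
    n_iter: number of reweighting passes
    """
    y = np.asarray(y, dtype=float)
    n = y.size
    # Second-order difference operator, shape (n - 2, n).
    D = np.diff(np.eye(n), n=2, axis=0)
    penalty = lam * (D.T @ D)
    w = np.ones(n)
    for _ in range(n_iter):
        # Weighted, penalized least squares: (W + lam * D'D) z = W y.
        z = np.linalg.solve(np.diag(w) + penalty, w * y)
        # Points above the baseline (likely peaks) get the small weight p,
        # points below it get 1 - p, so the fit hugs the lower envelope.
        w = np.where(y > z, p, 1.0 - p)
    return z


# Synthetic example (illustrative data only): linear drift plus one band.
x = np.linspace(0.0, 1.0, 200)
true_baseline = 0.5 + 1.0 * x
peak = np.exp(-((x - 0.5) / 0.02) ** 2)
spectrum = true_baseline + peak

estimated = als_baseline(spectrum)
corrected = spectrum - estimated
```

The dense solve keeps the sketch short; for real spectra with thousands of points a sparse banded solver would be the more usual choice.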
English abstract	Owing to its advantages of fast, integral, and nondestructive identification of complex mixtures, Fourier transform infrared (FTIR) spectroscopy has been widely used in various fields. However, because of spectrometer effects and changing environmental conditions, measured FTIR spectra often contain undesirable baseline artifacts, including baseline offset and baseline slope. These baselines hamper the interpretation of spectra and cause intensity deviations from Beer's law. It is therefore necessary to remove the drifted baseline and to develop robust calibration models in spectroscopic analysis. In this sense, the research in this thesis focuses on improving the quality of the spectra and contributes to improving the robustness of the quantitative analysis model. The content of this thesis includes the following aspects: Baseline correction algorithms are explored in the thesis. Based on asymmetric least squares smoothing, a new algorithm for multiple spectra baseline correction (MSBC) is proposed. Exploiting the similarity among the multiple spectra, the algorithm estimates the baselines by penalizing the differences between the baseline-corrected signals, which enables the algorithm to eliminate scatter effects on the spectra. In addition, a relaxation factor that measures the similarity of the baseline-corrected spectra is incorporated into the optimization model, and an alternating iteration strategy is used to solve the optimization problem. Experimental results on both simulated and real data demonstrate the effectiveness and efficiency of the proposed algorithm. Based on random sample consensus, a novel outlier detection method for partial least squares (PLS) is proposed. It models inlier error as an unbiased Gaussian distribution and outlier error as a uniform distribution. A point is diagnosed as an outlier when its inlier probability is smaller than its outlier probability, or when its PLS residual is larger than a given threshold based on the estimated standard deviation. The proposed algorithm repeatedly generates partial least squares solutions estimated from random samples and then tests each solution for support from the complete dataset for consistency. A comparative study of the proposed method and leave-one-out cross validation on simulated data and real data of pharmaceutical tablets is presented. The proposed method is more sensitive to outliers and proved to be highly efficient. ...
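The random-sample-consensus loop described in the abstract can be sketched as follows. This is a simplified illustration under stated assumptions, not the thesis's method: an ordinary one-variable least-squares line stands in for the PLS model, and a fixed residual threshold replaces the Gaussian/uniform probability test and the threshold derived from the estimated standard deviation. The names `fit_line` and `ransac_outliers`, all parameter values, and the synthetic data are hypothetical.

```python
import random


def fit_line(points):
    """Least-squares line y = a*x + b (a stand-in for a PLS model)."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    a = sxy / sxx
    return a, mean_y - a * mean_x


def ransac_outliers(points, n_trials=200, subset_size=5, threshold=0.5, seed=0):
    """Return a model refitted on the largest consensus set, plus outlier indices."""
    rng = random.Random(seed)
    best_inliers = []
    for _ in range(n_trials):
        # Fit a candidate model on a small random subset.
        a, b = fit_line(rng.sample(points, subset_size))
        # Test the candidate against the complete data set.
        inliers = [i for i, (x, y) in enumerate(points)
                   if abs(y - (a * x + b)) <= threshold]
        if len(inliers) > len(best_inliers):
            best_inliers = inliers
    consensus = set(best_inliers)
    # Refit on the consensus set; everything outside it is flagged.
    model = fit_line([points[i] for i in best_inliers])
    outliers = [i for i in range(len(points)) if i not in consensus]
    return model, outliers


# Synthetic data (illustrative only): y = 2x + 1 with small deterministic
# perturbations, and two grossly corrupted samples at indices 4 and 15.
points = [(float(i), 2.0 * i + 1.0 + 0.05 * ((i * 7) % 5 - 2)) for i in range(20)]
points[4] = (4.0, points[4][1] + 10.0)
points[15] = (15.0, points[15][1] - 8.0)

(slope, intercept), outliers = ransac_outliers(points)
print("flagged outlier indices:", outliers)
```

Unlike leave-one-out cross validation, which refits on all remaining samples and so still lets the other outliers distort each fit, the candidate models here are fitted on small subsets that are often entirely free of outliers.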
Keywords	Fourier Transform Infrared Spectroscopy; Baseline Correction; Multivariate Calibration; Partial Least Squares; Model Transfer
Language	Chinese
Document type	Thesis
Item identifier	http://ir.ia.ac.cn/handle/173211/6377
Collection	Graduates_Doctoral Theses
Recommended citation (GB/T 7714)	彭江涛. 红外光谱定量分析算法研究[D]. 中国科学院自动化研究所. 中国科学院研究生院,2011.

Files in this item
File name/size	Document type	Version type	Access type	License
CASIA_20081801462805 (5789KB)			Not yet open	CC BY-NC-SA