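A Variable Selection Method for NIR Spectra Based on Least Angle Regression and GA-PLS
Published: 2018-05-16 15:02
Topics: near-infrared spectroscopy + least angle regression; Source: Spectroscopy and Spectral Analysis (光谱学与光谱分析), 2017, No. 06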
[Abstract]: Near-infrared (NIR) spectra generally contain a large number of wavelength variables, and selecting variables from them, directly or indirectly, is key to improving the stability and predictive performance of a model. Least angle regression (LAR) is a relatively new and effective machine learning algorithm that is commonly used for regression analysis and variable selection. For spectral modeling applications, a variable selection method combining LAR with genetic algorithm partial least squares (GA-PLS) is proposed, which can effectively screen out a small number of characteristic wavelength points. First, LAR is applied over the full spectral region to eliminate collinearity among variables and obtain a preliminary set of wavelength points. GA-PLS is then used to further refine the points selected by LAR, yielding the characteristic wavelength points used for final modeling. To verify the effectiveness of the method, NIR spectral regression analyses of tablets and gasoline were used as application cases. After the raw spectra were preprocessed, the method was used to select variables, and models were then built for the active ingredient content and the C10 content, respectively. The results show that in both applications only seven characteristic wavelength points were needed after final optimization, while the coefficients of determination for prediction (R2p) reached 0.9339 and 0.9519, respectively. Compared with full-spectrum modeling, uninformative variable elimination (UVE), and the successive projections algorithm (SPA), the method uses fewer characteristic wavelength points and achieves better R2p and root mean square error of prediction (RMSEP) values. LAR combined with GA-PLS can therefore effectively select informative variables from NIR spectra, reducing the number of wavenumbers used in modeling, improving prediction accuracy, and giving good model interpretability. The method can provide an effective wavelength screening tool for the design of dedicated spectrometers for specific fields.
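[Author affiliations]: School of Electronic Engineering and Automation, Guilin University of Electronic Technology; School of Automation, Beijing University of Posts and Telecommunications
[Funding]: Supported by the National Natural Science Foundation of China (21365008, 61562013), the Natural Science Foundation of Guangxi Zhuang Autonomous Region (2013GXNSFBA019279), and the Graduate Innovation Project of Guilin University of Electronic Technology (GDYCSZ201474, GDYCSZ201478)
[Classification number]: O657.33; R737.31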
Article ID: 1897335
Article link: https://www.wllwen.com/yixuelunwen/fuchankeerkelunwen/1897335.html