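Prediction of Starch Moisture Content Based on Near-Infrared Spectroscopy
Topic: near-infrared spectroscopy + Monte Carlo cross-validation; Source: master's thesis, North China University of Technology (北方工业大学), 2017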
[Abstract]: The quality of starch is closely tied to its moisture content. Conventional measurement techniques can determine moisture content, but they tend to be time-consuming, inefficient, and inaccurate, and they destroy the sample under test. A rapid, convenient, accurate, and non-destructive measurement method is therefore of considerable value. Taking starch as the research object, this thesis uses near-infrared spectroscopy together with chemometric analysis to study the measurement of starch moisture content. The collected starch samples first go through several preparatory steps (spectral preprocessing, selection of the optimal number of principal components, and identification and removal of outlier samples); a quantitative calibration model is then built, and the moisture content of starch is finally predicted.

The prediction model is built with partial least squares regression (PLSR). Because the data preparation carried out before modeling strongly affects the stability and predictive ability of the model, the thesis concentrates on pre-modeling data processing: choosing the spectral preprocessing method, selecting the optimal number of principal components, and identifying and verifying outlier samples in the calibration set. By comparing model evaluation metrics under different preprocessing methods, the standard normal variate (SNV) transformation was chosen as the preprocessing method suited to the starch samples. The optimal number of principal components used in the model was selected by the cross-validation mean squared error.

For outlier identification, a detection method based on Monte Carlo cross-validation (MCCV) was designed. Following the probabilistic and statistical idea behind Monte Carlo methods, a large number of PLSR models are built, the mean and variance of the prediction residuals of every calibration-set sample are obtained, and a mean-variance plot is drawn; samples lying in the high-mean or high-standard-deviation region are provisionally flagged as suspect. The flagged samples are then verified with three comparison experiments combined with a t-test. The three experiments are: modeling with the suspect samples retained, modeling with the suspect samples removed, and modeling with the same number of randomly chosen samples removed. The evaluation metrics of the model in each experiment are recorded, and the t-test is used to judge whether the differences between the metrics are significant. If the difference is significant, the suspect samples are taken to be outliers; otherwise they are taken not to be outliers.

The MCCV method was applied to the 100 starch samples used for modeling and successfully screened out the outliers among them, which demonstrates the reliability of the method. A moisture-content prediction model was then built from the preprocessed starch samples and validated on the test-set samples; analysis of the predicted and actual moisture values demonstrated the feasibility of the model. Near-infrared spectroscopy was therefore applied to the design of a starch moisture-content prediction system. The software was implemented with Matlab and SPSS: Matlab reads the data from Excel sheets, performs the analysis and modeling, and displays the simulation results, while SPSS is used for statistical analysis of the data.
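The MCCV outlier-screening step described in the abstract can be sketched roughly as follows. This is a minimal Python illustration, not the thesis's Matlab implementation. The function names (`snv`, `pls1_fit`, `mccv_residual_stats`), the use of absolute residuals, the 80/20 random split, the number of runs, and the synthetic data are all assumptions made for the example; the abstract does not specify any of them.

```python
import numpy as np


def snv(spectra):
    """Standard normal variate: center and scale each spectrum (row) on its own."""
    spectra = np.asarray(spectra, dtype=float)
    mean = spectra.mean(axis=1, keepdims=True)
    std = spectra.std(axis=1, ddof=1, keepdims=True)
    return (spectra - mean) / std


def pls1_fit(X, y, n_components):
    """Fit a single-response PLS model with NIPALS.

    Returns (coef, x_mean, y_mean) so that a prediction is
    (X_new - x_mean) @ coef + y_mean.
    """
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xk, yk = X - x_mean, y - y_mean
    W, P, q = [], [], []
    for _ in range(n_components):
        w = Xk.T @ yk
        w = w / np.linalg.norm(w)      # weight vector
        t = Xk @ w                     # scores
        tt = t @ t
        p = Xk.T @ t / tt              # X loadings
        c = yk @ t / tt                # y loading
        Xk = Xk - np.outer(t, p)       # deflate X
        yk = yk - c * t                # deflate y
        W.append(w)
        P.append(p)
        q.append(c)
    W, P, q = np.column_stack(W), np.column_stack(P), np.array(q)
    coef = W @ np.linalg.solve(P.T @ W, q)
    return coef, x_mean, y_mean


def mccv_residual_stats(X, y, n_components=3, n_runs=500,
                        train_fraction=0.8, seed=0):
    """Mean and standard deviation of each sample's held-out absolute residual.

    Each run draws a random train/held-out split, fits PLS on the training
    part, and records the residuals of the held-out samples.
    """
    rng = np.random.default_rng(seed)
    n = len(y)
    n_train = int(round(train_fraction * n))
    sums, sq_sums, counts = np.zeros(n), np.zeros(n), np.zeros(n)
    for _ in range(n_runs):
        perm = rng.permutation(n)
        train, held = perm[:n_train], perm[n_train:]
        coef, x_mean, y_mean = pls1_fit(X[train], y[train], n_components)
        resid = np.abs(y[held] - ((X[held] - x_mean) @ coef + y_mean))
        sums[held] += resid
        sq_sums[held] += resid ** 2
        counts[held] += 1
    counts = np.maximum(counts, 1)     # guard against a never-held-out sample
    mean = sums / counts
    std = np.sqrt(np.maximum(sq_sums / counts - mean ** 2, 0.0))
    return mean, std


if __name__ == "__main__":
    # Synthetic stand-in for spectra and moisture values (not the thesis data).
    rng = np.random.default_rng(0)
    latent = rng.normal(size=(40, 3))
    X = latent @ rng.normal(size=(3, 50)) + 0.01 * rng.normal(size=(40, 50))
    y = latent @ np.array([1.0, 0.5, -0.3])
    y[7] += 5.0                        # plant one outlier in the reference values
    mean, std = mccv_residual_stats(snv(X), y)
    print("largest mean residuals at samples:", np.argsort(mean)[-3:][::-1])
```

Plotting `mean` against `std` gives the kind of mean-variance map the abstract refers to; samples far from the main cluster would be the suspects. Whether a suspect is really an outlier would still need the comparison experiments and t-test that the thesis describes.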
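[Degree-granting institution]: North China University of Technology (北方工业大学)
[Degree level]: Master's
[Year degree conferred]: 2017
[Classification number]: TS237;O657.33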
Article ID: 1894654
Article link: https://www.wllwen.com/kejilunwen/huaxue/1894654.html