
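Application of a Feature Selection Algorithm Based on Binary Search Combined with Pruned Random Forest to Near-Infrared Spectral Classification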

Published: 2018-12-23 09:26
[Abstract]: To address the cumbersome computation, high memory overhead, and low classification accuracy of random forest (RF) when selecting features in high-dimensional spaces, a feature selection algorithm (BSRFP) is proposed that combines binary search (BS) with a pruned random forest (RFP). The algorithm first computes feature importance scores from the Gini index and deletes the features with low importance scores. It then applies the BS algorithm together with a pruning technique based on the diversity of the base classifiers to obtain the optimal feature subset and the classifier with the highest classification accuracy. To verify the effectiveness of the algorithm, a cigarette quality recognition model was built and compared with other methods. The results show that the BS algorithm simplifies the feature search process and the RFP algorithm reduces the size of the RF; the classification accuracy of the RFP algorithm reaches 96.47%; and the features selected by the BSRFP algorithm are more relevant, giving higher accuracy in cigarette quality recognition.
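The two feature-selection steps named in the abstract (rank features by a Gini-based score, then binary-search over how many top-ranked features to keep) can be illustrated with a small, dependency-free sketch. The abstract does not give the search criterion or the pruning rule, so everything below is an assumption: `gini_importance` uses a single threshold split as a stand-in for the forest-wide importance, `smallest_sufficient_k` assumes the score is roughly non-decreasing in the number of features, and `loo_1nn_accuracy` is a hypothetical scorer used in place of the pruned forest's accuracy. The diversity-based pruning of base classifiers is not shown.

```python
from collections import Counter


def gini(labels):
    """Gini impurity 1 - sum(p_c^2) of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def gini_importance(column, labels):
    """Largest impurity decrease from one threshold split on one feature.

    Assumption: a one-split stand-in for the importance a forest would report.
    """
    parent, n, best = gini(labels), len(labels), 0.0
    for t in sorted(set(column))[:-1]:
        left = [y for x, y in zip(column, labels) if x <= t]
        right = [y for x, y in zip(column, labels) if x > t]
        child = (len(left) * gini(left) + len(right) * gini(right)) / n
        best = max(best, parent - child)
    return best


def rank_features(X, y):
    """Feature indices sorted from most to least important."""
    scores = [gini_importance([row[j] for row in X], y) for j in range(len(X[0]))]
    return sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)


def smallest_sufficient_k(ranked, score_fn, tol=0.0):
    """Binary search for the shortest prefix of `ranked` that scores within
    `tol` of the full ranked list.

    Assumption: score_fn is roughly non-decreasing in the prefix length.
    """
    target = score_fn(ranked) - tol
    lo, hi = 1, len(ranked)
    while lo < hi:
        mid = (lo + hi) // 2
        if score_fn(ranked[:mid]) >= target:
            hi = mid          # prefix is good enough, try fewer features
        else:
            lo = mid + 1      # prefix is too short, need more features
    return ranked[:lo]


def loo_1nn_accuracy(X, y, features):
    """Leave-one-out 1-nearest-neighbour accuracy on the chosen features.

    Hypothetical scorer standing in for the pruned forest's accuracy.
    """
    hits = 0
    for i, row in enumerate(X):
        others = (k for k in range(len(X)) if k != i)
        nearest = min(others, key=lambda k: sum((row[j] - X[k][j]) ** 2 for j in features))
        hits += y[nearest] == y[i]
    return hits / len(X)


# Toy data (made up): feature 0 separates the classes, feature 3 is constant.
X = [[1, 5, 1, 3], [2, 1, 2, 3], [3, 4, 6, 3],
     [7, 2, 3, 3], [8, 6, 7, 3], [9, 3, 8, 3]]
y = [0, 0, 0, 1, 1, 1]

ranked = rank_features(X, y)
selected = smallest_sufficient_k(ranked, lambda feats: loo_1nn_accuracy(X, y, feats))
print(ranked)    # [0, 2, 1, 3]
print(selected)  # [0]
```

The binary search needs only about log2(n) scorer evaluations instead of one per candidate subset size, which is presumably the kind of saving the abstract refers to when it says BS simplifies the feature search.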
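[Author affiliations]: College of Information Science and Engineering, Ocean University of China; Technology Center, China Tobacco Yunnan Industrial Co., Ltd.
[Funding]: National Key Technology R&D Program of China (2015BAF12B01); China Tobacco Yunnan Industrial Co., Ltd. project (JSZX2014YL01, 20530001020152000086)
[Classification number]: O433.4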
