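Research on the Application of Machine Learning Algorithms in Data Classification
Published: 2018-02-15 07:12
Keywords: leaf classification; support vector machine; particle swarm optimization; principal component analysis; cancer classification; convolutional neural network. Source: North University of China (中北大学), 2017 master's thesis. Document type: degree thesis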
[Abstract]: Many practical problems can be cast as data classification problems within data and information processing, for example weather forecasting, product recommendation, bioinformatics, and network detection, and this kind of processing is studied on the basis of machine learning. With the development of science and technology, the application areas of machine learning algorithms have become very broad. This thesis focuses on two machine learning approaches: the support vector machine (SVM) optimized by particle swarm optimization (PSO), and the convolutional neural network (CNN). It studies the PSO-optimized SVM for prediction in leaf classification and cancer gene classification, and the CNN for image classification. (1) A data preprocessing model is built from the features of various leaves: the data are first normalized, principal component analysis (PCA) is used to extract 3 principal components from 16 features, and a PSO-optimized SVM is then built to classify the leaf data. The experimental results show that, compared with the optimal parameters found by a genetic algorithm and by grid search, the PSO-optimized SVM achieves the highest accuracy, 94.1%, exceeding the other two methods. (2) The PSO-optimized SVM model is applied to cancer gene classification: several different sets of experimental data are selected to classify gene samples from post-surgery cancer patients as recurrent or non-recurrent. Taking the results of the three classification methods together, the PSO-optimized SVM achieves the best classification performance of the three. (3) A CNN is applied to image processing: the filter functions in the convolutional and pooling layers are optimized to improve performance, a CNN with a given structure is then constructed, and the model is used to classify an image dataset, achieving the expected classification results.
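The parameter search described in item (1) can be illustrated with a minimal sketch of a basic global-best PSO. This is not the thesis's implementation: the function names `pso_minimize` and `toy_cv_error`, the swarm settings (inertia 0.7, acceleration coefficients 1.5), and the search bounds are all assumptions chosen for illustration. The objective here is a hypothetical stand-in for an SVM's cross-validation error over the penalty parameter C and the RBF kernel width gamma, so that the block runs with the standard library alone.

```python
import random


def pso_minimize(objective, bounds, n_particles=20, n_iters=100,
                 inertia=0.7, c1=1.5, c2=1.5, seed=0):
    """Minimize objective(position) inside box bounds with a basic global-best PSO."""
    rng = random.Random(seed)
    dim = len(bounds)
    # Random initial positions inside the box, zero initial velocities.
    pos = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    # Each particle remembers its own best position; the swarm shares one global best.
    pbest = [p[:] for p in pos]
    pbest_val = [objective(p) for p in pos]
    g = min(range(n_particles), key=pbest_val.__getitem__)
    gbest, gbest_val = pbest[g][:], pbest_val[g]

    for _ in range(n_iters):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # Velocity = inertia + pull toward personal best + pull toward global best.
                vel[i][d] = (inertia * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                lo, hi = bounds[d]
                # Move the particle and clamp it to the search box.
                pos[i][d] = min(max(pos[i][d] + vel[i][d], lo), hi)
            val = objective(pos[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i][:], val
                if val < gbest_val:
                    gbest, gbest_val = pos[i][:], val
    return gbest, gbest_val


def toy_cv_error(params):
    """Hypothetical stand-in for SVM cross-validation error (no real data).

    params = [log10(C), log10(gamma)]; the minimum is placed at C = 10, gamma = 0.1.
    """
    log_c, log_gamma = params
    return 0.05 + 0.02 * (log_c - 1.0) ** 2 + 0.03 * (log_gamma + 1.0) ** 2


# Search log10(C) in [-2, 3] and log10(gamma) in [-4, 1] (assumed ranges).
best, err = pso_minimize(toy_cv_error, bounds=[(-2.0, 3.0), (-4.0, 1.0)])
best_c, best_gamma = 10 ** best[0], 10 ** best[1]
print(f"C ~ {best_c:.3f}, gamma ~ {best_gamma:.4f}, error ~ {err:.4f}")
```

In a real experiment the toy objective would presumably be replaced by one minus the cross-validated accuracy of an SVM trained on the normalized, PCA-reduced features, for example with scikit-learn's `SVC` and `cross_val_score`. Searching in log space is a common choice because useful values of C and gamma span several orders of magnitude.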
[Degree-granting institution]: North University of China (中北大学)
[Degree level]: Master's
[Year degree granted]: 2017
[Classification number]: TP18; TP391.41
Article ID: 1512717
Article link: https://www.wllwen.com/kejilunwen/ruanjiangongchenglunwen/1512717.html