Research on an Improved LMS-KNN Nearest Neighbor Classification Method
[Abstract]: As one of the classical machine learning algorithms, nearest neighbor classification requires no parameter estimation, is easy to implement, and is well suited to multi-class problems. In recent years it has been widely applied in advertising, chatbots, network security, medical care, marketing planning, and other fields. Among its variants, nearest neighbor classification based on local mean and class mean (LMS-KNN) is an improvement on K-nearest neighbor (KNN) classification, which is sensitive to outliers and does not exploit the global information of the samples. Although LMS-KNN improves classification accuracy and efficiency, it still has drawbacks: class imbalance in the data degrades its accuracy, and the algorithm involves setting many parameters, such as the neighborhood size K, the weight values, and the distance measure. To further improve the classification accuracy of LMS-KNN, this thesis carries out the following work. 1) Several commonly used nearest neighbor classification methods, including the local mean and class mean nearest neighbor algorithms, are summarized and analyzed; their principles, advantages, and disadvantages are compared, and the optimization algorithms used in this thesis are briefly introduced. 2) To counter the effect of imbalanced data on LMS-KNN accuracy, an iterative nearest neighbor oversampling algorithm is used to preprocess the data; the resulting approximately balanced data set is then classified with a semi-supervised LMS-KNN algorithm.
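The abstract describes LMS-KNN only at a high level, so the following is a minimal sketch of the usual formulation: for each class, the query is compared against both the local mean of its k nearest same-class neighbors and the overall class mean, with a weight `w` blending the two distances. The Euclidean metric and this particular weighting scheme are assumptions, not details taken from the thesis.

```python
import numpy as np

def lms_knn_predict(X_train, y_train, x, k=3, w=0.7):
    """Sketch of LMS-KNN: assign x to the class minimizing a weighted sum of
    the distance to the class's local mean (of k nearest same-class neighbors)
    and the distance to the global class mean. The blend weight w is assumed."""
    best_class, best_score = None, np.inf
    for c in np.unique(y_train):
        Xc = X_train[y_train == c]
        # distances from the query to every training sample of class c
        d = np.linalg.norm(Xc - x, axis=1)
        # local mean vector of the k nearest neighbors within the class
        local_mean = Xc[np.argsort(d)[:min(k, len(Xc))]].mean(axis=0)
        class_mean = Xc.mean(axis=0)  # global class-mean vector
        score = (w * np.linalg.norm(x - local_mean)
                 + (1 - w) * np.linalg.norm(x - class_mean))
        if score < best_score:
            best_class, best_score = c, score
    return best_class
```

Because the class mean is computed over all samples of the class, an outlying neighbor pulls the local mean less than it would pull a plain KNN vote, which is the robustness motivation stated in the abstract.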
3) Cross-validation and a traditional iterative algorithm are used to determine the parameters of the LMS-KNN classifier. The cross-validation error of the classifier is first modeled; the weight of the class-mean vector is then expressed as a formula based on objective decision information, and the weight is selected by a uniform iterative method with step-size optimization. 4) Because the genetic algorithm can solve nonlinear problems without depending on the specific problem domain, it is used to optimize the parameter determination of LMS-KNN, improving the classification accuracy and efficiency of the traditional algorithm while balancing subjective and objective decision rules. A local mean and class mean nearest neighbor classification algorithm based on the genetic algorithm is proposed: candidate class-mean weights form the initial population, the classification error serves as the evaluation function, and the best class-mean weight is selected through genetic iteration. Compared with the traditional KNN, LM-KNN (a local-mean-based nonparametric classifier), and LMS-KNN algorithms, experimental results on UCI data sets show that the method effectively searches for appropriate feature weights and achieves better classification accuracy.
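The genetic search described above (weights as the population, classification error as the evaluation function) can be sketched as follows. This is a toy illustration, not the thesis's implementation: the leave-one-out fitness, truncation selection, averaging crossover, and Gaussian mutation are all assumptions chosen to keep the example short.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed for a reproducible sketch

def lms_knn_loo_error(X, y, k, w):
    """Leave-one-out error of a weighted local-mean/class-mean classifier,
    used here as the GA fitness. The exact classifier form is an assumption."""
    wrong, n = 0, len(X)
    for i in range(n):
        mask = np.arange(n) != i
        Xt, yt = X[mask], y[mask]
        best, best_s = None, np.inf
        for c in np.unique(yt):
            Xc = Xt[yt == c]
            d = np.linalg.norm(Xc - X[i], axis=1)
            lm = Xc[np.argsort(d)[:min(k, len(Xc))]].mean(axis=0)
            s = (w * np.linalg.norm(X[i] - lm)
                 + (1 - w) * np.linalg.norm(X[i] - Xc.mean(axis=0)))
            if s < best_s:
                best, best_s = c, s
        wrong += best != y[i]
    return wrong / n

def ga_select_weight(X, y, k=3, pop_size=10, generations=15, mut_sigma=0.1):
    """Genetic search for the class-mean weight w in [0, 1]:
    candidate weights are the population, LOO error is the fitness."""
    pop = rng.uniform(0, 1, pop_size)
    for _ in range(generations):
        fitness = np.array([lms_knn_loo_error(X, y, k, w) for w in pop])
        # truncation selection: keep the better half as parents
        parents = pop[np.argsort(fitness)[:pop_size // 2]]
        # crossover (parent averaging) plus Gaussian mutation
        children = [(rng.choice(parents) + rng.choice(parents)) / 2
                    + rng.normal(0, mut_sigma)
                    for _ in range(pop_size - len(parents))]
        pop = np.clip(np.concatenate([parents, children]), 0.0, 1.0)
    fitness = np.array([lms_knn_loo_error(X, y, k, w) for w in pop])
    return pop[np.argmin(fitness)]
```

Since the fitness only requires evaluating the classifier, this search needs no gradient and no problem-specific structure, which matches the abstract's stated reason for choosing a genetic algorithm.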
【Degree-granting institution】: University of Electronic Science and Technology of China
【Degree level】: Master's
【Year of degree conferral】: 2017
【CLC number】: TP181