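Research on Three Key Problems of Cluster Analysis Based on Particle Swarm Optimization
Published: 2018-06-19 16:57
Topics: cluster analysis + particle swarm optimization; Source: Nanchang University, 2017 master's thesis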
[Abstract]: Particle swarm optimization (PSO) is now widely used in pattern recognition, spam detection, data clustering, robotics, recommender systems, and many other fields. In different application settings, however, traditional PSO still has open problems in validity verification, velocity and position update rules, and convergence performance that urgently need deeper study. This thesis therefore addresses three key problems: clustering validity indices, clustering algorithms, and the application scenario of community detection in complex networks. It proposes a clustering validity index that dynamically terminates the clustering process, and focuses on PSO-based cluster analysis algorithms and a community detection algorithm for complex networks. The main work is as follows:
1. Building on several clustering measures proposed in this thesis, a validity evaluation method that dynamically determines the optimal number of clusters is proposed. The method uses a validity index introduced here, the ratio of the difference of sums of squared distances (RDSED). RDSED is computed from the previously proposed difference of sums of squared distances (DSED) and is used to dynamically terminate the search for the optimal number of clusters. Experiments on synthetic and real data sets show that the proposed RDSED index and method can effectively evaluate clustering partitions and determine the optimal number of clusters.
2. A hybrid clustering algorithm based on PSO and K-means, KIPSO, is proposed. Unlike traditional particle encoding schemes, KIPSO uses a reduced particle encoding, preprocesses the data, and takes the average distance between data objects and cluster centers as the fitness function. By combining PSO and K-means, the algorithm has both the strong global search ability of PSO and the local search ability of K-means. Experiments on synthetic and real data sets show that the method is more accurate and converges better.
3. A discrete PSO algorithm for community detection in complex networks, based on evolutionary strategies, is proposed. The algorithm redefines particle velocity, position, and their update rules, and adopts two evolutionary strategies to avoid becoming trapped in local optima. Experiments on GN benchmark networks and real network data sets show that the algorithm can effectively discover network communities, with stable community partition quality and global convergence.
Contributions: the thesis measures the dissimilarity among indices in clustering validity verification in terms of separation and compactness, and dynamically terminates the verification process; it optimizes the traditional PSO-based clustering algorithm; and it defines a PSO-based community detection algorithm for complex networks in a new discrete application scenario. Multiple sets of experiments verify that the proposed methods and algorithms are effective and feasible.
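The abstract names two ingredients of PSO-based clustering: a fitness function built on the average distance between data objects and cluster centers, and the particle velocity and position updates. The following is a minimal sketch of both under stated assumptions, not the thesis's implementation. It assumes the conventional encoding in which a particle is a flat list of k cluster centers (the reduced encoding used by KIPSO is not specified in the abstract), assigns each object to its nearest center, and uses the canonical inertia-weight PSO update. The function names and the default values of `w`, `c1`, and `c2` are illustrative assumptions.

```python
import math
import random


def decode(position, dim):
    """Split a flat particle position into cluster centers of length `dim`.

    Assumption: conventional encoding (k centers concatenated),
    not the reduced encoding mentioned in the abstract.
    """
    return [tuple(position[i:i + dim]) for i in range(0, len(position), dim)]


def mean_distance_fitness(points, centers):
    """Average Euclidean distance from each point to its nearest center.

    Lower values indicate more compact clusters.
    """
    total = 0.0
    for p in points:
        total += min(math.dist(p, c) for c in centers)
    return total / len(points)


def pso_step(position, velocity, personal_best, global_best,
             w=0.7, c1=1.5, c2=1.5, rng=random):
    """One canonical PSO update for a single particle.

    w, c1, c2 are illustrative defaults, not values from the thesis.
    Returns the new position and the new velocity as lists.
    """
    new_position, new_velocity = [], []
    for x, v, pb, gb in zip(position, velocity, personal_best, global_best):
        r1, r2 = rng.random(), rng.random()
        # inertia term + cognitive pull (own best) + social pull (swarm best)
        v_next = w * v + c1 * r1 * (pb - x) + c2 * r2 * (gb - x)
        new_velocity.append(v_next)
        new_position.append(x + v_next)
    return new_position, new_velocity


if __name__ == "__main__":
    # Toy data: three 2-D points and one particle encoding two centers.
    points = [(0.0, 0.0), (0.0, 2.0), (10.0, 0.0)]
    particle = [0.0, 1.0, 10.0, 0.0]
    print(mean_distance_fitness(points, decode(particle, dim=2)))
```

In a hybrid scheme of the kind the abstract describes, a K-means refinement step would typically be applied to the decoded centers between PSO updates; how KIPSO interleaves the two is not stated in the abstract, so it is left out here.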
[Degree-granting institution]: Nanchang University
[Degree level]: Master's
[Year degree granted]: 2017
[Classification number]: TP18;TP311.13
Article ID: 2040556
Article link: https://www.wllwen.com/kejilunwen/zidonghuakongzhilunwen/2040556.html