Research on a Density Peak Clustering Algorithm Based on the Normal Distribution
[Abstract]: Clustering is an important class of machine learning algorithms that divides a data set into several categories according to similarity. Cluster analysis is widely used in machine learning, pattern recognition, bioinformatics and image processing. In 2014, Alex Rodriguez et al. proposed a new density-based algorithm in Science: density peak clustering (clustering by fast search and find of density peaks, DPC). The algorithm uses two quantities to find potential cluster centers: the density of each data point, and the distance from each data point to the nearest point of higher density. Density peak clustering is simple and clear, obtains the clustering result in a single step, and gives good clustering results. However, during clustering a person has to analyze the decision graph and select the potential cluster centers by hand, which reduces the efficiency of the algorithm. To make the clustering automatic, this thesis uses the product Z of density and distance as a new judgement index and selects cluster centers by a probabilistic and statistical method, based on the characteristics of each point in the decision graph. Because only the potential cluster centers have both a high density and a large distance, their Z values are much larger than those of non-center points. Assuming that Z follows a normal distribution, an upper bound can be determined statistically, and every point whose Z value lies above that bound is automatically treated as a cluster center. The experimental results show that statistical methods such as the normal distribution can correctly identify the potential cluster centers, and that the centers selected are similar to those chosen by manual analysis of the decision graph.
Compared with other well-performing clustering algorithms, the density peak clustering algorithm based on the normal distribution handles data sets of different shapes better and obtains better clustering results.
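The selection rule described in the abstract can be sketched in a few lines. This is a minimal illustration, not the thesis's implementation: the abstract does not state how the density is computed or how the upper bound is derived, so the Gaussian kernel density, the cutoff `dc`, the bound `mean + k * std` with `k = 3`, and the function name `find_centers_by_z` are all assumptions made for this example.

```python
import numpy as np

def find_centers_by_z(points, dc, k=3.0):
    """Pick cluster centers as the points whose Z = rho * delta is an
    upper outlier. Kernel, dc, and the mean + k*std bound are assumptions."""
    points = np.asarray(points, dtype=float)
    # Pairwise Euclidean distances between all points.
    diff = points[:, None, :] - points[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    # rho: Gaussian-kernel local density (assumed), excluding the point itself.
    rho = np.exp(-(dist / dc) ** 2).sum(axis=1) - 1.0
    # delta: distance to the nearest point of strictly higher density;
    # a point with no denser neighbor gets its largest distance instead.
    delta = np.empty(len(points))
    for i in range(len(points)):
        higher = rho > rho[i]
        delta[i] = dist[i, higher].min() if higher.any() else dist[i].max()
    # Judgement index: product of density and distance.
    z = rho * delta
    # Upper bound under the normality assumption (k is a free parameter here).
    upper = z.mean() + k * z.std()
    return np.flatnonzero(z > upper), z, upper

# Two well-separated 5x5 grids; the middle point of each grid is densest.
grid_a = [(x, y) for x in range(5) for y in range(5)]
grid_b = [(100 + x, y) for x in range(5) for y in range(5)]
centers, z, upper = find_centers_by_z(grid_a + grid_b, dc=1.0)
print(centers, upper)
```

With a mean-plus-`k`-standard-deviations bound, the centers themselves inflate the mean and the standard deviation, so the rule only works when centers are a small fraction of the points. A more robust estimate of location and scale may be preferable in practice.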
[Degree-granting institution]: Zhejiang University of Technology
[Degree level]: Master's
[Year degree conferred]: 2016
[Classification number]: TP311.13