
Research on Association Rule Mining Algorithms Based on Particle Swarm Optimization

Published: 2018-08-02 12:44
[Abstract]: Association rule analysis is one of the most important branches of data mining; its main purpose is to uncover relationships hidden in transaction databases. With the spread of big data, the problems of traditional association rule mining algorithms have become increasingly apparent, and their mining efficiency has declined. Particle swarm optimization (PSO), a representative swarm intelligence optimization algorithm, has been widely applied in many fields in recent years, including association rule analysis. This thesis combines PSO with association rule mining to propose improvements to association rule mining algorithms. So that the mined rules can reflect changes over time, a PSO-optimized grey model is used to predict the trends of the support vector and confidence vector in the definition of dynamic association rules, allowing decision makers to follow developments promptly and giving them a reference for their decisions.

To ground the study, a large body of literature was reviewed and the state of research in China and abroad was analyzed; the open problems identified there define the main content of this thesis. The thesis first introduces the basic concepts and principles of association rules, their classification, and the classical and improved algorithms, giving a preliminary view of the purpose and significance of association rule mining. It then analyzes the definition and algorithmic ideas of dynamic association rules and how they differ from ordinary association rules. Finally, it analyzes the principle and steps of PSO and compares it with the genetic algorithm, which provides the basis for combining PSO with association rule algorithms.

Because the mining efficiency of the classical Apriori algorithm drops on large databases, an association rule mining algorithm based on second-order particle swarm optimization is proposed. The algorithm has four steps. First, the Partition algorithm divides the whole database into non-overlapping partitions, each small enough to fit in memory. Second, the Apriori algorithm extracts association rules from each partition. Third, second-order PSO is applied to the mined rules to optimize them and to extract valuable rules that are easily overlooked. Finally, the rules from all partitions are merged globally and their actual support and confidence are computed. The algorithm not only reduces the number of database scans but also recovers rules that would be missed when a single evaluation criterion is used. It was implemented in Matlab and compared with a number of similar algorithms on different data sets; the experiments show that it is feasible and effective.

For analyzing how rules change over time in dynamic association rule mining, an improved PSO-optimized grey model is proposed. A secondary search mechanism is introduced into PSO to improve its convergence, and the resulting algorithm is used to optimize the grey model's background values at different time points, which improves the model's prediction accuracy. The algorithm was implemented in Matlab and its prediction accuracy was compared with that of other algorithms; the results show that the accuracy reaches the "good" grade and meets normal forecasting needs.

The comparison experiments demonstrate the feasibility and effectiveness of the improved algorithms, but experiments on a practical application were still needed. Migrant population census data were therefore selected for association rule analysis. First, taking inter-provincial migration as the target attribute, the characteristics of inter-provincial migrants, such as age, ethnicity, household registration type, and education level, were analyzed. Then association rule mining was applied to the reasons for inter-provincial migration to obtain their characteristics. The two analyses yield constructive suggestions for the relevant departments on strengthening population management, and the mining results demonstrate the practical value of the improved algorithms and support the rigor of the research.
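
The partition-then-merge part of the four-step procedure described in the abstract can be illustrated with a minimal Python sketch. It is not the thesis's Matlab implementation: it follows the classical Partition scheme (mine each partition locally, pool the local frequent itemsets as global candidates, then make one full pass to obtain the actual support and confidence), it merges itemsets rather than finished rules, and it omits the second-order PSO optimization step entirely. The function names `local_frequent_itemsets` and `partition_mine`, the parameters, and the sample transactions are assumptions made for illustration.

```python
from itertools import combinations


def local_frequent_itemsets(transactions, min_support):
    """Apriori-style level-wise search inside one partition (illustrative)."""
    n = len(transactions)
    # Level 1 candidates: every single item that occurs in the partition.
    candidates = {frozenset([item]) for t in transactions for item in t}
    frequent = set()
    k = 1
    while candidates:
        # Keep candidates whose relative support in this partition is high enough.
        level = {c for c in candidates
                 if sum(c <= t for t in transactions) / n >= min_support}
        frequent |= level
        k += 1
        # Join step: unions of two frequent (k-1)-itemsets that have exactly k items.
        candidates = {a | b for a in level for b in level if len(a | b) == k}
        # Prune step: every (k-1)-subset of a candidate must itself be frequent.
        candidates = {c for c in candidates
                      if all(frozenset(s) in level
                             for s in combinations(c, k - 1))}
    return frequent


def partition_mine(transactions, n_partitions, min_support, min_confidence):
    """Return rules as (antecedent, consequent, support, confidence) tuples."""
    transactions = [frozenset(t) for t in transactions]
    n = len(transactions)
    size = max(1, -(-n // n_partitions))  # ceiling division, at least 1
    # Non-overlapping partitions; in practice each would be sized to fit in memory.
    parts = [transactions[i:i + size] for i in range(0, n, size)]

    # Phase 1: the union of local frequent itemsets forms the global candidate set.
    candidates = set()
    for part in parts:
        candidates |= local_frequent_itemsets(part, min_support)

    # Phase 2: one full scan gives the actual (global) count of every candidate.
    count = {c: sum(c <= t for t in transactions) for c in candidates}
    frequent = {c: k for c, k in count.items() if k / n >= min_support}

    # Rule generation from globally frequent itemsets.
    rules = []
    for itemset, k in frequent.items():
        for r in range(1, len(itemset)):
            for antecedent in map(frozenset, combinations(itemset, r)):
                # Subsets of a frequent itemset are frequent, so the count exists.
                confidence = k / frequent[antecedent]
                if confidence >= min_confidence:
                    rules.append((antecedent, itemset - antecedent,
                                  k / n, confidence))
    return rules


if __name__ == "__main__":
    # Hypothetical market-basket data, used only to exercise the sketch.
    demo = [
        {"bread", "milk"},
        {"bread", "diaper", "beer", "eggs"},
        {"milk", "diaper", "beer", "cola"},
        {"bread", "milk", "diaper", "beer"},
        {"bread", "milk", "diaper", "cola"},
    ]
    for rule in partition_mine(demo, n_partitions=2,
                               min_support=0.6, min_confidence=0.7):
        print(rule)
```

The sketch relies on the property that an itemset that is frequent over the whole database must be frequent in at least one partition when the same relative threshold is used, so the pooled local results cannot miss a globally frequent itemset. Itemsets that are frequent only locally are discarded in the second phase, which is why the actual support has to be recomputed after merging.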
[Degree-granting institution]: Lanzhou Jiaotong University
[Degree level]: Master's
[Year degree awarded]: 2017
[Classification number]: TP311.13; TP18





