Analysis of Massive Smart Power Consumption Data Based on an Improved k-means Algorithm
Published: 2019-01-18 15:48
[Abstract]: To address the large data volume and low mining efficiency encountered in smart power consumption data mining, this paper studies the analysis of massive power consumption data based on an improved k-means algorithm under the Map-Reduce model. Taking household users as an example, a data dimension model of household electricity information is established, covering household user ID, house area, number of family members, daily power consumption, peak and valley power consumption, and number of household appliances. The k-means algorithm is improved by keeping its simplicity and fast convergence while overcoming its tendency to fall into local optima. Two factors are considered together: the choice of initial cluster centers and the choice of the number of clusters. The density of data objects serves as the criterion for selecting initial cluster centers, and the inter-cluster distance and the dispersion of objects within each cluster serve as important references for choosing the number of clusters. To improve data processing efficiency, massive household power consumption data are mined in parallel under the Map-Reduce processing model. Experiments on a Hadoop cluster show that the proposed algorithm is stable, efficient, and feasible, and achieves a good speedup ratio.
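The abstract names two ideas that are easy to illustrate together: choosing initial cluster centers by data object density, and running each k-means iteration as a map step (assign each record to its nearest center) followed by a reduce step (average the records per center). The following is a minimal single-machine sketch under stated assumptions, not the paper's implementation: the `radius` parameter, the neighbor-count definition of density, and all function names are hypothetical, the two-feature records are made-up toy values, and selection of the number of clusters is not covered.

```python
import math
from collections import defaultdict


def density_initial_centers(points, k, radius):
    """Pick up to k initial centers, preferring points in dense regions.

    Assumption: density = number of points within `radius` (self included).
    A candidate is accepted only if it lies farther than `radius` from
    every center chosen so far, so the centers do not crowd together.
    """
    density = [sum(1 for q in points if math.dist(p, q) <= radius)
               for p in points]
    order = sorted(range(len(points)), key=lambda i: -density[i])
    centers = []
    for i in order:
        if all(math.dist(points[i], c) > radius for c in centers):
            centers.append(points[i])
        if len(centers) == k:
            break
    return centers  # may hold fewer than k centers if radius is too large


def map_assign(point, centers):
    """Map step: emit (index of nearest center, (point, 1))."""
    idx = min(range(len(centers)),
              key=lambda j: math.dist(point, centers[j]))
    return idx, (point, 1)


def reduce_update(pairs):
    """Reduce step: sum vectors and counts per center, return the means."""
    sums = {}
    counts = defaultdict(int)
    for idx, (point, n) in pairs:
        if idx in sums:
            sums[idx] = [s + x for s, x in zip(sums[idx], point)]
        else:
            sums[idx] = list(point)
        counts[idx] += n
    return {idx: tuple(s / counts[idx] for s in vec)
            for idx, vec in sums.items()}


def kmeans(points, k, radius, max_iter=50, tol=1e-9):
    """Iterate map and reduce until the centers stop moving."""
    centers = density_initial_centers(points, k, radius)
    for _ in range(max_iter):
        pairs = [map_assign(p, centers) for p in points]
        updated = reduce_update(pairs)
        # A center that received no points keeps its previous position.
        new_centers = [updated.get(j, centers[j])
                       for j in range(len(centers))]
        shift = max(math.dist(a, b) for a, b in zip(centers, new_centers))
        centers = new_centers
        if shift < tol:
            break
    return centers


# Toy records with two hypothetical, already-scaled features per household.
points = [(1.0, 1.0), (1.2, 0.8), (0.8, 1.2),
          (8.0, 8.0), (8.2, 7.8), (7.8, 8.2)]
print(kmeans(points, k=2, radius=1.0))
```

In a real Hadoop job the map and reduce functions would run on separate nodes over partitions of the data, and only the per-center sums and counts would cross the network; the list comprehension here stands in for that distribution.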
[Author affiliation]: Chongqing Electric Power Company
[Classification number]: TM769; TP311.13
Article ID: 2410866
Article link: https://www.wllwen.com/kejilunwen/dianlilw/2410866.html