Research on Short-Term Wind Power Prediction Based on Cloud Computing and Machine Learning

Published: 2018-10-26 15:37

[Abstract]: As China's energy structure is adjusted, installed wind power capacity is growing rapidly. Timely and accurate wind power forecasting gives grid operators an important basis for dispatch, reduces wind curtailment, and improves wind power utilization. At the same time, as wind farms become more intelligent, the volume of wind power monitoring data keeps growing, which poses new challenges for the computational performance of traditional wind power prediction models. In recent years, artificial neural networks (ANN), support vector machines (SVM), and their improved variants, all grounded in machine learning theory, have been widely used for short-term wind power prediction. Machine learning algorithms involve many iterative computations, and Spark, a distributed in-memory computing framework from the cloud computing field, handles iterative data processing efficiently and can improve the execution performance of such algorithms. To address the weak generalization, the difficulty of determining model structure and parameters, and the poor interpretability of existing short-term wind power prediction models, this thesis combines the random forest regression algorithm, the M5P model tree, the differential evolution algorithm, and selective ensemble methods to propose a short-term wind power prediction method based on an improved random forest regression algorithm, and parallelizes the algorithm on the Spark cloud computing platform. The main research work is as follows:

(1) The traditional random forest regression algorithm uses the classification and regression tree (CART) as its base decision tree. CART has relatively low prediction accuracy, cannot produce a continuous output, and cannot predict values outside the range of the training data. This thesis therefore uses the M5P model tree as the base decision tree and builds a multiple linear regression model at each leaf node, which improves the prediction accuracy of the base decision trees.

(2) Because a random forest contains some base decision trees with poor predictive performance and low diversity, this thesis proposes an improved differential evolution algorithm and applies it to the selective ensemble of the forest's base decision trees. A best-performing subset of the base decision trees is selected to form a new random forest, and the final prediction is obtained by a weighted combination of their outputs.

(3) To address the high computational complexity of the random forest algorithm, the thesis analyzes the parallelism of the random forest and differential evolution algorithms, studies the cloud computing architecture, and uses the Spark distributed in-memory computing framework to parallelize the prediction algorithm described above, which improves its execution performance.

(4) Using wind power monitoring data from a region of Inner Mongolia as a case study, the proposed method is compared with existing short-term wind power prediction algorithms and with the traditional random forest regression algorithm. In addition, a cloud computing platform was built on laboratory servers with Cloudera's CDH5 distribution to test the parallel performance of the proposed algorithm.

The experimental results show that the proposed method achieves high prediction accuracy, generalization performance, and interpretability, along with good parallel performance.
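The selective ensemble idea in item (2) can be illustrated with a small sketch. The abstract does not say how the thesis encodes a tree subset, how it modifies differential evolution, or how it computes the weights, so everything below is an assumption made for illustration. The sketch uses the classic DE/rand/1/bin scheme (not the thesis's improved variant), gives each tree a weight in [0, 1], treats a tree as selected when its weight exceeds a hypothetical threshold of 0.5, and scores a candidate by RMSE on synthetic validation data. The names `de_select`, `ensemble_predict`, and `fitness` are invented here.

```python
import math
import random


def rmse(pred, target):
    return math.sqrt(sum((p - t) ** 2 for p, t in zip(pred, target)) / len(target))


def ensemble_predict(weights, tree_preds, threshold=0.5):
    """Weighted average of the trees whose weight exceeds `threshold`.

    tree_preds[i][k] is tree i's prediction for sample k.
    Returns None when no tree is selected.
    """
    chosen = [(w, p) for w, p in zip(weights, tree_preds) if w > threshold]
    if not chosen:
        return None
    total = sum(w for w, _ in chosen)
    n_samples = len(tree_preds[0])
    return [sum(w * p[k] for w, p in chosen) / total for k in range(n_samples)]


def fitness(weights, tree_preds, target):
    """Validation RMSE of the selected, weighted sub-forest (lower is better)."""
    pred = ensemble_predict(weights, tree_preds)
    return math.inf if pred is None else rmse(pred, target)


def de_select(tree_preds, target, pop_size=20, generations=60, f=0.5, cr=0.9, seed=0):
    """Classic DE/rand/1/bin over per-tree weights (assumed encoding)."""
    rng = random.Random(seed)
    n = len(tree_preds)
    pop = [[rng.random() for _ in range(n)] for _ in range(pop_size - 1)]
    # Seed one individual with the full, equally weighted forest so the
    # result is never worse than that baseline on the validation data.
    pop.append([1.0] * n)
    scores = [fitness(ind, tree_preds, target) for ind in pop]
    for _ in range(generations):
        for i in range(pop_size):
            a, b, c = rng.sample([j for j in range(pop_size) if j != i], 3)
            j_rand = rng.randrange(n)  # guarantees at least one mutated gene
            trial = list(pop[i])
            for j in range(n):
                if j == j_rand or rng.random() < cr:
                    v = pop[a][j] + f * (pop[b][j] - pop[c][j])
                    trial[j] = min(1.0, max(0.0, v))  # clip to [0, 1]
            s = fitness(trial, tree_preds, target)
            if s <= scores[i]:  # greedy replacement, applied immediately
                pop[i], scores[i] = trial, s
    best = min(range(pop_size), key=scores.__getitem__)
    return pop[best], scores[best]


# Synthetic stand-in for per-tree validation predictions (not wind data):
# six accurate "trees" and four biased, noisy ones.
_rng = random.Random(1)
target = [math.sin(k / 5) for k in range(40)]
tree_preds = [[t + _rng.gauss(0, 0.05) for t in target] for _ in range(6)]
tree_preds += [[t + 0.5 + _rng.gauss(0, 0.2) for t in target] for _ in range(4)]

baseline = rmse(ensemble_predict([1.0] * len(tree_preds), tree_preds), target)
weights, score = de_select(tree_preds, target)
selected = [i for i, w in enumerate(weights) if w > 0.5]
print("full forest RMSE:", baseline)
print("selected trees:", selected, "RMSE:", score)
```

Because replacement is greedy and the equally weighted full forest is part of the initial population, the returned subset cannot score worse than the full forest on the validation set. Whether the gain carries over to unseen data would have to be checked on a separate test set.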
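[Degree-granting institution]: North China Electric Power University
[Degree level]: Master's
[Year degree awarded]: 2017
[Classification number]: TM614;TP3;TP181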

[References]