Article keywords: lag correlation mining method for big data stream. Compiled and published by 笔耕文化传播.
Lag Correlation Mining Method for Big Data Stream Based on Series Layered Sliding Window
Ren Yonggong, Qian Haizhen, Lang Hongyu (School of Computer and Information Technology, Liaoning Normal University, Dalian 116029, Liaoning, China)
Abstract: Existing approaches to mining big data stream sequences cannot discover the lag correlation between sequences quickly. To address this, the paper proposes a lag correlation mining method for big data streams based on series layered sliding windows. The method first stratifies the sequence into layers following an increasing series and computes the coverage g of the sliding window on each layer. It then computes the sequence parameter values over the sliding window of each layer. Finally, from the parameter values of the sliding windows on all layers, it computes the lag correlation coefficient of the sequences, which determines their lag correlation. Using the Nyquist sampling theorem, the paper proves that computing only log2(n) points for n sequences of a big data stream is enough to determine the lag correlation with high precision. This greatly reduces computation time, and the more sequences there are, the smaller the computation error and the higher the efficiency. Experimental results show that the method substantially reduces running time and improves efficiency while preserving precision. It works especially well on big data stream sequences and has broad application prospects.
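To make the idea of a lag correlation coefficient concrete, the following is a minimal Python sketch. It is not the paper's algorithm: the abstract does not give the layered window construction, the coverage g, or the per-layer parameter values, so none of those are reproduced. The sketch only assumes a plain Pearson correlation between one sequence and a delayed copy of another, probed at geometrically spaced lags (0, 1, 2, 4, ...) so that the number of probes grows logarithmically with the maximum lag. All function names and the probing schedule are hypothetical.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0  # a constant sequence has no defined correlation
    return cov / math.sqrt(vx * vy)


def lag_correlation(x, y, lag):
    """Correlation between x[t] and y[t + lag], for lag >= 0."""
    if lag == 0:
        return pearson(x, y)
    return pearson(x[:-lag], y[lag:])


def geometric_lags(max_lag):
    """Hypothetical probing schedule: 0, 1, 2, 4, 8, ... up to max_lag."""
    lags = [0]
    step = 1
    while step <= max_lag:
        lags.append(step)
        step *= 2
    return lags


def best_lag(x, y, max_lag):
    """Return the probed lag with the largest absolute correlation, plus all scores."""
    scores = {lag: lag_correlation(x, y, lag) for lag in geometric_lags(max_lag)}
    return max(scores, key=lambda lag: abs(scores[lag])), scores


# Synthetic example: y is x delayed by 8 steps (assumed data, not from the paper).
rng = random.Random(0)
x = [rng.gauss(0.0, 1.0) for _ in range(512)]
y = [0.0] * 8 + x[:-8]

lag, scores = best_lag(x, y, 64)
print(lag, round(scores[lag], 6))
```

In this naive version a true delay that falls between two probed lags is only approximated by its neighbours; how the paper's layered sliding windows handle that case is not stated in the abstract.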
Keywords: Nyquist sampling theorem; big data stream; sliding window; lag correlation
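Funding: Natural Science Foundation of Liaoning Province (201202119); Liaoning Province Science Plan Project (2013405003); Dalian Science and Technology Plan Project (2013A16GX116).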
Article ID: 180366
Article link: https://www.wllwen.com/shoufeilunwen/xixikjs/180366.html