A Frequent-Edge-Based Sampling Algorithm for Large-Scale Single Graphs on Spark
[Abstract]: As social networks have grown in popularity, demand for frequent subgraph mining has increased. In the big data era, social networks keep expanding in scale, which makes mining frequent subgraphs increasingly difficult. In practical applications, exact mining of frequent subgraphs is often unnecessary, and sampling can significantly improve mining efficiency while preserving a certain level of accuracy. Most existing sampling algorithms are based on node degree and are not well suited to frequent subgraph mining. This paper proposes DIMSARI (distributed Monte Carlo sampling algorithm based on random jump and graph induction), a sampling algorithm based on frequent edges. Building on a Monte Carlo algorithm, it adds a random jump operation based on frequent edges and then applies graph induction to the sampled result, which further improves accuracy. The unbiasedness of the method is proved theoretically. Experimental results show that frequent subgraph mining on samples produced by DIMSARI is much more accurate than on samples produced by other sampling algorithms, and that the node degrees of subgraphs sampled at different sampling rates maintain a lower normalized mean square error.
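The abstract names three ingredients: a random walk, random jumps driven by frequent edges, and graph induction over the sampled nodes. The following is a minimal single-machine sketch of how those pieces could fit together; it is not the authors' DIMSARI implementation and is not distributed. Everything specific in it is an assumption made for illustration: the function names, counting edge support by the pair of endpoint labels, the `jump_prob` and `max_steps` parameters, and jumping to a uniformly chosen endpoint of a uniformly chosen frequent edge.

```python
import random
from collections import Counter


def frequent_edges(edges, labels, min_support):
    """Keep edges whose endpoint-label pair occurs at least min_support times.

    Assumption: support is counted per unordered (label, label) pair.
    """
    def pattern(u, v):
        return tuple(sorted((labels[u], labels[v])))

    counts = Counter(pattern(u, v) for u, v in edges)
    return [(u, v) for u, v in edges if counts[pattern(u, v)] >= min_support]


def sample_graph(edges, labels, rate, min_support,
                 jump_prob=0.15, max_steps=10000, seed=0):
    """Random walk with jumps to frequent-edge endpoints, then graph induction."""
    rng = random.Random(seed)

    # Undirected adjacency lists built from the edge list.
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)

    # Fall back to all edges if none reaches the support threshold.
    freq = frequent_edges(edges, labels, min_support) or list(edges)
    target = max(1, int(rate * len(adj)))

    current = rng.choice(rng.choice(freq))
    visited = {current}
    for _ in range(max_steps):  # step cap guards against unreachable targets
        if len(visited) >= target:
            break
        if rng.random() < jump_prob:
            # Random jump: restart at an endpoint of a frequent edge.
            current = rng.choice(rng.choice(freq))
        else:
            # Ordinary walk step to a uniformly chosen neighbor.
            current = rng.choice(sorted(adj[current]))
        visited.add(current)

    # Graph induction: keep every original edge between sampled nodes.
    induced = [(u, v) for u, v in edges if u in visited and v in visited]
    return visited, induced


if __name__ == "__main__":
    ring = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5), (5, 0)]
    node_labels = {0: "A", 1: "B", 2: "A", 3: "B", 4: "A", 5: "C"}
    nodes, sub_edges = sample_graph(ring, node_labels, rate=0.5, min_support=3)
    print(sorted(nodes), sub_edges)
```

The induction step is what separates this from a plain random-walk sample: edges between sampled nodes are retained even when the walk never traversed them, which keeps more of the local structure that subgraph mining depends on.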
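[Author affiliation]: College of Information Science and Engineering, Ningbo University
[Funding]: National Natural Science Foundation of China (61572266, 61472194); Zhejiang Provincial Natural Science Foundation (Y16F020003); Ningbo Natural Science Foundation (2017A610114)
[Classification number]: TP301.6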