Research on Partitioning Strategies and Algorithms for GML Spatial Data in Distributed Parallel Computing Environments
[Abstract]: GML is widely used in many fields because it is simple, semi-structured, interoperable, open, versatile, and flexible. As geographic information applications mature, the problems they address grow larger and more complex. Optimizing traditional GIS-based spatial data storage and spatial analysis algorithms can no longer meet the demands of massive data storage and spatial operations; a distributed parallel computing platform addresses this well. The quality of a distributed parallel system depends largely on its data partitioning strategy, yet existing spatial data partitioning methods do not take spatial association into account. This thesis therefore develops a partitioning method suited to GML spatial data that considers load balance, proximity, area balance, and spatial correlation, and reports the following results. First, the shortcomings of spatial data partitioning based on Hilbert space-filling curve codes and on K-means clustering are analyzed: the former does not keep the spatial data on each node well balanced, while the latter is unstable because its result depends on the choice of initial centroids. Second, a new GML data partitioning algorithm is proposed that combines Hilbert codes with K-means clustering; it takes into account the load balance of each node, the proximity of objects, area balance, and the spatial correlation between objects.
Finally, based on the proposed algorithm, a GML distributed storage system is analyzed and designed, and the data partitioning module of a distributed parallel GML storage system on the Hadoop platform is implemented. The system is used to verify the load balance of the partitioning algorithm, and its parallel speedup is compared with that of Oracle Spatial, the K-means-based partitioning algorithm, and the Hilbert-code-based partitioning algorithm. The results show that the proposed algorithm achieves good load balance and high parallel query efficiency.
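The Hilbert-code baseline mentioned in the abstract can be illustrated with a minimal sketch. It is not the thesis's algorithm: it assumes each spatial object has already been reduced to an integer grid cell (for example, the cell containing its centroid), and the names `hilbert_index` and `partition_by_hilbert` are hypothetical. Objects are sorted by Hilbert index and cut into contiguous runs of near-equal count, so neighbours on the curve tend to land on the same node. Equal counts are only one notion of balance; objects of different sizes or extents can still leave nodes unevenly loaded, and spatial correlation between objects is ignored here.

```python
from typing import List, Tuple

Point = Tuple[int, int]


def hilbert_index(order: int, x: int, y: int) -> int:
    """Map cell (x, y) of a 2**order x 2**order grid to its Hilbert curve index."""
    n = 1 << order          # side length of the grid
    d = 0
    s = n >> 1
    while s > 0:
        rx = 1 if x & s else 0
        ry = 1 if y & s else 0
        d += s * s * ((3 * rx) ^ ry)
        # Rotate/flip the quadrant so the sub-curve is oriented consistently.
        if ry == 0:
            if rx == 1:
                x, y = n - 1 - x, n - 1 - y
            x, y = y, x
        s >>= 1
    return d


def partition_by_hilbert(points: List[Point], k: int, order: int) -> List[List[Point]]:
    """Sort points along the Hilbert curve and split them into k contiguous runs
    whose sizes differ by at most one (balance by object count only)."""
    ordered = sorted(points, key=lambda p: hilbert_index(order, p[0], p[1]))
    base, extra = divmod(len(ordered), k)
    parts: List[List[Point]] = []
    start = 0
    for i in range(k):
        size = base + (1 if i < extra else 0)
        parts.append(ordered[start:start + size])
        start += size
    return parts


if __name__ == "__main__":
    cells = [(x, y) for x in range(4) for y in range(4)]
    for node, part in enumerate(partition_by_hilbert(cells, 4, 2)):
        print(node, sorted(part))
```

One way to address the K-means instability noted in the abstract would be to seed the clustering deterministically, for instance from the centroids of such Hilbert runs; whether the thesis does this is not stated in the abstract.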
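[Degree-granting institution]: Jiangxi University of Science and Technology
[Degree level]: Master's
[Year degree conferred]: 2012
[Classification number]: P208; TP338.6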