Research on a Method for Building a Forest Resource Monitoring Cloud Platform Based on GlusterFS

Published: 2018-10-16 17:14

[Abstract]: As forestry informatization advances, more and more new technologies and methods are being applied in forestry. Acquiring forest resource monitoring data is indispensable for in-depth research on forests. At present, forest resource monitoring in China covers many kinds of objects (climate, soil, hydrology, water quality, video, and so on) spread over a wide area; the volume of collected data is large, its forms are varied, and its structure is not uniform. Only when this hard-won monitoring data is stored, managed, and used effectively can it give objective and comprehensive support to the analysis of forestry problems and to decision making. This thesis addresses three problems: the sheer volume of forest resource monitoring data, the storage of different types of data, and the parallel storage and computation of data. First, because monitoring data is widely distributed, relatively isolated between regions, and large in volume, the thesis proposes a method for building a forest resource monitoring cloud platform based on GlusterFS. The platform stores all monitoring data in a single unified virtual storage pool, which removes the isolation between regional data sets, unifies the data logically, and benefits both data sharing and storage utilization. Second, because monitoring data comes in many formats and types, the thesis analyzes the logical structure of the data, divides it broadly into structured and unstructured data, and designs a hybrid storage mechanism in which data is stored in either a relational database or a NoSQL database according to its type. This allows different types of data to be stored within the same system. Third, to balance data storage across the system, the thesis applies a consistent hashing algorithm, which makes storage and computation more evenly distributed; to parallelize data processing, the GlusterFS-based platform adopts the MapReduce computing model and parallel database technology, achieving parallel computation and storage. Finally, the thesis tests a simple GlusterFS cloud platform built for this purpose with respect to reliability, scalability, elasticity, and the performance effect of eliminating a metadata server. The experimental results show that the overall performance of the system is excellent. Locking and file traversal remain areas that need further improvement.
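To make the load-balancing idea concrete, the following is a minimal sketch of a consistent-hash ring. It is not the thesis's implementation and not GlusterFS's internal distribution logic; the class name `ConsistentHashRing`, the node names `brick-1` to `brick-4`, the file paths, and the choice of MD5 with 100 virtual nodes per storage node are all assumptions made for illustration.

```python
import bisect
import hashlib


class ConsistentHashRing:
    """Toy consistent-hash ring with virtual nodes (illustrative only)."""

    def __init__(self, nodes=(), replicas=100):
        self.replicas = replicas  # virtual nodes per physical node (assumed value)
        self._keys = []           # sorted hash positions on the ring
        self._owners = {}         # hash position -> physical node name
        for node in nodes:
            self.add_node(node)

    @staticmethod
    def _hash(value):
        # MD5 is used here only to spread keys evenly, not for security.
        return int(hashlib.md5(value.encode("utf-8")).hexdigest(), 16)

    def add_node(self, node):
        # Place several virtual nodes on the ring for each physical node.
        for i in range(self.replicas):
            position = self._hash(f"{node}#{i}")
            if position in self._owners:
                continue  # skip the (extremely unlikely) position collision
            bisect.insort(self._keys, position)
            self._owners[position] = node

    def get_node(self, key):
        # A key belongs to the first virtual node clockwise from its hash.
        if not self._keys:
            raise ValueError("ring is empty")
        index = bisect.bisect(self._keys, self._hash(key)) % len(self._keys)
        return self._owners[self._keys[index]]


# Hypothetical storage nodes and monitoring files.
ring = ConsistentHashRing(["brick-1", "brick-2", "brick-3"])
files = [f"plot-{i}/soil.csv" for i in range(1000)]
before = {f: ring.get_node(f) for f in files}

ring.add_node("brick-4")
after = {f: ring.get_node(f) for f in files}

moved = [f for f in files if before[f] != after[f]]
print(f"{len(moved)} of {len(files)} files changed owner after adding a node")
```

The property worth noting is that adding a node only takes keys away from existing nodes and hands them to the new one; files never shuffle between the old nodes, which is what keeps rebalancing cheap when the storage pool grows.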
[Degree-granting institution]: Northeast Forestry University
[Degree level]: Master's
[Year degree granted]: 2014
[Classification number]: S757.2

[References]