
Research on a Hadoop-Based Agricultural Big Data Processing System

Published: 2018-04-09 10:38

  Topic: big data. Focus: agricultural big data. Source: Henan Normal University, master's thesis, 2017


[Abstract]: China covers a vast territory with complex, varied ecological types and a rich diversity of crops. Its agricultural data is correspondingly diverse and large in volume. Owing to the limitations of traditional agriculture, this data has long been neglected and underused. As agricultural informatization advances and the level of agricultural modernization rises, agricultural data is receiving more attention and playing an increasingly important role in guiding production. With the widespread use of the Internet of Things and related technologies in agriculture, the volume of agricultural data is growing geometrically, and traditional data processing methods can no longer meet the demand. Agricultural data now exhibits the basic characteristics of big data and has become agricultural big data; the nature of agriculture itself makes this data large in volume, multidimensional, and dynamic. Handling its growth reasonably and efficiently is an important problem, and the rapid development of big data technology can address many of the difficulties involved.

The most prominent big data processing platform is undoubtedly Apache Hadoop, an open-source distributed computing platform that runs on large clusters. It implements the MapReduce computing model, has been widely adopted, and has gradually become synonymous with big data. MapReduce, first proposed by Google, is a parallel programming model for parallel computation over large data sets and is Google's core computing model [1]. The Map function and the Reduce function are the core of the model: both operate on key/value pairs, so that a complex data processing task can be distributed across the nodes of a cluster and massive, complex data can be processed on a distributed parallel architecture.

This thesis analyzes the characteristics of big data and, in light of the characteristics of agricultural big data, examines the strengths and weaknesses of existing agricultural big data processing systems and improves on them, designing an agricultural big data processing system based on the Hadoop platform. It briefly introduces classical data mining algorithms and analyzes how they can be parallelized on the MapReduce architecture. The CART algorithm is parallelized for MapReduce and then further optimized. Finally, data is run through the system to verify that the system is feasible and that the improved algorithm performs better.
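As an illustration of how a CART split search can be phrased in terms of Map and Reduce over key/value pairs, the following is a minimal single-process sketch. It is not the thesis's implementation: the toy records, the feature name `soil_moisture`, the candidate thresholds, and every function name are assumptions made up for this example, and the "shuffle" step is simulated with an in-memory dictionary rather than a Hadoop cluster. The mapper emits one `((threshold, side, label), 1)` pair per record and candidate split, the reducer sums the counts, and a driver step turns those counts into weighted Gini impurity, the split criterion CART uses for classification.

```python
from collections import defaultdict
from itertools import chain

# Hypothetical toy data: (features, class label). Not taken from the thesis.
RECORDS = [
    ({"soil_moisture": 0.1}, "low_yield"),
    ({"soil_moisture": 0.2}, "low_yield"),
    ({"soil_moisture": 0.6}, "high_yield"),
    ({"soil_moisture": 0.8}, "high_yield"),
]

def map_record(record, feature, thresholds):
    """Map: for each candidate threshold, emit ((threshold, side, label), 1)."""
    features, label = record
    for t in thresholds:
        side = "left" if features[feature] <= t else "right"
        yield (t, side, label), 1

def shuffle(pairs):
    """Simulated shuffle: group all values that share a key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_counts(key, values):
    """Reduce: sum the counts emitted for one (threshold, side, label) key."""
    return key, sum(values)

def gini(class_counts):
    """Gini impurity of one partition, given its per-class counts."""
    total = sum(class_counts.values())
    return 1.0 - sum((c / total) ** 2 for c in class_counts.values())

def best_split(records, feature, thresholds):
    """Driver: run map, shuffle, reduce, then pick the lowest weighted Gini."""
    mapped = chain.from_iterable(
        map_record(r, feature, thresholds) for r in records
    )
    reduced = dict(reduce_counts(k, v) for k, v in shuffle(mapped).items())

    # Regroup the reduced counts as {(threshold, side): {label: count}}.
    per_side = defaultdict(lambda: defaultdict(int))
    for (t, side, label), n in reduced.items():
        per_side[(t, side)][label] += n

    scores = {}
    for t in thresholds:
        score = 0.0
        for side in ("left", "right"):
            counts = per_side.get((t, side))
            if counts:  # an empty side contributes nothing
                score += sum(counts.values()) / len(records) * gini(counts)
        scores[t] = score
    best = min(scores, key=scores.get)
    return best, scores[best]

if __name__ == "__main__":
    print(best_split(RECORDS, "soil_moisture", [0.15, 0.4, 0.7]))
```

Because the mapper only looks at one record at a time and the reducer only sums integers per key, both steps could in principle run independently on different nodes; only the final comparison of split scores needs the aggregated counts in one place.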
[Degree-granting institution]: Henan Normal University
[Degree level]: Master's
[Year degree conferred]: 2017
[Classification number]: S126; TP311.13

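[References]

Related journal articles (first 10)

1 Wang Wei. Exploring online teaching reform for the management information systems course in a big data environment [J]. Fujian Computer (福建电脑), 2017, No. 01

2 Lin Kequan. An automatic inspection system based on simulated user behavior [J]. Digital Technology and Application (数字技术与应用), 2017, No. 01

3 Li Lingping; Mao Kebiao; Fu Xiuli; Ma Ying; Wang Fang; Liu R,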

Article ID: 1726089



Article link: https://www.wllwen.com/kejilunwen/ruanjiangongchenglunwen/1726089.html

