
Design and Implementation of Distributed Data Storage Based on Hadoop

Published: 2018-03-01 19:01

  Keywords: distributed systems; Hadoop; massive data storage; cloud computing. Source: Jilin University, master's thesis, 2016. Document type: degree thesis


[Abstract]: Cloud computing applications are now everywhere as the Internet continues to grow, and cloud products have become some of the most popular offerings of this era. Traditional Web applications store their data in a database system on the server; as user data volumes grow, such applications come under heavy pressure, and server maintenance costs are high. Storing data directly on cloud servers instead can protect the data, provide ample storage space, and substantially reduce maintenance costs. The main work of this thesis is the design and implementation of a cloud storage system. In a Hadoop cloud environment, a Web server program calls the Hadoop Distributed File System (HDFS) API to provide the cloud storage application. The system is a J2EE application with an MVC three-tier architecture built on Struts2, Hibernate3, and Spring3. Log4j configures and standardizes the system log output to the console, XFire is used to develop the WebService that exposes the corresponding services, and mail sending is implemented in Java using Apache's mail package. On the page side, JQuery provides no-refresh page operations. Because the volume of user information in this application is small, MySQL manages the user data, which avoids looping over files in HDFS. Functionally, the work delivers a basic cloud file storage system: users can access and manage their data files through a browser at any time and from anywhere, with upload, download, delete, and share operations, and they can also manage their basic user information. Unlike an ordinary Web system, the system has a distributed architecture, and the thesis reports that its access and response speeds are noticeably faster than those of an ordinary Web application. In addition, because HDFS automatically replicates data files and stores the copies on different slave servers in the cluster, the failure of one server does not make the files on it inaccessible: other machines that hold copies take over serving them.
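The fault-tolerance claim at the end of the abstract can be illustrated with a small in-memory model. This is only a sketch under stated assumptions, not code from the thesis and not the Hadoop API: the class `ReplicaSketch`, its methods, the node names, and the round-robin placement are all hypothetical, and real HDFS replicates fixed-size blocks with a rack-aware placement policy rather than whole files in a fixed order. The replication factor of 3 matches the HDFS default (`dfs.replication`).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Toy in-memory model of replicated file storage; hypothetical, not the Hadoop API. */
class ReplicaSketch {
    // Copies kept per file; 3 is the HDFS default for dfs.replication.
    static final int REPLICATION = 3;

    private final List<String> nodes;                                   // storage node names
    private final Set<String> failed = new HashSet<>();                 // nodes currently down
    private final Map<String, List<String>> replicas = new HashMap<>(); // file -> nodes holding a copy
    private int next = 0;                                               // round-robin cursor

    ReplicaSketch(List<String> nodes) {
        if (nodes.size() < REPLICATION) {
            throw new IllegalArgumentException("need at least " + REPLICATION + " nodes");
        }
        this.nodes = new ArrayList<>(nodes);
    }

    /** Records a file as stored on REPLICATION consecutive nodes (simplified placement). */
    void store(String file) {
        List<String> holders = new ArrayList<>();
        for (int i = 0; i < REPLICATION; i++) {
            holders.add(nodes.get((next + i) % nodes.size()));
        }
        next = (next + 1) % nodes.size();
        replicas.put(file, holders);
    }

    /** Marks a node as failed. */
    void fail(String node) {
        failed.add(node);
    }

    /** Returns a live node that holds the file, or null if every copy is on a failed node. */
    String locate(String file) {
        for (String node : replicas.getOrDefault(file, Collections.<String>emptyList())) {
            if (!failed.contains(node)) {
                return node;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        ReplicaSketch cluster =
                new ReplicaSketch(Arrays.asList("slave1", "slave2", "slave3", "slave4"));
        cluster.store("/user/alice/report.pdf");
        cluster.fail("slave1"); // one server goes down
        System.out.println(cluster.locate("/user/alice/report.pdf"));
    }
}
```

In this model the file is placed on `slave1`, `slave2`, and `slave3`, so after `slave1` fails the lookup falls through to `slave2`. The file becomes unreachable only if all three holders fail at once.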
[Degree-granting institution]: Jilin University
[Degree level]: Master's
[Year degree granted]: 2016
[Classification number]: TP333




