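Research and Implementation of a File Management System Based on Hadoop Storage
Published: 2018-01-27 15:14
Keywords: file management; Hadoop storage; split upload; XML information storage. Source: Huazhong University of Science and Technology, master's thesis, 2013. Document type: degree thesis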
[Abstract]: Rapid advances in Internet and cloud storage technology have transformed how data is stored and have driven the rapid growth of network drives (online storage services). Network drives are changing the way people manage files, yet file management systems designed for the massive data volumes of small and medium-sized enterprises (SMEs) have lagged behind. Research on file management systems suited to SMEs is therefore particularly important.
Based on an analysis of the functional and performance requirements of a file management system, this thesis presents the system's overall architecture, an open-structure encapsulation scheme, a file upload scheme, and an optimized user information management scheme, and implements each of them. For the overall scheme, the overall functional structure is designed, and a server deployment scheme is derived from that functional division. For the open-structure encapsulation, the low-level functional interfaces are exposed as resources so that they can be invoked remotely, and the resource invocation methods are packaged into a development kit for use by other development platforms, which makes the system more open. For file upload, files are split dynamically and uploaded over multiple threads, an incremental algorithm provides resumable (breakpoint) upload to improve upload efficiency, and a message queue management service is used to transfer files offline to the Hadoop distributed storage servers. For user information management, a two-level index over user and file information is built, structured data is converted into semi-structured tree-directory data for storage, and maintenance and parsing of the semi-structured XML data are implemented.
The Hadoop-storage-based file management system studied and implemented in this thesis offers an open encapsulation structure, efficient file upload, and fast retrieval of user information. It met the expected goals in functional and performance testing and is suitable for managing the massive data of SMEs.
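The abstract does not spell out how files are split or how the incremental algorithm for resumable upload works. As a rough illustration only, the Python sketch below assumes a simple scheme: the chunk size is derived from the file size, chunks are sent by a thread pool, and a set of completed chunk indices lets a retry send only what is still missing. The names `choose_chunk_size`, `plan_chunks`, `upload`, `send_chunk`, and the in-memory `store` are all hypothetical and are not taken from the thesis.

```python
import math
from concurrent.futures import ThreadPoolExecutor


def choose_chunk_size(file_size, min_chunk=4, max_chunks=8):
    """Hypothetical policy: split into at most max_chunks parts of at least min_chunk bytes."""
    return max(min_chunk, math.ceil(file_size / max_chunks))


def plan_chunks(file_size, chunk_size):
    """Return (index, offset, length) for every chunk of the file."""
    return [(i, off, min(chunk_size, file_size - off))
            for i, off in enumerate(range(0, file_size, chunk_size))]


def upload(data, send_chunk, done, workers=4):
    """Send every chunk whose index is not yet in `done`.

    `done` is updated in place, so calling upload() again after a failure
    resumes from where the previous attempt stopped.
    Returns True once all chunks have been sent.
    """
    chunks = plan_chunks(len(data), choose_chunk_size(len(data)))
    pending = [c for c in chunks if c[0] not in done]

    def task(chunk):
        index, offset, length = chunk
        send_chunk(index, data[offset:offset + length])
        return index

    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = [pool.submit(task, c) for c in pending]
        for future in futures:
            try:
                done.add(future.result())
            except OSError:
                pass  # chunk stays pending and is retried on the next call
    return len(done) == len(chunks)


# Demo with an in-memory "server" whose chunk 2 fails on the first attempt.
store = {}
sent_log = []
fail_once = {2}


def flaky_send(index, payload):
    if index in fail_once:
        fail_once.discard(index)
        raise OSError("simulated network drop")
    sent_log.append(index)
    store[index] = payload


data = bytes(range(100))
done = set()
first_ok = upload(data, flaky_send, done)    # chunk 2 fails
second_ok = upload(data, flaky_send, done)   # only chunk 2 is re-sent
rebuilt = b"".join(store[i] for i in sorted(store))
```

In this sketch the second call transmits a single chunk instead of the whole file, which is the efficiency gain the abstract attributes to resumable upload. A real system would persist the completed-chunk record and forward the assembled file to HDFS, for example through the message queue the abstract mentions.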
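[Degree-granting institution]: Huazhong University of Science and Technology
[Degree level]: Master's
[Year degree conferred]: 2013
[Classification number]: TP333; TP311.52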
Article ID: 1468766
Article link: https://www.wllwen.com/kejilunwen/jisuanjikexuelunwen/1468766.html