Design and Implementation of Distributed Storage Optimization for College Entrance Examination (Gaokao) Data
[Abstract]: In recent years, the rapid adoption of information technology across industries has driven explosive growth in industry data, and the college entrance examination (gaokao) is no exception. Each year's examination produces a very large volume of data, and storing it quickly and efficiently is a topic worth studying. Faced with data at the TB or even PB scale, traditional relational databases increasingly struggle to provide enough storage capacity. The rise of large-scale data has brought with it many new storage technologies, among which Google's GFS and Apache's HDFS are two typical distributed storage technologies for big data. HDFS, a widely used Apache project, lets enterprises store large amounts of data in a distributed manner on clusters of inexpensive machines. However, HDFS storage is coordinated by a single master node with multiple slave data nodes, and this layout makes the master node prone to becoming a bottleneck. For the college entrance examination data studied in this thesis, if HDFS alone stores the data, then when a large number of candidates query their results online at the same time, requests from many different clients flood into the HDFS master node, which poses a serious challenge for it. To address this problem, and based on an in-depth study and analysis of HDFS distributed storage technology, this thesis proposes a distributed storage scheme that combines HDFS with MongoDB to relieve the HDFS master-node bottleneck, so that the distributed storage of college entrance examination data is better optimized and candidates can query their results more efficiently.
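The bottleneck argument above can be illustrated with a small simulation. This is a minimal sketch and not the thesis's implementation: the abstract does not describe how requests are routed, so the hash-based routing rule, the function names `shard_for` and `simulate`, the candidate ID format, and the shard count of 4 are all assumptions made for illustration. It compares how many requests land on one node when every lookup goes through a single master versus when lookups are spread by a stable hash of the candidate ID.

```python
import hashlib
from collections import Counter


def shard_for(candidate_id: str, num_shards: int) -> int:
    """Map a candidate ID to a shard index with a stable hash.

    Hypothetical routing rule, used only to illustrate load spreading.
    """
    digest = hashlib.md5(candidate_id.encode("utf-8")).hexdigest()
    return int(digest, 16) % num_shards


def simulate(candidate_ids, num_shards):
    """Return per-node request counts for both layouts."""
    # Single-master layout: every request is counted against one node.
    single_master = Counter({"master": len(candidate_ids)})
    # Hash-routed layout: each request is counted against its shard.
    sharded = Counter(shard_for(cid, num_shards) for cid in candidate_ids)
    return single_master, sharded


if __name__ == "__main__":
    # Made-up candidate IDs; the real ID format is not given in the source.
    ids = [f"2017{n:08d}" for n in range(10_000)]
    master_load, shard_load = simulate(ids, num_shards=4)
    print("single master:", dict(master_load))
    print("hash-routed  :", dict(sorted(shard_load.items())))
```

In the single-master case the busiest node handles every request, while in the hash-routed case each node handles only a share of them, which is the intuition behind moving the query path away from the HDFS master node.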
Based on the above analysis, the main research work of this thesis is as follows: (1) First, the background and significance of the topic are set out, and the state of development of distributed storage technology, college entrance examination informatization, and the Spark big data platform is reviewed. (2) The master-node bottleneck that arises when college entrance examination data is stored with HDFS is analyzed, and an optimized distributed storage scheme for this data based on HDFS and MongoDB is proposed. (3) Following the specific requirements of the college entrance examination institute, a requirements analysis is carried out for the examination results query system that uses the optimized storage scheme, covering the user, functional, and non-functional perspectives. Based on this requirements analysis, the system architecture and system functions are designed in detail, as are the system database and the HDFS and MongoDB distributed storage. (4) Based on the detailed design, the implementation of the system is presented. The system's functions are tested with black-box testing, and its performance is tested in three respects: response time, throughput, and concurrency. Finally, the main research content of the thesis is summarized and directions for future work are identified.
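The three performance measures named in item (4) can be computed as in the following sketch. It is an illustration under stated assumptions, not the test procedure used in the thesis: `fake_query` is a hypothetical stand-in for the real results query, and the function names, the 1 ms delay, and the worker counts are invented for the example. Only Python standard-library calls are used.

```python
import time
from concurrent.futures import ThreadPoolExecutor
from statistics import mean


def fake_query(candidate_id: str) -> dict:
    """Hypothetical stand-in for a results query; sleeps to mimic I/O."""
    time.sleep(0.001)
    return {"candidate_id": candidate_id, "score": 0}


def timed_call(query, candidate_id):
    """Return the response time of one query, in seconds."""
    start = time.perf_counter()
    query(candidate_id)
    return time.perf_counter() - start


def load_test(query, candidate_ids, concurrency):
    """Issue all queries using `concurrency` worker threads and summarize."""
    started = time.perf_counter()
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        latencies = list(
            pool.map(lambda cid: timed_call(query, cid), candidate_ids)
        )
    elapsed = time.perf_counter() - started
    return {
        "concurrency": concurrency,
        "requests": len(latencies),
        "mean_response_s": mean(latencies),
        "max_response_s": max(latencies),
        # Throughput: completed requests per second of wall-clock time.
        "throughput_rps": len(latencies) / elapsed,
    }


if __name__ == "__main__":
    ids = [str(n) for n in range(200)]
    for workers in (1, 10, 50):
        print(load_test(fake_query, ids, workers))
```

Running the same request set at several concurrency levels shows how response time and throughput change as more clients query at once, which is the kind of comparison a concurrency test is meant to expose.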
[Degree-granting institution]: Shandong Normal University
[Degree level]: Master's
[Year degree awarded]: 2017
[Classification number]: TP333