
Implementation and Application of a Hadoop-Based Network Traffic Data Processing System

Published: 2018-07-31 15:32
[Abstract]: After years of development, China's Internet has become an important part of the global Internet. By the end of June 2013, the number of Internet users in China had reached 591 million, and the Internet penetration rate was about 44.1%. As the Internet has grown rapidly, the problems it exposes have become increasingly prominent. On the one hand, the growing number of users and the steady stream of new services have caused Internet traffic data to surge and network congestion to become more frequent, placing higher demands on network quality of service. On the other hand, because the Internet architecture has grown more complex, Internet traffic characteristics, user behavior characteristics, and the traffic characteristics of emerging services still lack in-depth understanding and precise description, which seriously hinders the further development of the Internet and the effective use of network resources. At the same time, with the sharp increase in network traffic, traditional traffic analysis methods can no longer meet the storage and processing requirements of massive data, so more efficient and more reliable approaches are needed. Hadoop is a scalable open-source software framework for reliable distributed processing of massive data, and it has been applied in a growing number of research fields. This thesis first introduces the basic concepts of Hadoop, including the working principles of Hadoop and HBase. Then, building on Hadoop, it proposes a three-layer architecture for a network traffic processing system that integrates the otherwise independent functions of traffic collection, storage, processing, and analysis into a single, fully functional system. Next, the thesis focuses on the data layer of the system. It describes in detail the non-real-time component of the data layer, a Hadoop-based network traffic data control component, and the real-time component, an HBase-based flow record control component. The study of these two components resolves several important problems in the analysis of massive network traffic. Finally, the thesis illustrates the application layer of the system using the analysis of smart-terminal traffic characteristics as an example.
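[Degree-granting institution]: Beijing University of Posts and Telecommunications
[Degree level]: Master's
[Year degree granted]: 2014
[Classification number]: TP393.06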


Article ID: 2156017



Article link: https://www.wllwen.com/guanlilunwen/ydhl/2156017.html

