
Design and Implementation of a Distributed Connection Log Storage and Retrieval System for Address Labeling

Published: 2018-11-10 10:15
[Abstract]: With the rapid development of the Internet, network applications and network traffic keep growing. This brings convenience to social life and the economy, but it also poses major challenges for network management and network security. Connection logs are logs generated by network sessions, and they describe the network well at the session level. How to store high-rate, long-term connection logs reliably, and how to label the social attributes of IP addresses accurately from those logs, is therefore crucial for network security, network management, and network planning.

Existing connection log storage schemes do not balance reception, storage, and retrieval well, which limits their portability. The connection logs generated by multiple backbone networks can reach tens of millions of records per second, so traditional centralized storage schemes are increasingly unable to keep up. Later schemes use distributed frameworks to provide scalable storage performance, but their storage engines are mostly based on traditional relational databases, so storage performance remains limited. This thesis studies connection log storage schemes in depth in order to build a distributed connection log storage system that supports high-speed storage and high-speed queries. It also studies traditional IP social attribute labeling in depth and finds that traditional labeling based on ports and behavioral features has low resolution. The thesis analyzes connection logs in detail in order to achieve more accurate labeling. The main research contributions are as follows:

(1) A new high-speed connection log receiving framework, DPIO (Direct Packet I/O), is proposed. Receiving connection logs through the traditional Socket API is relatively simple to implement, but its performance is low. The newer network driver netmap addresses this problem well, but netmap requires separately maintained NIC drivers, which makes it hard to implement and maintain. Building on both, this thesis proposes a new connection log receiving framework. Experimental results show that DPIO solves the low receive rate of the Socket API while avoiding the complexity of netmap.

(2) DCLStore (A Distributed Connection Log Storage System Supports High-speed storage and Fast Retrieval), a distributed connection log storage system that supports high-speed storage and fast retrieval, is designed and implemented. DCLStore provides high-speed storage and high-speed retrieval for connection logs, and it offers scalable storage space through the dynamic addition of storage nodes. Experimental results show that the system can receive about 20 million connection logs per second and handles logs from multiple network nodes well. For queries, it responds 40 times faster than a single-node storage system of the same storage capacity.

(3) A new IP metric, IP light-dark degree (IP明暗度), is proposed. Traditional IP social attribute labeling based on ports and behavioral features is simple to implement, but its accuracy is low. The thesis first examines connection logs closely and proposes the new metric. It then performs baseline measurements of IP metrics across the whole network, and the measurements indicate that the thesis's processing of connection logs is basically correct. Finally, it uses open-source tools to compute the light-dark degree of IPs across the whole network and examines its effect on IP attribute labeling. Experimental results show that IP light-dark degree strongly affects the results of IP social attribute labeling based on both ports and behavioral features.
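
The port-based labeling that the abstract treats as the traditional baseline can be illustrated with a minimal sketch. The record schema (source IP, destination IP, destination port), the `PORT_LABELS` table, and the `min_sessions` threshold are all assumptions made for illustration; none of them is taken from the thesis, and this is not the thesis's own method.

```python
from collections import Counter, defaultdict

# Hypothetical mapping from well-known server ports to coarse labels.
PORT_LABELS = {25: "mail", 53: "dns", 80: "web", 443: "web"}


def label_ips_by_port(records, min_sessions=2):
    """Label each destination IP by its most frequently contacted port.

    `records` is an iterable of (src_ip, dst_ip, dst_port) tuples, an
    assumed and heavily simplified connection log schema.
    """
    # Count how many sessions reached each (destination IP, port) pair.
    port_counts = defaultdict(Counter)
    for _src_ip, dst_ip, dst_port in records:
        port_counts[dst_ip][dst_port] += 1

    labels = {}
    for ip, counts in port_counts.items():
        port, sessions = counts.most_common(1)[0]
        # Only trust the dominant port if it was seen often enough
        # and appears in the (assumed) label table.
        if sessions >= min_sessions and port in PORT_LABELS:
            labels[ip] = PORT_LABELS[port]
        else:
            labels[ip] = "unknown"
    return labels


if __name__ == "__main__":
    sample = [
        ("10.0.0.1", "192.0.2.10", 80),
        ("10.0.0.2", "192.0.2.10", 80),
        ("10.0.0.1", "192.0.2.20", 53),
        ("10.0.0.3", "192.0.2.20", 53),
        ("10.0.0.3", "192.0.2.30", 6000),
    ]
    print(label_ips_by_port(sample))
```

A rule of this kind looks only at port numbers, so any service on a nonstandard port ends up as `unknown`. That is one way to picture the limited resolution the abstract attributes to port-based labeling.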
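[Degree-granting institution]: Beijing University of Posts and Telecommunications
[Degree level]: Master's
[Year degree granted]: 2015
[Classification number]: TP333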




Article ID: 2322178


Article link: https://www.wllwen.com/kejilunwen/jisuanjikexuelunwen/2322178.html

