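Grid-Based Performance Optimization of Industrial Data Transmission
Published: 2018-03-07 21:00
Topic: industrial data compression; Focus: disk scheduling; Source: Nanjing University of Information Science and Technology, 2014 master's thesis; Type: degree thesis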
[Abstract]: With the rapid development of grid computing, large-scale real-time industrial data analysis has come to play a pivotal role in industrial production: monitoring and analyzing industrial data in real time can greatly guide production and raise productivity. However, several major problems remain in real-time industrial data analysis. First, real-time data is generated very quickly and occupies a large amount of system storage; how to reduce storage pressure as far as possible without losing the characteristics of the data is an urgent problem. Second, during large-scale real-time data transmission and processing, thousands upon thousands of concurrent data connections form between the collectors and the central server, so the stability and efficiency of the server are critical. Third, traditional TCP protocol parameters can no longer adapt to today's high-speed networks, which raises the question of how to optimize the protocol to achieve higher network throughput. This thesis presents our grid-computing optimization results from three aspects: data compression, grid server performance, and network transmission performance. The main work is as follows:

(1) Industrial data has demanding real-time requirements and forms a curve over time. Because the volume of real-time data produced in industry is very large, it is not feasible to buffer a large amount of data, analyze the trend of the whole data set, and then choose a matching compression scheme; the compression algorithm must combine a high compression ratio with low resource consumption throughout its processing. This thesis proposes a curve-area mapping compression method for real-time industrial data. According to test results, the algorithm's compression ratio is 2.16 times that of the traditional SDT compression method.

(2) In a data grid environment, server performance has always been one of the key factors in overall grid performance. We analyze common performance bottlenecks of grid servers and introduce user-space I/O scheduling, zero copy, and an event-driven architecture into our grid environment to improve grid server performance. When reading large numbers of small files, user-space I/O scheduling saves nearly 50% of disk I/O time. With zero copy, the grid server reduces the CPU cost incurred by heavy switching between kernel space and user space, eliminating 63% of unnecessary context switches. The event-driven architecture reduces CPU utilization by 30% while reaching the best throughput of the thread-driven design. Combining these three optimizations in our grid server, the new solution raises saturated system throughput by 30% over the traditional solution while consuming only 70% of the CPU that the traditional solution uses.

(3) Network traffic tests show that 95% of current network data flows are TCP flows, with the remainder being UDP or other kinds of flows, so TCP transmission performance has become the key constraint on data transmission performance across the whole network. By analyzing TCP and several of its variants and comparing their congestion control algorithms, the thesis focuses on the network factors that affect TCP performance and on how TCP performs in different network environments. It implements the TCP optimization schemes widely adopted in common present-day network transmission scenarios, and proposes a scheme that uses a neural network to predict packet loss in order to raise TCP throughput while preserving link fairness. Network simulation experiments show that the first-level prediction hit rate reaches up to 74%.
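For context on item (1) of the abstract: SDT most likely refers to swinging door trending, the baseline the proposed method is compared against. The abstract does not describe the curve-area mapping method in enough detail to reproduce it, so what follows is only a minimal Python sketch of the SDT baseline, not the thesis's algorithm. The function name `sdt_compress`, the `deviation` parameter, and the sample data are illustrative assumptions, and the sketch assumes strictly increasing timestamps.

```python
from typing import List, Tuple

Point = Tuple[float, float]  # (timestamp, value)


def sdt_compress(points: List[Point], deviation: float) -> List[Point]:
    """Swinging door trending sketch: keep only the points needed so that
    every dropped point lies within +/- deviation of the line between
    consecutive kept points. Assumes strictly increasing timestamps."""
    if len(points) <= 2:
        return list(points)

    kept = [points[0]]
    t0, v0 = points[0]            # last archived point (the door hinge)
    max_upper = float("-inf")     # steepest slope seen from the upper pivot
    min_lower = float("inf")      # shallowest slope seen from the lower pivot
    prev = points[0]

    for t, v in points[1:]:
        dt = t - t0
        max_upper = max(max_upper, (v - (v0 + deviation)) / dt)
        min_lower = min(min_lower, (v - (v0 - deviation)) / dt)
        if max_upper > min_lower:
            # The doors have swung past parallel: archive the previous
            # point and restart both doors from it using the current point.
            kept.append(prev)
            t0, v0 = prev
            dt = t - t0
            max_upper = (v - (v0 + deviation)) / dt
            min_lower = (v - (v0 - deviation)) / dt
        prev = (t, v)

    kept.append(points[-1])       # always keep the final sample
    return kept


if __name__ == "__main__":
    # Hypothetical data: a perfectly linear trend collapses to its endpoints.
    line = [(0, 0), (1, 1), (2, 2), (3, 3), (4, 4)]
    print(sdt_compress(line, 0.1))  # prints [(0, 0), (4, 4)]
```

With this sketch, a compression ratio can be estimated as the number of input points divided by the number of kept points; a larger `deviation` keeps fewer points at the cost of reconstruction accuracy.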
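[Degree-granting institution]: Nanjing University of Information Science and Technology
[Degree level]: Master's
[Year degree conferred]: 2014
[Classification numbers]: TP333; TP393.05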
Article ID: 1580931
Article link: https://www.wllwen.com/kejilunwen/jisuanjikexuelunwen/1580931.html