Runtime Characterization and Tuning Mechanisms for Lock Synchronization in Multithreaded Programs
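Keywords: multithreaded programs; characterization study; performance debugging; data races; improper lock synchronization. Source: Huazhong University of Science and Technology, 2016 doctoral dissertation. Document type: dissertation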
[Abstract]: The arrival of multi-core and many-core processors has greatly increased the processing power of computers. Multithreaded programming emerged to exploit that power, but it also introduced the problem of inter-thread communication, and lock synchronization was designed to coordinate it. In multithreaded programming languages, the core idea of lock synchronization is to guarantee that conflicting accesses by different threads to the same shared resource are mutually exclusive. Lock synchronization ensures that inter-thread communication is correct. However, because runtime scheduling of multithreaded programs is random and complex, existing lock synchronization also produces a large amount of non-conflicting mutually exclusive execution during dynamic execution. This is improper lock synchronization: multiple critical sections are protected by the same lock but do not in fact access the same shared resource.
Improper lock synchronization formed at run time has several negative effects. 1) Program performance: the critical sections involved make no conflicting accesses to a shared resource and could have run in parallel, yet mutual exclusion forces them to run serially, which degrades the performance of the multithreaded program. Lock-intensive multithreaded programs contain a large amount of improper lock synchronization, so the performance impact there is severe. 2) Data race detection: most mainstream dynamic data race detectors are based on the Happens-Before (HB) relation. Such detectors impose a partial order between unlock and lock events in different threads, and report a data race whenever two program events are unordered and make conflicting accesses to the same shared resource. Because improperly synchronized critical sections could in fact run in parallel, the strong temporal boundary of the HB model usually misses a large number of data races. In addition, the amount of improper lock synchronization can vary exponentially during dynamic execution, so identifying it efficiently is not easy.
To address these lock-related problems, this dissertation studies the runtime characteristics of multithreaded lock synchronization and the corresponding tuning mechanisms. The work has three parts.
Characterization study. This is the first systematic analysis of the runtime characteristics of multithreaded lock synchronization, and of improper lock synchronization in particular. Several real-world multithreaded programs (for example OpenLDAP, MySQL, pbzip2, transmissionBT, and handbrake) serve as benchmarks. Their lock synchronization is tested, traced, and collected, and the observations are summarized into runtime characteristics: a classification of lock synchronization and its manifestations, causes, system impact, prevention strategies, and possible repairs. The study reveals 11 new observations about the runtime characteristics of lock synchronization and discusses their research implications. It deepens the understanding of improper lock synchronization and provides guidance for mitigating its effects on programs.
Performance debugging. To address the performance impact of improper lock synchronization, the dissertation proposes PerfPlay, a performance debugging method based on record/replay. Its core idea is as follows. First, record the execution trace of the original program, which contains the performance problems caused by improper lock synchronization. Second, use topological graph analysis to eliminate the improperly synchronized execution sequences, producing a trace free of those problems. Third, replay both the original and the modified trace. Finally, compare the two replays to quantify the net performance loss caused by improper lock synchronization. Experiments show that PerfPlay 1) is highly stable and precise in measured performance, which ensures the performance fidelity of replay analysis; 2) incurs a low (4.3%) runtime lockset overhead, and the improperly synchronized code regions it recommends have high optimization value; and 3) is shown by case studies to be effective at uncovering improper lock synchronization.
Data race detection. The dissertation proposes ULCP-HB, a Happens-Before relation with weak temporal boundaries that targets improper lock synchronization. ULCP-HB relaxes the strong temporal boundary of the traditional HB relation: it sees through the partial order formed between interleaved unlock and lock events of improper lock synchronization and treats those events as parallel. To realize ULCP-HB, and drawing on the runtime characteristics of improper lock synchronization, the dissertation further proposes a lightweight race detection method that combines online heuristic analysis with offline reordering analysis. The method uncovers data races hidden by improper lock synchronization while adding almost no runtime overhead. Experiments show that, compared with HB-based detection, ULCP-HB finds 19.8% more data races; with only 4.45% runtime analysis overhead, it saves 51.0% of the reordering overhead and 52.3% of the execution trace file size.
In summary, the dissertation analyzes the improper lock synchronization that arises during dynamic execution of multithreaded programs from two angles: foundational research (the runtime characterization of multithreaded lock synchronization) and extended research (performance debugging and data race detection aimed at improper lock synchronization). The results help programmers understand the runtime behavior and program impact of improper lock synchronization, and in turn help them repair the problems it causes.
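The definition of improper lock synchronization given in the abstract can be illustrated with a small sketch. It is not the dissertation's implementation: the `CriticalSection` record, the `conflicts` and `improper_pairs` helpers, and the sample trace are all hypothetical names introduced here. The sketch assumes each recorded critical section carries the lock it held plus the sets of shared locations it read and wrote, and it assumes the usual notion of a conflict (a common location with at least one write), so read-read sharing counts as non-conflicting.

```python
from dataclasses import dataclass
from itertools import combinations


@dataclass(frozen=True)
class CriticalSection:
    """One recorded critical section (hypothetical trace record)."""
    thread: str
    lock: str
    reads: frozenset = frozenset()
    writes: frozenset = frozenset()


def conflicts(a: CriticalSection, b: CriticalSection) -> bool:
    """True if a and b touch a common location and at least one writes it."""
    return bool(a.writes & (b.reads | b.writes)) or bool(b.writes & a.reads)


def improper_pairs(trace):
    """Pairs from different threads that hold the same lock but do not conflict.

    Under the assumed definition, these pairs were serialized by the lock
    even though they could have run in parallel.
    """
    return [
        (a, b)
        for a, b in combinations(trace, 2)
        if a.thread != b.thread and a.lock == b.lock and not conflicts(a, b)
    ]


# Illustrative trace: three threads all take lock "L".
example_trace = [
    CriticalSection("T1", "L", writes=frozenset({"x"})),
    CriticalSection("T2", "L", writes=frozenset({"y"})),
    CriticalSection("T3", "L", reads=frozenset({"x"})),
]

for first, second in improper_pairs(example_trace):
    print(first.thread, second.thread, "share lock", first.lock, "without conflicting")
```

In this trace, T1 and T3 genuinely conflict on `x`, so the lock is doing useful work for that pair; the pairs involving T2 are serialized without need. Comparing every pair is quadratic in the trace length, which is one way to see why the abstract describes efficient identification as non-trivial.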
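[Degree-granting institution]: Huazhong University of Science and Technology
[Degree level]: Doctoral
[Year degree conferred]: 2016
[Classification number]: TP332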
Article ID: 1452095
Article link: https://www.wllwen.com/shoufeilunwen/xxkjbs/1452095.html