PRF: a process-RAM-feedback performance model to reveal bottl
Published: 2021-07-03 13:41
Performance models offer insight for predicting performance and for proposing optimization guidance. Although there has been much research, pinpointing the bottlenecks of various memory access patterns and achieving highly accurate predictions for both regular and irregular programs on various hardware configurations remain nontrivial. This work proposes a novel model called process-RAM-feedback (PRF) to quantify the overhead of computation and data transmission time on general-purpose mu...
[Source]: High Technology Letters. 2020, 26(03), EI
[Pages]: 14
[Table of contents]:
0 Introduction
1 Related work
2 The PRF performance model
2.1 Process phase
2.2 RAM phase
2.3 Feedback optimization phase
3 Experimental testbed
4 Validation
4.1 Convolution
4.1.1 Convolution operation
4.1.2 Performance prediction for naive code
4.1.3 Optimization method and modeling analysis
4.1.4 Optimization guidance
4.2 SpMV
4.2.1 Test matrices
4.2.2 Performance prediction and bottlenecks
4.2.3 Feedback performance optimization
4.3 Sn-sweep
4.3.1 Sn-sweep operation
4.3.2 Performance prediction and bottleneck analysis
4.3.3 Optimization method and feedback
5 Conclusion
[References]:
Journal articles
[1] Automatic tuning of sparse matrix-vector multiplication on multicore clusters[J]. LI ShiGang, HU ChangJun, ZHANG JunChao, ZHANG YunQuan. Science China (Information Sciences). 2015(09)
Article ID: 3262692
Article link: https://www.wllwen.com/kejilunwen/jisuanjikexuelunwen/3262692.html