STREAM Parallel Computing and Optimization for the FT1000 Microprocessor

Published: 2018-09-12 13:48
[Abstract]: STREAM is a benchmark program for measuring memory performance on microprocessors, and achieving high performance with it on the multi-core, multithreaded FT1000 microprocessor is a challenging research problem. Based on the multilevel cache structure, the instruction pipelines of the four STREAM kernels are optimized: a multilevel loop-unrolling method is designed according to the number of registers, the number of data prefetches is determined from the instruction latency and the cache line size, and the optimized subroutines are written in assembly language. Based on the OpenMP parallel environment, a parallel STREAM program is designed and its localized data allocation scheme is optimized. Test results show that the optimized STREAM outperforms the original serial program by 19.2% to 64.2%. After optimization, the peak memory access performance of the parallel program reaches 8.5 GB/s, up to 22.7% higher than the peak before optimization.
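[Affiliation]: Key Laboratory of Parallel and Distributed Processing, National University of Defense Technology
[Funding]: National 863 Program of China (2012AA01A301); National Natural Science Foundation of China (60970033, 91430218)
[Classification number]: TP332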
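
The abstract names three techniques: loop unrolling, data prefetching, and localized data allocation under OpenMP. The C sketch below illustrates them on the STREAM Triad kernel only. It is not the authors' code: the paper reports hand-written assembly subroutines, and the unroll factor of 4, the prefetch distance `PREFETCH_DIST`, and the function names `stream_init`, `stream_triad`, and `stream_triad_bytes` are all assumptions made here for illustration. Localized allocation is approximated with first-touch initialization, which may differ from the scheme the paper actually uses.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical prefetch distance in array elements; the paper derives its
 * value from instruction latency and cache line size, which are not given
 * in the abstract. */
#define PREFETCH_DIST 64

/* First-touch initialization: each thread writes the elements it will later
 * process, so the operating system can place those pages close to that
 * thread. The static schedule mirrors the one used in stream_triad. */
void stream_init(double *a, double *b, double *c, size_t n)
{
    long i;
    assert(a && b && c);
#pragma omp parallel for schedule(static)
    for (i = 0; i < (long)n; i++) {
        a[i] = 0.0;
        b[i] = 1.0;
        c[i] = 2.0;
    }
}

/* Triad kernel a[i] = b[i] + s * c[i], unrolled by 4 with software prefetch.
 * Without OpenMP support the pragma is ignored and the loop runs serially. */
void stream_triad(double *a, const double *b, const double *c,
                  double s, size_t n)
{
    long i;
    long n4 = (long)(n - n % 4);   /* largest multiple of 4 not above n */
    assert(a && b && c);
#pragma omp parallel for schedule(static)
    for (i = 0; i < n4; i += 4) {
#if defined(__GNUC__)
        /* Prefetch only addresses that stay inside the arrays. */
        if (i + PREFETCH_DIST < n4) {
            __builtin_prefetch(&b[i + PREFETCH_DIST], 0, 0);
            __builtin_prefetch(&c[i + PREFETCH_DIST], 0, 0);
        }
#endif
        a[i]     = b[i]     + s * c[i];
        a[i + 1] = b[i + 1] + s * c[i + 1];
        a[i + 2] = b[i + 2] + s * c[i + 2];
        a[i + 3] = b[i + 3] + s * c[i + 3];
    }
    /* Remainder loop for the last n % 4 elements. */
    for (i = n4; i < (long)n; i++)
        a[i] = b[i] + s * c[i];
}

/* Bytes moved by one Triad pass: two loads and one store of a double per
 * element. Dividing by the elapsed time gives the bandwidth figure. */
double stream_triad_bytes(size_t n)
{
    return 3.0 * (double)sizeof(double) * (double)n;
}
```

To turn this into a measurement, a caller would allocate three arrays much larger than the last-level cache, call `stream_init` once, time repeated calls to `stream_triad`, and divide `stream_triad_bytes(n)` by the best time. Compiling with an OpenMP flag such as `-fopenmp` on GCC enables the parallel loops.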


Article link: https://www.wllwen.com/kejilunwen/jisuanjikexuelunwen/2239191.html
