Research on a Dependence-Distance-Driven Vectorization Method
Published: 2019-06-06 01:20
[Abstract]: Using vector registers without filling every lane creates vectorization opportunities for the many loops whose iteration counts are too small, but it also means the parallel width of vectorization is no longer fixed, so traditional dependence tests driven by the vectorization factor no longer apply. This paper proposes a dependence test driven by dependence distance: it analyzes the dependence distances carried by the key cycle-breaking edges of all dependence cycles in the dependence graph, selects the smallest of these distances to determine the parallel width, and breaks the dependence cycles, achieving vectorization based on partially filled vector registers. Experimental results show that the method effectively increases the opportunities for loop vectorization and raises vector register utilization, improving the vectorization speedup of the test cases by 14.6% on average.
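The width-selection idea in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: it assumes the key cycle-breaking edges have already been identified and only their dependence distances are passed in, and the names `choose_parallel_width`, `scalar_loop`, and `chunked_loop` are hypothetical. The loop `a[i] = a[i - dist] + b[i]` stands in for a loop with one loop-carried dependence, and the chunked version mimics a vector load, add, and store over `width` lanes.

```python
from typing import List, Sequence


def choose_parallel_width(key_edge_distances: Sequence[int], max_lanes: int) -> int:
    """Pick the smallest key-edge dependence distance, capped at the register width.

    Assumption: if no positive distance is supplied, nothing constrains the
    width and the full register (max_lanes) is used.
    """
    positive = [d for d in key_edge_distances if d > 0]
    if not positive:
        return max_lanes
    return min(min(positive), max_lanes)


def scalar_loop(a: List[int], b: List[int], dist: int) -> List[int]:
    """Reference semantics: a[i] = a[i - dist] + b[i], one iteration at a time."""
    out = list(a)
    for i in range(dist, len(out)):
        out[i] = out[i - dist] + b[i]
    return out


def chunked_loop(a: List[int], b: List[int], dist: int, width: int) -> List[int]:
    """Same loop, executed `width` iterations at a time like a vector operation."""
    out = list(a)
    i = dist
    while i < len(out):
        hi = min(i + width, len(out))
        # Read every operand of the chunk before writing any result,
        # as a vector load followed by a vector store would.
        loaded = [out[j - dist] for j in range(i, hi)]
        for k, j in enumerate(range(i, hi)):
            out[j] = loaded[k] + b[j]
        i = hi
    return out


if __name__ == "__main__":
    a = list(range(1, 11))
    b = [1] * 10
    width = choose_parallel_width([3, 5], max_lanes=4)
    print(width)
    print(chunked_loop(a, b, 3, width) == scalar_loop(a, b, 3))
    print(chunked_loop(a, b, 3, 4) == scalar_loop(a, b, 3))
```

With distances 3 and 5 on a 4-lane register, the sketch picks a width of 3, so three of the four lanes are used and the chunked loop matches the scalar result. Forcing the full width of 4 reads a value that the same chunk has not yet written, and the results diverge.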
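[Affiliation]: State Key Laboratory of Mathematical Engineering and Advanced Computing, Information Engineering University
[Funding]: National Science and Technology Major Project "Core Electronic Devices, High-End Generic Chips and Basic Software" (2009ZX01036-001-001-2)
[Classification number]: TP332.11
Article ID: 2493961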
Article link: https://www.wllwen.com/kejilunwen/jisuanjikexuelunwen/2493961.html