Design of a High-Performance Double-Precision Floating-Point Divider Based on the Goldschmidt Algorithm

Published: 2018-04-28 18:42

Topics: floating-point divider + Goldschmidt algorithm; Source: Journal of Computer Applications (《计算机应用》), 2015, Issue 07

【Abstract】: Double-precision floating-point division typically involves a complex computation and a long latency. To address this, a method is proposed for designing a high-performance double-precision floating-point divider that is based on the Goldschmidt algorithm and supports the IEEE-754 standard. First, the division process of the Goldschmidt algorithm and the error introduced by its iterations are analyzed. Then, a method for controlling that error is proposed. Next, an area-efficient double look-up table method is used to determine the initial iteration value, and the iteration unit adopts a parallel multiplier structure to speed up each iteration. Finally, the pipeline stages are partitioned appropriately and the iteration process is controlled so that floating-point divisions can execute in a pipelined fashion, which further increases the divider's operation rate. Experimental results show that, in a 40 nm process, the double-precision floating-point divider with a pipelined structure and a 14-bit initial iteration value has a synthesized cell area of 84 902.261 8 μm² and runs at up to 2.2 GHz. Compared with the pipelined structure using an 8-bit initial iteration value, operation speed is 32.73% higher and area is 5.05% larger. A single double-precision floating-point division has a latency of 12 clock cycles; in pipelined execution, the average latency per division is 3 clock cycles. Compared with double-precision floating-point dividers based on the SRT algorithm in other processors, data throughput is 3 to 7 times higher; compared with double-precision floating-point dividers based on the Goldschmidt algorithm in other processors, data throughput is 2 to 3 times higher.
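The Goldschmidt iteration that the abstract refers to can be illustrated in software. The following is a minimal sketch in ordinary double-precision arithmetic, not the paper's hardware design: the function names `reciprocal_seed` and `goldschmidt_divide` are hypothetical, the seed is computed directly as a stand-in for a look-up table (the paper's double look-up table scheme is not reproduced), and operands are assumed to be significands already normalized to [1, 2). Numerator and denominator are multiplied by the same factor until the denominator converges to 1, at which point the numerator approximates the quotient.

```python
def reciprocal_seed(d: float, bits: int) -> float:
    """Hypothetical stand-in for a reciprocal look-up table.

    Uses the top `bits` fraction bits of d (d in [1, 2)) as the table
    index and returns the reciprocal of that interval's midpoint, so
    that d * seed differs from 1 by at most 2**-(bits + 1).
    """
    scale = 1 << bits
    index = int((d - 1.0) * scale)          # table index from leading bits
    midpoint = 1.0 + (index + 0.5) / scale  # center of the indexed interval
    return 1.0 / midpoint


def goldschmidt_divide(n: float, d: float,
                       seed_bits: int = 14, iterations: int = 2) -> float:
    """Approximate n / d for normalized significands n, d in [1, 2)."""
    if not (1.0 <= n < 2.0 and 1.0 <= d < 2.0):
        raise ValueError("operands must be normalized to [1, 2)")
    f = reciprocal_seed(d, seed_bits)
    q = n * f   # running numerator, converges to n / d
    r = d * f   # running denominator, converges to 1
    for _ in range(iterations):
        f = 2.0 - r   # correction factor; if r = 1 - e, then r * f = 1 - e**2
        q *= f        # the two multiplications are independent of each other,
        r *= f        # so hardware can issue them to parallel multipliers
    return q


if __name__ == "__main__":
    print(goldschmidt_divide(1.5, 1.25))                             # 14-bit seed
    print(goldschmidt_divide(1.5, 1.25, seed_bits=8, iterations=3))  # 8-bit seed
```

Because the error in the denominator is squared on every pass, a seed accurate to about 2^-15 reaches roughly 2^-60 after two iterations, while a seed accurate to about 2^-9 needs a third iteration to pass the 53-bit significand of a double. This is one plausible reading of why a wider initial value can shorten the computation; the sketch performs no IEEE-754 rounding, exponent handling, or special-case processing.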
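【Affiliation】: College of Computer, National University of Defense Technology
【Funding】: Hunan Provincial Key Discipline Construction Project (434515000008); Aeronautical Science Foundation (2013zc88003); National Natural Science Foundation of China (61402499)
【Classification No.】: TP332.22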
