余数系统中算法单元及关键技术研究
发布时间:2018-03-18 13:05
本文选题:余数系统 切入点:无损动态扩展方法 出处:《电子科技大学》2017年硕士论文 论文类型:学位论文
【摘要】:和传统的使用硬件规模带来的并行性的方式不同,RNS是一种使用并行的数值表征系统达到天然并行的性的方法。近些年来以RSA、ECC为代表的公钥算法广泛使用了 RNS架构以提升加密的速度。随着秘钥长度的增加,对于RNS下算法单元尤其是模2~n-2~p -1下乘法单元的位宽和性能提出了更高的要求。另外后向转换作为RNS架构下的难点,一直是RNS架构的性能瓶颈之一,{2~n-1、2~n+3、2~n+1、2~n-3}下快速后向转换算法被提出后,由于模2~n+3下乘法单元的性能问题而少有应用。本文结合余数系统在密码学应用和快速后向转换等方向的最新进展,针对相应的特殊余数基下的算法单元进行了深入的研究。通过对大量模2~n±1乘法器相关参考文献进行分类研究,对适合本文设计要求的架构进行了筛选。本文根据模修正在二进制乘法器介入的部位不同将模乘法器分为四种类型,并按照数据流的执行顺序进行了命名,分别为修正1型、修正2型、修正3型以及修正4型。按照本文的分类方式,选择形式为2~n-2~p-1、2~n+3余数基下的算法单元做了如下工作:1.针对2~n -2~p -1类型动态范围利用率较低的问题,使用现有文献中的扩展方法,对模2~n-2~p-1加法运算的相关公式进行推导,降低了其约束条件并且将修正所需的判定项由A+B + T化简为A + B。从而提出了一种适合2~n-2~p-1余数基下算法单元应用的无损动态扩展方法。使用无损动态范围扩展后的模2~n -2~p -1加法器和乘法器较同种结构的算法单元在“面积X时延”效率上分别提高了 41.1%和12.3% 。2.提出了一种修正1型模2~n-2~p-1乘法器,利用模2~n-2~p-1的性质,将包含布斯编码选择项的部分积中可能出现的负数问题进行修正。经过公式推导,将未修正部分积和修正项进行合理的分离,使未修正部分积可以最大程度利用二进制布斯编码部分积生成的相关电路,最终得到了规整的部分积。这种结构较传统的通用型模乘法在平均时延和平均面积上分别减少了 49.1%和6.5%。3.针对修正3型提出了两种不同结构的模2~n-2~p-1乘法器,在布斯结构中通过预先计算解决了将2~n位有符号补码表示的部分积较难修正至n位的问题。进而获得了一个与(n,p)取值组合无关的统一结构。在TDM结构中首次将TDM算法引入到模乘法器的设计当中,推导得到了一种较使用布斯编码更为精简的快速型结构。这两种模2~n -2~p -1乘法器在平均时延上较当前的模2~n- 2~p-1乘法器最高可以降低9.7%。4.针对修正4型对2~n -2~p -1、2~n+3两种余数基分别提出了两种乘法器结构。在模2~n -2~p -1乘法器中,其关键路径上使用一级CSA代替了原来的一个p位二进制进位传播加法器,缩短了关键路径的时延。在模2~n+3乘法器,利用模2~n+3性质在部分积进行压缩时产生“+9”和“-9”进行抵消,节约了额外处理“-9”所需要的硬件资源。降低了关键路径的时延。两种结构较当前的同类型模乘法器在“面积×时延”效率上分别提高了 12.4%和9.0%。在本文的最后,为方便对本文提出的几种算法单元进行设计实现、测试及数据处理。本文构建了一套自动化的处理平台,并给出了 Design Compiler (DC)综合后的数据。
[Abstract]:And the use of traditional hardware from scale parallelism in different ways, RNS is a parallel numerical characterization system to achieve the method of natural parallel. In recent years, with RSA, ECC as the representative of the public key algorithm widely used RNS framework to improve the encryption speed. With the increase of the length of the secret key, for RNS algorithm unit especially put forward higher requirements and performance under -1 mode 2~n-2~p bit multiplication unit. In addition to conversion as the difficulty of RNS architecture, has been one of the bottlenecks of RNS architecture, {2~n-1,2~n+3,2~n+1,2~n-3} fast backward conversion algorithm is proposed, due to performance problems under 2~n+3 mode multiplication unit and fewer the application system in the application of cryptography. The remainder and fast after conversion to the direction of the latest progress in this paper, according to the special remainder based algorithm unit under the studied by. Research on the classification of a large number of 2~n + 1 multiplier related references, for the design requirements of the architecture were screened. Based on the die is involved in different parts of the binary multiplier multiplier is divided into four types, and in accordance with the execution order of data flow in the name, type 1 were modified, modified type 2, type 3 and type 4 correction correction. According to the classification of this paper, the following work has been done to choose form for the 2~n-2~p-1,2~n+3 remainder based algorithm unit: 1. for the problem of low efficiency of using 2~n -2~p -1 type dynamic range, the use of extension methods available in the literature, the related formula of 2~n-2~p-1 addition to die is reducing its constraints and will be required to determine the correction by A+B + T A + B. for simplification and propose a suitable 2~n-2~p-1 based lossless remainder dynamic algorithm application expansion unit Methods. Using lossless algorithm unit dynamic range expansion mode after the 2~n -2~p -1 adder and multiplier structure in the area is the same X delay efficiency were increased by 41.1% and 12.3%,.2. proposed a type 1 model 2~n-2~p-1 multiplier correction, using properties of 2~n-2~p-1 mode, will contain the negative part of booth encoding selection product might appear in the revised. After the deduction of formula, the modified partial product and correction of separation, the modified partial product can maximize the use of binary booth encoding partial product generation related circuit, finally got a regular part of the product. A universal modular multiplication of this structure when compared with the traditional average the average delay and area were reduced by 49.1% and 6.5%.3. for the correction of type 3 proposed two kinds of mode 2~n-2~p-1 multiplier in booth structure by pre computing To solve the 2~n bit'signed complement representation the partial product is difficult to fix to the N problem. And then obtained a (n, P) unified structure corresponds to the combination of independent. For the first time in the TDM structure and the TDM algorithm is introduced to design the multiplier, deduced a cloth. Encoding is more rapid simplified structure. These two kinds of mode 2~n -2~p -1 multiplier in average delay is highest modulo 2~n- 2~p-1 multiplier current can reduce the 9.7%.4. of 2~n -2~p for the modified type 4 -1,2~n+3 two residue base are proposed for two. In the 2~n -2~p mode multiplier -1 multiplier, the critical path. Use a CSA instead of a p bit binary carry propagation adder, shorten the critical path delay. In 2~n+3 multiplier, using 2~n+3 compression properties in the partial product produced "+9" and "-9" are offset, saving The additional "-9" needed hardware resources. Reduce the critical path delay. Two kinds of structure compared with the same type of current mode multiplier in "area X delay" efficiency were increased by 12.4% and 9.0%. in the end of this article, in the design and implementation of several algorithms for unit convenient for this, and test data processing. This paper constructs a set of automated processing platform, and gives the Design Compiler (DC) after comprehensive data.
【学位授予单位】:电子科技大学
【学位级别】:硕士
【学位授予年份】:2017
【分类号】:TP332.2
【参考文献】
相关期刊论文 前1条
1 魏少军;;2015年中国集成电路设计业的发展情况[J];集成电路应用;2016年01期
相关博士学位论文 前1条
1 马上;基于余数系统的数字信号处理VLSI实现关键技术研究[D];电子科技大学;2009年
相关硕士学位论文 前4条
1 严海;余数系统中模加和模乘单元设计[D];电子科技大学;2016年
2 李成冬;基于余数系统的RSA加密运算电路的设计[D];华南理工大学;2015年
3 赵英旭;余数系统中余数基构建和数值缩放的研究[D];电子科技大学;2014年
4 周璐;余数系统中模加和模乘单元的设计[D];电子科技大学;2013年
,本文编号:1629696
本文链接:https://www.wllwen.com/kejilunwen/jisuanjikexuelunwen/1629696.html