Architecture and Circuit Implementation of a 65nm High-Performance SRAM
Published: 2018-08-12 14:17
[Abstract]: Embedded SRAM is an important component of SoC chips, and its performance limits the overall performance that a high-performance SoC can reach. In recent years, advances in integrated circuit design methods, EDA technology, and manufacturing processes have greatly improved the speed, density, and power consumption of embedded SRAM. Microprocessor speed, however, has improved faster than SRAM performance, so further improvement of SRAM performance remains an urgent need for high-performance SoCs.
Supported by the national "Core Electronic Devices, High-end Generic Chips and Basic Software" (核高基) major project "Research on Key Technologies of Embedded CPU SRAM Compilers", this thesis implements a 16Kb high-performance SRAM in the SMIC 65nm process. The design targets are a read-out delay (Tcq) below 800ps and an area below 28826.512 μm² at 1.2V, the typical process corner, and room temperature. To meet them, the thesis optimizes the overall SRAM architecture, the high-performance decoding circuit, the generation of precise timing signals, and the area.
First, the thesis analyzes in detail the advantages, disadvantages, and applicable conditions of existing SRAM architecture design methods. Based on this analysis and on the characteristics of the 16Kb SRAM, it adopts an architecture that partitions the memory array. To choose the best partitioning, two partitioning schemes are simulated and compared in performance and area, and the better one is selected.
Second, because an accurate SRAM timing generation circuit can effectively raise the overall operating speed of the SRAM and reduce its power consumption, the thesis analyzes accurate timing generation circuits in depth. Early designs implemented timing control with an inverter chain, whose delay cannot effectively track the read discharge delay of the memory cell. The problem becomes more pronounced in deep-submicron processes as process variation grows. To overcome this defect, capacitance-ratio and current-ratio replica bit-line techniques were proposed. Both use a redundant replica column and replica cells to mimic the read operation of a memory cell and generate the SRAM control signals. The cells in the replica column are identical to those in the memory array, so the parasitic capacitance of the replica column matches that of the array bit line, and the read current of a replica cell matches that of the cell being read. The replica path can therefore track the SRAM read discharge delay accurately. Both techniques, however, guarantee accurate timing only at a fixed supply voltage.
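A first-order model, offered as an illustrative sketch and not as the thesis's own derivation, shows why a replica path built from identical cells tracks the read delay. All symbols are assumptions introduced here: C_BL and C_RBL are the capacitances of the array bit line and the replica bit line, I_cell is the cell read current, ΔV_SA is the bit-line swing the sense amplifier needs, V_trip is the swing at which the replica detector switches, and k is the number of replica cells discharging the replica bit line.

```latex
\begin{align}
t_{\mathrm{BL}}  &= \frac{C_{\mathrm{BL}}\,\Delta V_{\mathrm{SA}}}{I_{\mathrm{cell}}}, \\
t_{\mathrm{RBL}} &= \frac{C_{\mathrm{RBL}}\,V_{\mathrm{trip}}}{k\,I_{\mathrm{cell}}}, \\
t_{\mathrm{RBL}} = t_{\mathrm{BL}}
  \;&\Longleftrightarrow\;
  \frac{C_{\mathrm{RBL}}}{k\,C_{\mathrm{BL}}} = \frac{\Delta V_{\mathrm{SA}}}{V_{\mathrm{trip}}}.
\end{align}
```

Because I_cell appears in both delays, it cancels in the matching condition, so a process shift in cell current moves both delays together. In this simplified picture a current-ratio design would keep C_RBL equal to C_BL and choose k, while a capacitance-ratio design would keep k = 1 and scale C_RBL. The condition depends on the ratio ΔV_SA / V_trip, so a ratio fixed at design time matches exactly only where that voltage ratio holds.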
When the SRAM operates over a range of supply voltages, timing control based on the capacitance-ratio or current-ratio technique runs into a problem: the bit-line discharge delay increases as the supply voltage changes, which degrades SRAM performance. For SRAMs that work over such a voltage range, this thesis proposes a programmable replica bit-line technique that keeps timing generation accurate at every operating voltage. Simulation and test results both show that the proposed technique improves SRAM performance.
Third, after analyzing and comparing the structures and characteristics of existing decoding circuits, the thesis selects fully static decoding logic for the 16Kb SRAM. The transistors in the decoder are sized with the method of logical effort, which determines the gate fan-out that gives the best delay in the 65nm process. In a 65nm process, wire delay is comparable to gate delay, especially in an SRAM where the path from the predecoder to the second-level decoder crosses a long interconnect. The thesis therefore discusses a logic path design method that includes interconnect delay, and uses it to implement the high-speed decoding circuit.
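The logical-effort sizing mentioned for the decoder can be sketched in a few lines of Python. This is the textbook method under stated assumptions, not the thesis's actual sizing flow: the function names, the example path (two NAND2 gates, branching of 8, load-to-input capacitance ratio of 18), and the parasitic delay of 1 per stage are all hypothetical, and interconnect delay, which the thesis does account for, is left out.

```python
import math

def path_effort(logical_efforts, branching_effort, electrical_effort):
    """Path effort F = G * B * H, with G the product of gate logical efforts."""
    return math.prod(logical_efforts) * branching_effort * electrical_effort

def path_delay(effort, n_stages, parasitic_per_stage=1.0):
    """Delay in units of tau when the path effort is split equally over n_stages."""
    stage_effort = effort ** (1.0 / n_stages)
    return n_stages * (stage_effort + parasitic_per_stage)

def best_stage_count(effort, parasitic_per_stage=1.0, max_stages=12):
    """Stage count in 1..max_stages that minimizes the modeled path delay."""
    return min(range(1, max_stages + 1),
               key=lambda n: path_delay(effort, n, parasitic_per_stage))

# Hypothetical decoder path: two NAND2 gates (g = 4/3 each),
# total branching effort 8, C_load / C_in = 18.
F = path_effort([4 / 3, 4 / 3], branching_effort=8, electrical_effort=18)
N = best_stage_count(F)
print(f"F = {F:.1f}, stages = {N}, effort per stage = {F ** (1 / N):.2f}")
```

For these example numbers the path effort is 256 and the search settles on four stages with a stage effort of about 4, the familiar fan-out-of-4 rule of thumb. A real 65nm design would calibrate the logical efforts and parasitic delays from simulation.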
The 16Kb SRAM implemented in this thesis has a post-layout simulated read-out delay of 540ps at the typical voltage, which meets the design target. Tape-out test results in the SMIC 65nm process show that the SRAM operates over a supply voltage range of 0.8V-1.4V, with an operating frequency range of 440MHz-1.62GHz. At the typical supply voltage of 1.2V and room temperature, the SRAM runs at 1.22GHz, and its area is 22762.76 μm², well below the required 28826.512 μm². To verify the programmable replica bit-line technique, the thesis compares SRAMs built with the new technique and with the current-ratio replica bit-line technique. Across the supply voltage range, the maximum operating frequency with the new technique is 4.3%-9.5% higher than with the current-ratio technique.
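The reported figures can be put in perspective with a little arithmetic. The sketch below only restates numbers quoted in the abstract; the helper names are hypothetical, and converting each reported frequency to a clock period is a plain unit conversion, not a result from the thesis.

```python
# Figures quoted in the abstract (1.2 V, typical corner, room temperature).
AREA_BUDGET_UM2 = 28826.512
AREA_MEASURED_UM2 = 22762.76
TCQ_BUDGET_PS = 800.0
TCQ_POST_LAYOUT_PS = 540.0

def margin_pct(budget, actual):
    """Percentage by which `actual` comes in under `budget`."""
    return (budget - actual) / budget * 100.0

def cycle_time_ps(freq_ghz):
    """Clock period in picoseconds for a frequency given in GHz."""
    return 1000.0 / freq_ghz

print(f"area margin: {margin_pct(AREA_BUDGET_UM2, AREA_MEASURED_UM2):.1f} %")
print(f"Tcq margin:  {margin_pct(TCQ_BUDGET_PS, TCQ_POST_LAYOUT_PS):.1f} %")
for f_ghz in (0.44, 1.22, 1.62):
    print(f"{f_ghz:.2f} GHz -> {cycle_time_ps(f_ghz):.0f} ps per cycle")
```

The area comes in about 21% under budget and the simulated read-out delay 32.5% under its limit. The measured 1.22GHz at 1.2V corresponds to a cycle time of roughly 820ps.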
[Degree-granting institution]: Anhui University
[Degree level]: Master's
[Year degree granted]: 2012
[Classification number]: TP333
Article ID: 2179310
Article link: https://www.wllwen.com/kejilunwen/jisuanjikexuelunwen/2179310.html