Resource Analysis and Implementation of Pipelined FFT Architectures

Published: 2018-06-16 08:50

Topic: SDF + FFT processor; Source: Beijing Institute of Technology, 2014 doctoral dissertation

[Abstract]: We live in an era of rapid technological change, driven largely by digital systems that are steadily replacing older analog systems while continuing to advance quickly. Today, digital systems face the most demanding signal processing applications, which impose tight constraints on clock frequency, throughput, power consumption, latency, and real-time computation. Meeting these constraints usually requires hardware devices such as ASICs (application-specific integrated circuits) or FPGAs (field-programmable gate arrays), which must deliver very high performance when computing signal processing algorithms. Designing hardware circuits, however, is not simple. First, hardware differs from conventional mathematics: a mathematical simplification of an algorithm does not necessarily lead to a simpler circuit. Second, any algorithm admits many different hardware implementations, ranging from a single memory and a single processor that carry out all of the computation to a circuit that implements the entire algorithm directly, with each multiplication mapped to a multiplier and each addition mapped to an adder. Third, efficiency is always an implicit requirement in hardware design, and signal processing requirements such as clock frequency, latency, and throughput must be taken into account. The form of the circuit used to compute a particular algorithm depends mainly on the required performance. Directly implementing the flow graph can yield a circuit with high throughput, but its area and power consumption are very high. A system built from a memory and a processing unit occupies less area, but it has high latency and low throughput. Pipelined architectures are therefore often preferred, because they offer high processing capability at fairly low hardware cost.
Efficiency also involves more than choosing a type of architecture: the chosen architecture must itself be well designed. Hardware implementations are intended for applications that need to maximize performance or minimize power consumption, so the architecture must be optimized for those goals. This thesis studies the optimal implementation of FFT architectures on FPGAs. The SDF (single-path delay feedback) architecture is regarded as an optimal choice because it meets the requirements of most communication systems. Particular attention is paid to mapping the design efficiently onto the coarse-grained hardware blocks of the target FPGA, which leads to better implementation results. This is illustrated by mapping a radix-2 SDF FFT processor onto Virtex-4 and Virtex-6 devices. The FPGA mapping of this design has already been explored in detail in earlier work; this thesis nevertheless proposes a transformation of the algorithm that maps better, and reports implementation results that far exceed previously published work. In addition, different equivalent radix-2^2 algorithms are simulated; they have the same implementation complexity but may require fewer transitions between successive twiddle factor coefficients. Alternative rotators are also compared to determine which requires the fewest additions for a given rotation angle.
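For readers unfamiliar with the radix-2 SDF structure named in the abstract, the following is a minimal behavioral sketch in Python. It is not the thesis's design or its FPGA mapping: the names `SdfStage` and `sdf_fft` are hypothetical, the model assumes a decimation-in-frequency data flow with floating-point arithmetic, and it only illustrates how each stage reuses one feedback delay line for both buffering and butterfly results.

```python
import cmath
from collections import deque


class SdfStage:
    """One radix-2 DIF stage with a feedback delay line (hypothetical model)."""

    def __init__(self, depth):
        self.depth = depth                    # delay-line length for this stage
        self.fifo = deque([0j] * depth)       # feedback delay line
        self.count = 0                        # position inside a block of 2*depth samples

    def step(self, x):
        delayed = self.fifo.popleft()
        if self.count < self.depth:
            # First half of the block: buffer the input and emit the
            # difference stored during the previous block, rotated by the twiddle factor.
            self.fifo.append(x)
            w = cmath.exp(-2j * cmath.pi * self.count / (2 * self.depth))
            y = delayed * w
        else:
            # Second half: butterfly. Emit the sum, feed the difference back.
            self.fifo.append(delayed - x)
            y = delayed + x
        self.count = (self.count + 1) % (2 * self.depth)
        return y


def sdf_fft(samples):
    """Stream one block through log2(N) stages and return the DFT in natural order."""
    n = len(samples)
    assert n >= 2 and n & (n - 1) == 0, "length must be a power of two"
    stages = []
    depth = n // 2
    while depth >= 1:
        stages.append(SdfStage(depth))
        depth //= 2
    latency = n - 1                           # sum of all delay-line lengths
    stream = list(samples) + [0j] * latency   # zeros flush the pipeline
    out = []
    for x in stream:
        for stage in stages:                  # one sample passes every stage per cycle
            x = stage.step(x)
        out.append(x)
    raw = out[latency:]                       # N results in bit-reversed order
    bits = n.bit_length() - 1
    result = [0j] * n
    for i, value in enumerate(raw):
        result[int(format(i, f"0{bits}b")[::-1], 2)] = value
    return result
```

In this sketch the delay lines have lengths N/2, N/4, ..., 1, so the model stores N - 1 samples in total and needs one butterfly per stage, which is the low-memory property usually associated with SDF pipelines. Calling `sdf_fft(x)` on a power-of-two-length list should agree with a direct DFT of `x` up to floating-point rounding.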
[Degree-granting institution]: Beijing Institute of Technology
[Degree level]: Doctoral
[Year degree granted]: 2014
[Classification number]: TN911.7

[Similar Literature]