Research and Implementation of Parallelized Turbo Encoding and Decoding Algorithms for LTE-A Systems Based on FPGA

Published: 2018-01-22 15:52

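Keywords: LTE-A; Turbo; parallel encoding; parallel decoding; FPGA implementation. Source: University of Electronic Science and Technology of China, 2014 master's thesis. Document type: degree thesis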

[Abstract]: As service demands grow, mobile communication systems are expected to deliver ever higher bandwidth, so choosing a channel code with good decoding performance to improve transmission reliability is very important. After a long evaluation process, 3GPP selected Turbo codes as the channel coding scheme for LTE-A, the fourth-generation mobile communication system. On December 4, 2013, the Ministry of Industry and Information Technology of China issued 4G licenses to the three major operators, which suggests that research on Turbo codes in LTE-A systems will attract increasing attention. This thesis analyzes the Turbo encoding and decoding algorithms used in LTE-A. The encoder can be designed directly from the LTE-A specification. For decoding, the thesis first studies the basic principles of high-order soft demodulation algorithms for LTE-A, then analyzes and compares the principles, performance, and complexity of four decoding algorithms (SOVA, MAP, LOG-MAP, and MAX-LOG-MAP). MAX-LOG-MAP is chosen for hardware implementation because it sacrifices only a small amount of performance. The thesis also focuses on parallelizing both the encoding and decoding algorithms. A conventional encoder is built from a shift register whose state at any time depends on the preceding input bits, so it cannot be parallelized to raise throughput; the thesis solves this with an implementation based on a lookup table. The thesis further analyzes the contention-free (no address contention, no access conflict) property of the QPP interleaver, and on that basis analyzes parallel interleaving and deinterleaving for decoding and simulates how different degrees of parallelism affect MAX-LOG-MAP decoding performance. Thanks to a very effective initialization strategy for the sub-decoders, parallel MAX-LOG-MAP decoding loses little performance relative to serial MAX-LOG-MAP decoding (for a frame length of 40 bits, the loss at a parallelism of 8 is only about 0.7 dB). Building on this algorithm analysis, the thesis designs parallel architectures (parallelism of 8) for the LTE-A Turbo encoder and Turbo decoder, and describes in detail the interfaces, core circuits, simulation results, and overall hardware resource consumption of each sub-module. Finally, the hardware design is tested at board level on an Altera DE4 (chip type: EP4S40G5H40I2). The Turbo encoder occupies less than 1% of the device resources (477 combinational logic units, less than 1%; 762 registers, less than 1%), reaches a maximum clock frequency of 315.06 MHz, and achieves a maximum throughput of 2.52 Gbit/s. The Turbo decoder occupies 15% of the device resources (47084 combinational logic units, 11%; 4826 memory units, 2%; 54251 registers, 13%), reaches a maximum clock frequency of 175.87 MHz, and achieves a maximum throughput of 175.87 Mbit/s.
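The contention-free property of the QPP interleaver mentioned in the abstract can be illustrated with a small check. The sketch below is not taken from the thesis. It assumes that the frame of length K is split into P equal windows of length W = K/P, that sub-decoder j processes position t + j*W at step t, and that the memory bank of an address is simply address // W. The (K, f1, f2) pairs are quoted from memory of 3GPP TS 36.212 Table 5.1.3-3 and should be verified against the specification; the function names are hypothetical.

```python
def qpp(i, K, f1, f2):
    """QPP interleaver address: pi(i) = (f1*i + f2*i^2) mod K."""
    return (f1 * i + f2 * i * i) % K


def is_contention_free(K, f1, f2, P):
    """Return True if P sub-decoders never access the same bank in one step.

    Assumed memory layout: bank index = interleaved address // W.
    """
    if K % P != 0:
        raise ValueError("P must divide K")
    W = K // P  # window length handled by each sub-decoder
    for t in range(W):
        # At step t, sub-decoder j needs the interleaved address pi(t + j*W).
        banks = {qpp(t + j * W, K, f1, f2) // W for j in range(P)}
        if len(banks) != P:
            return False
    return True


# (K, f1, f2) pairs as recalled from TS 36.212; an assumption to verify.
PARAMS = {40: (3, 10), 6144: (263, 480)}

if __name__ == "__main__":
    for K, (f1, f2) in PARAMS.items():
        is_perm = sorted(qpp(i, K, f1, f2) for i in range(K)) == list(range(K))
        print(K, is_perm, is_contention_free(K, f1, f2, 8))
```

With a parallelism of 8, as in the thesis design, each of the 8 sub-decoders then reads a different bank at every step, so the interleaved reads need no arbitration under the assumed memory layout.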
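[Degree-granting institution]: University of Electronic Science and Technology of China
[Degree level]: Master's
[Year degree granted]: 2014
[Classification number]: TN929.5;TN911.22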

[References]