Optimization and Parallel Design of HEVC-Based Intra Prediction Algorithms

Published: 2018-06-04 02:44

Topic: HEVC + intra prediction; Source: Xi'an University of Posts and Telecommunications, 2016 master's thesis

[Abstract]: The new-generation High Efficiency Video Coding (HEVC) standard was first proposed in January 2010 by the Joint Collaborative Team on Video Coding (JCT-VC). Its core goal is to double the compression efficiency of H.264/AVC. To reach this goal, HEVC has to adopt more complex encoding and decoding algorithms, which brings very high computational complexity. Building on an in-depth study of the intra prediction algorithm, this thesis proposes two optimizations that target block partitioning and prediction mode selection respectively: a fast coding unit (CU) partitioning algorithm based on rate-distortion optimization, suited to scenarios that demand high video quality, and a fast intra prediction mode selection algorithm based on mode grouping, suited to scenarios that demand real-time performance. Both algorithms effectively reduce computational complexity and improve intra coding efficiency. In addition, considering how the serial operation of intra prediction in HEVC and the sequential processing of its 35 prediction modes affect prediction coding time and performance, this thesis proposes parallelization schemes for reference pixel smoothing and for fast prediction mode selection on DPR-CODEC, a dynamically programmable, reconfigurable array processor for video encoding and decoding developed in-house at Xi'an University of Posts and Telecommunications. These schemes effectively reduce the number of clock cycles needed for data loading when a single processing element operates serially, as well as the total time needed for mode prediction, and so improve computational efficiency. The specific work is as follows:
1. Improved HEVC coding unit partitioning: a statistical study of rate-distortion (RD) costs at different depths shows that CUs that are not split tend to have small RD costs, whereas CUs that are split have larger RD costs that are distributed fairly evenly. Based on this property, thresholds are derived statistically from the RD cost probability distributions under different quantization parameters (QPs), yielding threshold equations for each depth in the partitioning of the largest coding unit (Largest CU, LCU). These thresholds are used to terminate CU partitioning early and thereby reduce computational complexity. Experiments show that, compared with the HEVC test model HM10.0, the improved method increases the bit rate by 0.5%, lowers Y-PSNR by 0.019 dB, and reduces encoding time by 26.7%.
2. A fast intra prediction mode selection algorithm based on mode grouping: exploiting the strong correlation between the first-ranked mode in the candidate mode set and the optimal prediction mode, the 35 prediction modes go through an initial screening and a second screening to quickly and accurately find the candidate most likely to be the optimal prediction mode. This greatly reduces the number of modes entering the rate-distortion optimization (RDO) process and effectively lowers the computational complexity of the original intra prediction coding algorithm. Experiments show that the algorithm reduces encoding time by 41.8% while keeping video quality and bit rate essentially unchanged.
3. Parallel design of reference pixel smoothing: the HEVC test model is designed for single-processor systems. When reference pixel smoothing runs serially, the filtering of each pixel is held up by the data loading of the others, so the processor cannot work quickly. This thesis therefore loads all relevant pixels at once and then filters them together, completing a parallel design for reference pixel smoothing. Simulation shows that this scheme achieves a serial-to-parallel speedup of 14.43.
4. Parallel design of the fast prediction mode selection algorithm: given the resource limits and computational efficiency of DPR-CODEC, intra prediction uses the correlation between the prediction direction and the dominant texture direction of the image to select the modes whose prediction directions are most likely to occur. The parallelization works as follows: each cluster performs prediction for the 16 pixels of a prediction block simultaneously, and each processing element (PE) independently computes 12 modes. This removes the mutual waiting between prediction mode computations in serial operation and allows multiple pixels to be processed in parallel during mode selection. Simulation results show that the parallel mode prediction scheme achieves a serial-to-parallel speedup of 7.60, improving computational efficiency.
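The load-then-filter idea for reference pixel smoothing described in the abstract can be illustrated with a minimal Python sketch. It assumes the 3-tap [1, 2, 1] / 4 smoothing filter with rounding that HEVC applies to intra reference samples, with the two end samples left unfiltered. The function name `smooth_reference_samples` is hypothetical, and the sketch only shows why the per-sample computations become independent once all inputs are loaded; it is not the DPR-CODEC implementation from the thesis.

```python
def smooth_reference_samples(ref):
    """Smooth a 1-D list of integer reference samples with a [1, 2, 1] / 4 filter.

    Hypothetical helper for illustration only. The first and last samples
    are copied unchanged, as in HEVC reference sample smoothing.
    """
    n = len(ref)
    if n < 3:
        # Too short to have any interior sample: nothing to filter.
        return list(ref)

    # Phase 1: load every input sample once, before any filtering starts.
    loaded = list(ref)

    # Phase 2: each output reads only from `loaded`, never from another
    # output, so all interior samples could be computed at the same time
    # (for example, one per processing element).
    interior = [
        (loaded[i - 1] + 2 * loaded[i] + loaded[i + 1] + 2) >> 2
        for i in range(1, n - 1)
    ]
    return [loaded[0]] + interior + [loaded[-1]]


if __name__ == "__main__":
    print(smooth_reference_samples([0, 8, 0, 8]))  # [0, 4, 4, 8]
```

Separating the load phase from the compute phase is what removes the dependency between neighbouring filter operations: in a serial loop that fetches samples on demand, each output waits for its own loads, whereas here every output is a pure function of the already-loaded buffer.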
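[Degree-granting institution]: Xi'an University of Posts and Telecommunications
[Degree level]: Master's
[Year degree awarded]: 2016
[Classification number]: TN919.81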

[Similar literature]