Research on Video Coding Based on Multi-Layer Structures

Published: 2018-02-11 12:08

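Keywords: multi-layer coding structure; scalable video coding; inter-layer correlation; inter-layer intra mode prediction; inter-layer motion information prediction; random access; video content analysis; clustering　Source: Zhejiang University, doctoral dissertation, 2017　Document type: degree thesis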

[Abstract]: A multi-layer video coding structure encodes a video into several layers and uses inter-layer prediction to remove the redundancy between them. In the conventional multi-layer structure, the pictures of the base layer and the enhancement layer correspond one to one; this structure is used mainly for spatial scalability and quality scalability in scalable video coding. This thesis proposes a new multi-layer structure in which the base layer consists of a small number of pictures, extracted from the video, that carry commonly shared information, while the enhancement layer is the complete video sequence. With this structure, the overall coding efficiency is higher than that of coding the enhancement layer alone. In other words, the new structure can be used for more efficient video coding. The thesis studies both scalable video coding based on the conventional multi-layer structure and more efficient video coding based on the new structure, and makes the following contributions.

1. A new multi-layer coding structure based on a knowledge base. In the conventional structure, base layer pictures and enhancement layer pictures correspond one to one. The new structure instead starts from an analysis of the video content: it extracts a small number of representative pictures to form the base layer, then uses inter-layer prediction so that the overall coding efficiency exceeds that of coding the enhancement layer alone. Because base layer pictures must be stored for a long time during encoding and decoding so that the enhancement layer can reference them, the thesis proposes a knowledge-base-based video coding framework and, within it, solves two main technical problems.

First, it studies how to construct the knowledge-base base layer and proposes two construction methods. The first method takes the scene-change pictures and random access pictures of each scene as key pictures, then uses clustering to remove the key pictures that belong to repeated scenes, which yields the knowledge-base pictures. This keeps the base layer bitrate as low as possible while keeping its correlation with the video to be coded as high as possible, which benefits coding efficiency. The second method applies SIFT-based repeated-scene detection and removal to the scene-change pictures of each scene to form the base layer, then adds further knowledge-base pictures inside each scene according to the accumulated content change. This method can run in step with video encoding and suits real-time coding applications.

Second, it proposes a coding method that uses the knowledge-base pictures. To improve coding efficiency while preserving random access, knowledge-base pictures are coded in all-intra mode, and the knowledge base is not cleared at random access points. During encoding and decoding, each random access segment is coded with reference to its most similar knowledge-base picture. That picture is found using the color histogram difference as the similarity criterion, which is simple and efficient and preserves the prediction efficiency of the knowledge-base pictures.

2. Coding methods under the conventional multi-layer coding structure. The thesis studies inter-layer prediction techniques for the conventional structure, mainly inter-layer intra mode prediction and inter-layer motion information prediction. These make full use of the correlation between layers in intra modes and motion information, which improves the coding efficiency of the enhancement layer and reduces its coding complexity.
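The abstract names the color histogram difference as the similarity criterion for choosing a knowledge-base picture. The following is only a minimal sketch of that idea, not the thesis's implementation: the bin count, the L1 distance, the representation of a random access segment by the pixels of a single picture, and all function names are assumptions made for illustration.

```python
from typing import List, Sequence, Tuple

Pixel = Tuple[int, int, int]  # assumed format: (R, G, B), each in 0..255


def color_histogram(pixels: Sequence[Pixel], bins: int = 8) -> List[float]:
    """Normalized per-channel histogram, concatenated as R | G | B."""
    hist = [0.0] * (3 * bins)
    for pixel in pixels:
        for channel, value in enumerate(pixel):
            # Map 0..255 onto bin indices 0..bins-1.
            hist[channel * bins + value * bins // 256] += 1.0
    total = float(len(pixels)) or 1.0  # avoid dividing by zero on empty input
    return [count / total for count in hist]


def histogram_difference(h1: Sequence[float], h2: Sequence[float]) -> float:
    """Sum of absolute bin differences (L1 distance, an assumed choice)."""
    return sum(abs(a - b) for a, b in zip(h1, h2))


def most_similar_library_image(
    segment_pixels: Sequence[Pixel],
    library: Sequence[Sequence[Pixel]],
) -> int:
    """Index of the library picture with the smallest histogram difference.

    Raises ValueError if the library is empty.
    """
    target = color_histogram(segment_pixels)
    diffs = [
        histogram_difference(target, color_histogram(image))
        for image in library
    ]
    return min(range(len(library)), key=diffs.__getitem__)
```

In this sketch, each random access segment would call `most_similar_library_image` once and use the returned picture as an extra reference. A real encoder would more likely compute histograms on decoded frames and cache them instead of recomputing them per segment.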
[Degree-granting institution]: Zhejiang University
[Degree level]: Doctorate
[Year degree conferred]: 2017
[Classification number]: TN919.81
