Advanced Scalable Coding and Motion Estimation in Error-Resilient Video Compression

Published: 2019-06-19 03:29
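[Abstract]: We live in an era of highly developed information technology and encounter large amounts of multimedia content in daily life, especially images and video transmitted over networks. Demand for rich media on the Internet and on wireless networks is growing rapidly. Delivering these rich-media communication and entertainment services requires not only better broadband access but also powerful media coding techniques that make transmission more efficient. Several video coding standards, such as the ISO/IEC MPEG series and the ITU-T video coding standards, have been developed and significantly reduce the data rate. Most of these compression methods use the block-based discrete cosine transform (DCT) with motion compensation to remove spatial and temporal redundancy.

Two main problems stand out in video coding designed for network transmission. The first is that networks deliver data on a best-effort basis and cannot guarantee reliability. Video has a much larger data volume than other data types, so limited transmission bandwidth, low processing power, and limited storage can all restrict its delivery. For video applications, a high transmission error rate brings additional costs in delay, complexity, and quality. Retransmission is an effective way to handle transmission errors, but it adds load to the network and may not suit applications that require low delay. The main goal of error resilience is to protect the video data and to conceal or recover it when errors occur. The second problem is heterogeneity in wide-area networks. Different types of networks have different bandwidths and traffic loads, so heterogeneous video networks must offer video services of variable quality and meet these demands automatically and accurately.

The most critical part of video compression is motion estimation, the process of generating motion vectors. These vectors specify the motion parameters, derived from the previous frame, that are used to compensate the predicted frame. The computational load of motion estimation poses a major challenge for real-time implementation. Motion estimation algorithms can be divided into time-domain and frequency-domain algorithms. Matching algorithms and gradient-based algorithms are the main time-domain families. Matching algorithms divide into block matching and feature matching, and gradient-based algorithms divide into pel-recursive and block-recursive methods. Frequency-domain algorithms use phase correlation, wavelet-domain matching, and DCT-domain matching.

Gradient techniques are commonly used for image sequence analysis. Pel-recursive techniques, a subset of gradient techniques, are applied in image sequence coding, where the search for the best match is carried out pixel by pixel. Pixel-based techniques have very high computational complexity and are unsuitable for real-time applications. Frequency-domain techniques rely on the relationship between the transform coefficients of shifted images and are not widely used in image sequence coding. Block matching, which minimizes a specific cost function and searches over n×n pixel blocks, has therefore become the most widely used method in coding applications.

Among motion estimation algorithms, block matching is the dominant approach, and a simple, efficient algorithm is essential to minimize its search time. Block matching motion estimation (BMME) is the most popular and most practical motion estimation method in video coding, and both the H.26X and MPEG families of standards use it. Block matching is a correlation technique that looks for the best match between a block of the current image and candidate blocks within a specific region of the reference frame. The process uses at least two frames, the reference frame and the current frame. The current frame is divided into macroblocks, and motion estimation is performed on each macroblock independently. For each macroblock of the current frame to be coded, the motion estimation algorithm finds the best-matching macroblock in the reference frame. Once it is found, the difference (prediction error) between the best-matching macroblock and the current macroblock is computed and then DCT-transformed, quantized, and run-length coded. In addition to this difference, the relative displacement vector between the two macroblocks is also coded.

In this thesis we first discuss a range of block-based fast motion estimation algorithms and evaluate them experimentally in terms of search speed and computational complexity. The best-performing algorithms are then analyzed in detail. The algorithms covered are exhaustive or full search (FS), three-step search (TSS), new three-step search (NTSS), four-step search (4SS), diamond search (DS), and adaptive rood pattern search (ARPS).

Second, the thesis proposes a new dynamic adaptive rood search algorithm derived from ARPS. Because it exploits the spatial correlation between neighboring blocks, we name it ARPS_S to distinguish it from ARPS. ARPS_S rests on the assumption that the distribution of motion vectors is highly correlated not only with the predicted motion vector but also along the vertical and horizontal directions, which forms a rood (cross) shape. The maximum and minimum motion vectors (MVs) of the blocks surrounding the block of interest can be regarded as the estimated deviation of the predicted MV, so they can serve as accurate estimates of the arm lengths and indicate the dynamic range of motion in each direction. Unlike ARPS, the four arms in ARPS_S are not of equal length. ARPS starts with 5 initial search points, whereas ARPS_S starts with 6. In our experiments, ARPS_S outperforms ARPS in both search speed and video quality.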
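The block-matching step described in the abstract can be illustrated with a minimal Python sketch of full search (FS) for a single block. This is not the thesis's implementation. It assumes the sum of absolute differences (SAD) as the cost function (the abstract only says "a specific cost function"), represents frames as lists of rows of grey-level integers, and uses illustrative names (`sad`, `full_search`, block size `n`, search range `p`) that do not come from the source.

```python
def sad(cur, ref, bx, by, dx, dy, n):
    """Sum of absolute differences between the n x n block of `cur` whose
    top-left corner is (bx, by) and the block of `ref` displaced by (dx, dy)."""
    total = 0
    for y in range(n):
        for x in range(n):
            total += abs(cur[by + y][bx + x] - ref[by + dy + y][bx + dx + x])
    return total


def full_search(cur, ref, bx, by, n, p):
    """Test every displacement (dx, dy) with |dx|, |dy| <= p and return
    (best_dx, best_dy, best_cost). Candidates that leave the frame are skipped.
    Assumes the current block itself lies fully inside both frames."""
    height, width = len(ref), len(ref[0])
    best = (0, 0, sad(cur, ref, bx, by, 0, 0, n))  # start from the zero vector
    for dy in range(-p, p + 1):
        for dx in range(-p, p + 1):
            x0, y0 = bx + dx, by + dy
            if x0 < 0 or y0 < 0 or x0 + n > width or y0 + n > height:
                continue
            cost = sad(cur, ref, bx, by, dx, dy, n)
            if cost < best[2]:
                best = (dx, dy, cost)
    return best


# Toy frames (assumed data): every ref pixel has a unique value, and each
# cur pixel (x, y) copies ref pixel (x + 2, y + 1), clamped at the edges.
ref = [[y * 16 + x for x in range(16)] for y in range(16)]
cur = [[ref[min(y + 1, 15)][min(x + 2, 15)] for x in range(16)] for y in range(16)]
print(full_search(cur, ref, bx=4, by=4, n=4, p=3))  # (2, 1, 0)
```

Full search evaluates up to (2p + 1)^2 candidate positions per block. The fast algorithms listed above (TSS, NTSS, 4SS, DS, ARPS) reduce this cost by testing only a subset of those positions.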
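Finally, the thesis discusses error-resilient coding techniques based on scalable coding strategies. Scalable video coding encodes a video sequence into several bitstreams so that the decoder can support a range of quality levels. Two classes of scalable error-resilient coding are presented and evaluated: layered coding (LC) and multiple description coding (MDC).

The nature of compressed video bitstreams makes error resilience very important. For example, a single bit error in VLC-coded video data can cause loss of synchronization between encoder and decoder and, in turn, the loss of several video blocks. Multiple bit errors, which often occur with burst channel errors or packet loss, can cause the loss of part of a frame or of an entire frame and lead to error propagation in the temporal dimension. This propagation is a direct consequence of the motion compensation used to reduce temporal redundancy. Error resilience and scalability are two extremely important properties in video transmission. Scalability provides good robustness when some loss of information is acceptable, without greatly complicating decoding or seriously degrading visual quality. LC and MDC are the two types of scalable coding used in video transmission. Robust video coding plays a key role in limiting error propagation and improving visual quality. With a sensible design that keeps redundancy acceptable at minimal complexity, robust video coding can handle error concealment effectively.

Layered coding splits the video sequence into several layers, each of different importance to fidelity. The lowest layer, called the base layer, can be coded independently. The layers above it are called enhancement layers, and their decoding depends on the base layer. The base layer alone gives the lowest video quality, and quality improves as enhancement layers are added. Under congestion, a network that supports layered service transmits the base-layer packets first because they matter most for decoding. Layered video coding was first proposed to combat packet loss in ATM networks and improve transmission robustness. It was later adopted by the MPEG-2 and MPEG-4 standards as a principal error-resilience and scalable coding method. Layered coding is also used in some IP multicast applications, such as the Internet multicast backbone.

Layered coding is often combined with unequal error protection (UEP), which gives stronger protection to the most important data in the transmission, namely the base layer. Nevertheless, if the base layer is lost (for example, because of a server crash or a connection failure) or is received with many errors, the additional information in the enhancement layers is almost useless because of the hierarchical dependency between layers. In MDC, by contrast, all bitstreams (descriptions) are equally important. MDC compresses the video sequence into several bitstreams of equal importance. Each bitstream, also called a description, can be decoded independently, and the descriptions reinforce one another: the more descriptions the receiver obtains, the higher the quality of the reconstructed video. Parallel scalability is therefore inherent in multiple description coding.

Part of this thesis studies how the bitstreams are generated in LC and MDC. Each frame is first DCT-transformed, then quantized and zigzag scanned. In layered coding, the most important DCT coefficients (the first ten) are assigned to the base layer and the rest to the enhancement layer. In multiple description coding, the 64 DCT coefficients are split equally into odd and even parts. Simulation results show that the MDC scenario outperforms the LC scenario. The simulations also show that, compared with layered coding, multiple description coding can markedly improve the robustness of real-time video applications when it is suitably combined with path diversity or server diversity. Because all received information remains useful in MDC when errors occur, MDC avoids the problem that layered coding faces in best-effort networks and is therefore very effective for video transmission over best-effort packet networks.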
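The two coefficient partitions described above can be sketched in Python as follows. This is an illustration under stated assumptions, not the thesis's code: it assumes an 8×8 block scanned in the usual JPEG-style zigzag order, and it reads "odd and even parts" as the odd and even positions in that scan. All function names (`zigzag_indices`, `split_layered`, `split_mdc`, `merge_mdc`) are hypothetical.

```python
def zigzag_indices(n=8):
    """Return the (row, col) pairs of an n x n block in zigzag scan order."""
    order = []
    for s in range(2 * n - 1):  # s = row + col selects one anti-diagonal
        diag = [(r, s - r) for r in range(n) if 0 <= s - r < n]
        # Even diagonals run bottom-left to top-right, odd ones the other way.
        order.extend(reversed(diag) if s % 2 == 0 else diag)
    return order


def split_layered(scan, base_count=10):
    """LC: the first `base_count` coefficients form the base layer,
    the remaining ones form the enhancement layer."""
    return scan[:base_count], scan[base_count:]


def split_mdc(scan):
    """MDC: even and odd scan positions form two equally sized descriptions."""
    return scan[0::2], scan[1::2]


def merge_mdc(even=None, odd=None, length=64):
    """Rebuild the scan from whichever descriptions arrived.
    Coefficients of a missing description are set to zero."""
    scan = [0] * length
    if even is not None:
        scan[0::2] = even
    if odd is not None:
        scan[1::2] = odd
    return scan


# Stand-in for one block of quantized DCT coefficients (assumed data).
block = [[r * 8 + c for c in range(8)] for r in range(8)]
scan = [block[r][c] for r, c in zigzag_indices()]
base, enhancement = split_layered(scan)
even, odd = split_mdc(scan)
print(len(base), len(enhancement), len(even), len(odd))  # 10 54 32 32
```

The sketch makes the asymmetry visible. In `split_layered`, the enhancement layer holds only high-frequency coefficients and is of little use without the base layer. In `merge_mdc`, either description alone still yields a usable approximation that covers low and high frequencies.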
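[Degree-granting institution]: Beijing University of Posts and Telecommunications
[Degree level]: Doctorate
[Year degree awarded]: 2014
[Classification number]: TN919.81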


Article ID: 2502041


Article link: https://www.wllwen.com/kejilunwen/wltx/2502041.html

