Research on 3D Audio Coding Based on Spatial Position Information
Published: 2018-09-08 14:31
[Abstract]: Most existing spatial-direction quantization grid-point algorithms synthesize the virtual sound source from all channels, which departs from the basic principle of vector base amplitude panning (VBAP), in which three channels form one virtual sound source. They also ignore the time differences between channels, which degrades sound quality. To address these problems, a 3D audio encoding and decoding framework based on the basic VBAP principle is designed: channels are grouped in threes to synthesize the virtual sound source and the downmix signal, a time-difference parameter is added during encoding, and at the decoder a virtual sound image redistribution method based on solving a system of linear equations is proposed, yielding a reconstructed signal consistent with the original channel configuration. Experimental results show that, in subjective tests following the MUSHRA standard, the 3D audio signals generated by this method score on average 12 points higher than those of existing methods.
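The three-channel principle that the abstract refers to can be illustrated with the textbook VBAP gain computation: a source direction p is written as p = g1*l1 + g2*l2 + g3*l3, where l1, l2, l3 are the unit vectors of three loudspeakers, and the gains are found by solving that 3x3 linear system. The sketch below is a generic illustration only, not the paper's codec or its decoder-side redistribution method. The function names, the azimuth/elevation convention, and the power normalization are assumptions made for this example.

```python
import math

def unit_vector(azimuth_deg, elevation_deg):
    """Direction as a Cartesian unit vector (assumed convention: x front, z up)."""
    az = math.radians(azimuth_deg)
    el = math.radians(elevation_deg)
    return (math.cos(el) * math.cos(az),
            math.cos(el) * math.sin(az),
            math.sin(el))

def det3(a, b, c):
    """Determinant of the 3x3 matrix whose columns are a, b, c."""
    return (a[0] * (b[1] * c[2] - b[2] * c[1])
            - b[0] * (a[1] * c[2] - a[2] * c[1])
            + c[0] * (a[1] * b[2] - a[2] * b[1]))

def vbap_gains(source, speakers):
    """Solve source = g1*l1 + g2*l2 + g3*l3 by Cramer's rule.

    Gains are scaled so that g1^2 + g2^2 + g3^2 = 1 (constant power).
    A negative gain means the source lies outside the loudspeaker triplet.
    """
    l1, l2, l3 = speakers
    d = det3(l1, l2, l3)
    if abs(d) < 1e-12:
        raise ValueError("loudspeaker directions do not span 3D space")
    g = (det3(source, l2, l3) / d,
         det3(l1, source, l3) / d,
         det3(l1, l2, source) / d)
    norm = math.sqrt(sum(x * x for x in g))
    return tuple(x / norm for x in g)

# Hypothetical triplet: two front loudspeakers and one elevated loudspeaker.
speakers = [unit_vector(30, 0), unit_vector(-30, 0), unit_vector(0, 45)]
gains = vbap_gains(unit_vector(0, 0), speakers)
```

With this hypothetical layout, a source straight ahead is shared equally between the two front loudspeakers and the elevated one receives zero gain, since the source lies on the edge of the triplet.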
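[Author affiliation]: National Engineering Research Center for Multimedia Software, Wuhan University
[Funding]: National Natural Science Foundation of China, Young Scientists Fund, "Measurement and Analysis of the Perceptual Characteristics of Horizontal Sound Source Localization Cues in Three-Dimensional Sound Fields" (61201340); National Natural Science Foundation of China, Key Program (61231015); National "863" Program (2015AA016306)
[Classification number]: TN912.3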
Article ID: 2230803
Article link: https://www.wllwen.com/kejilunwen/xinxigongchenglunwen/2230803.html