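Design and Study of a VC-Based Advertisement Speech Recognition System
Published: 2018-03-01 15:36
Keywords: speech recognition; feature extraction; linear prediction cepstral coefficients; Mel-frequency cepstral coefficients; K-means; dynamic time warping. Source: Nanjing University of Science and Technology, 2007 master's thesis. Document type: degree thesis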
[Abstract]: With economic development, television advertising has become an increasingly important part of social life, and the social problems it brings have become increasingly evident. False advertising in particular seriously misleads consumers and harms the public, so advertisement monitoring has become a problem that society urgently needs to address. This project studies the software implementation of advertisement speech recognition. Starting from the basic principles and process of speech recognition, it introduces the basic principles and calculation methods of speech endpoint detection, speech feature extraction, speech modeling, and model matching. For endpoint detection of advertisement speech, it introduces short-time energy and the short-time zero-crossing rate and, drawing on simulation results, gives a double-threshold endpoint detection method suited to this system. For feature extraction, it introduces the speech cepstrum and two commonly used feature sets: linear prediction cepstral coefficients (LPCC) and Mel-frequency cepstral coefficients (MFCC). For speech modeling and matching, it applies K-means clustering, vector quantization, and the dynamic time warping (DTW) algorithm to address the excessive volume of feature-parameter data and the extra and dropped frames that occur in advertisement audio. These algorithms were simulated in MATLAB, their characteristics and parameter-selection methods were compared and analyzed, and a modeling and recognition method suited to this project is given. Building on this work, test experiments on designated advertisement speech were carried out in the VC (Visual C++) environment. The experiments show that the system has some practical value for advertisement monitoring.
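As an illustration of why DTW suits audio with extra or dropped frames, the following is a minimal Python sketch. It is not the thesis's code (the thesis worked in MATLAB and VC). The function name `dtw_distance`, the Euclidean frame distance, the basic three-way step pattern, and the toy two-dimensional "feature" vectors are all assumptions made for this example.

```python
import math


def dtw_distance(a, b):
    """Accumulated DTW cost between two sequences of equal-dimension feature vectors.

    Hypothetical helper for illustration; uses Euclidean frame distance and the
    basic step pattern (insertion, deletion, match) with no path constraints.
    """
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] holds the minimal accumulated cost of aligning a[:i] with b[:j].
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = math.dist(a[i - 1], b[j - 1])  # distance between the two frames
            cost[i][j] = d + min(
                cost[i - 1][j],      # frame of a repeated against the same frame of b
                cost[i][j - 1],      # frame of b repeated against the same frame of a
                cost[i - 1][j - 1],  # both sequences advance by one frame
            )
    return cost[n][m]


# Toy data (assumed): a reference template and two distorted copies.
template = [(0.0, 0.0), (1.0, 1.0), (2.0, 0.0), (3.0, 1.0)]
extra_frame = [(0.0, 0.0), (1.0, 1.0), (1.0, 1.0), (2.0, 0.0), (3.0, 1.0)]
dropped_frame = [(0.0, 0.0), (2.0, 0.0), (3.0, 1.0)]
unrelated = [(5.0, 5.0), (6.0, 4.0), (7.0, 5.0), (8.0, 4.0)]

if __name__ == "__main__":
    for name, seq in [("extra", extra_frame), ("dropped", dropped_frame), ("unrelated", unrelated)]:
        print(name, dtw_distance(template, seq))
```

With these toy sequences, the repeated frame is absorbed at no additional cost, and the dropped frame costs only the distance from the missing frame to its nearest remaining neighbor, while the unrelated sequence stays far from the template.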
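[Degree-granting institution]: Nanjing University of Science and Technology
[Degree level]: Master's
[Year degree granted]: 2007
[Classification number]: TN912.34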
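[Citing literature]
Related master's theses (first 2)
1 Kang Ming; Research on Speech Recognition and Full-Text Retrieval Based on Vector Quantization [D]; Chongqing University; 2009
2 Zhang Tao; Design and Implementation of an FPGA-Based Speech Denoising System Using the Wavelet Lifting Algorithm [D]; Guangxi Normal University; 2012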
Article ID: 1552569
Article link: https://www.wllwen.com/wenyilunwen/guanggaoshejilunwen/1552569.html