Research on Speech Emotion Recognition Based on Kernel Functions

Published: 2018-10-05 19:48

[Abstract]: As an important branch of affective computing, emotion recognition has attracted wide attention from researchers at home and abroad in recent years. Speech, one of the main channels of human communication, carries a great deal of information about the speaker's emotions. Speech emotion recognition enables a computer to identify a speaker's emotional state from the speech signal and so supports more natural human-computer interaction, which gives it broad application prospects in everyday life. This thesis studies speech emotion recognition based on kernel functions: it introduces kernel methods into traditional pattern recognition algorithms to strengthen their ability to handle nonlinear problems, and proposes several improvements to those algorithms for use in speech emotion recognition. The main research content and contributions are as follows:

(1) The research background and significance of speech emotion recognition are described, and the state of research at home and abroad is surveyed for emotion description models, emotion databases, emotional feature parameters, feature dimensionality reduction, and emotion classification algorithms.

(2) A Chinese emotional speech database is designed and recorded. It contains speech in five basic emotional states (happiness, anger, sadness, fear, and calm), and every speech sample has passed a validity check to ensure that the data meet the specification. The speech signals in the database are preprocessed, and speaking rate, energy and amplitude, fundamental frequency, formants, MFCCs, and other parameters are extracted to form emotion feature vectors. How these parameters vary across emotional states is analyzed, laying the groundwork for the subsequent experiments.

(3) An algorithm that combines kernel C-means clustering with kernel K-nearest-neighbor classification is proposed for speech emotion recognition. It uses a kernel mapping to map the original input space into a high-dimensional feature space, performs C-means clustering in that space to construct representative emotion templates, and then classifies test samples with the K-nearest-neighbor algorithm. The algorithm uses the kernel method to improve the classifier's nonlinear processing ability, and it also avoids a drawback of conventional kernel K-nearest-neighbor classification, which must compute the distance between the test sample and every training sample, so classification is faster. To further raise the recognition accuracy, fuzzy set theory is introduced: fuzzy clustering yields better emotion clusters, and a membership function constructed during nearest-neighbor classification lets each test sample belong to the individual emotion classes to different degrees, giving classification results that better match reality. The final experiments indicate that the algorithm recognizes emotions more effectively.

(4) The kernel sparse representation classification algorithm is applied to speech emotion recognition. Using the kernel mapping mechanism, it extends the traditional sparse representation classifier to a kernel sparse representation classifier, overcoming the former's inability to handle nonlinear problems effectively, so that a test sample is represented more accurately as a sparse linear combination of training samples. The algorithm is then improved with the idea of local coding, yielding a locality-constrained weighted kernel sparse representation classification algorithm. Compared with kernel sparse representation classification, it lets a test sample be sparsely represented by more of its neighboring training samples, which can improve classification accuracy to some extent.

(5) The kernel functions used in support vector machines are studied in depth and improved. To highlight the differing contributions of individual features to classification, feature-importance information is incorporated into the polynomial kernel and the Gaussian kernel. The improved polynomial and Gaussian kernels are then combined into a composite kernel, and an optimization algorithm searches for the optimal kernel parameters to obtain the best-performing composite kernel. The algorithm thus improves the base kernels, replaces a single kernel with a composite kernel, and uses an optimization algorithm to find the optimal kernel parameters and combination parameters, so it improves the traditional support vector machine in several ways to raise its performance.
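The composite kernel described in item (5) can be illustrated with a minimal Python sketch. The abstract does not give the thesis's actual formulas, so everything here is an assumption made for illustration: the per-feature weights `w` enter as a weighted inner product and a weighted squared distance, the two kernels are mixed by a single coefficient `mix`, and all function and parameter names are hypothetical. With non-negative weights and `mix` in [0, 1], each weighted kernel is an ordinary kernel on features rescaled by the square root of their weights, and a convex combination of valid kernels is again a valid kernel.

```python
import math
from typing import List, Sequence

Vector = Sequence[float]


def weighted_poly_kernel(x: Vector, y: Vector, w: Vector,
                         degree: int = 2, coef0: float = 1.0) -> float:
    """Polynomial kernel on a feature-weighted inner product (assumed form)."""
    weighted_dot = sum(wi * xi * yi for wi, xi, yi in zip(w, x, y))
    return (weighted_dot + coef0) ** degree


def weighted_gauss_kernel(x: Vector, y: Vector, w: Vector,
                          gamma: float = 0.5) -> float:
    """Gaussian kernel on a feature-weighted squared distance (assumed form)."""
    weighted_sq_dist = sum(wi * (xi - yi) ** 2 for wi, xi, yi in zip(w, x, y))
    return math.exp(-gamma * weighted_sq_dist)


def combined_kernel(x: Vector, y: Vector, w: Vector, mix: float = 0.5,
                    degree: int = 2, coef0: float = 1.0,
                    gamma: float = 0.5) -> float:
    """Convex combination of the two weighted kernels; mix=1 is purely polynomial."""
    if not 0.0 <= mix <= 1.0:
        raise ValueError("mix must lie in [0, 1]")
    if any(wi < 0 for wi in w):
        raise ValueError("feature weights must be non-negative")
    return (mix * weighted_poly_kernel(x, y, w, degree, coef0)
            + (1.0 - mix) * weighted_gauss_kernel(x, y, w, gamma))


def gram_matrix(samples: Sequence[Vector], w: Vector,
                **kernel_params: float) -> List[List[float]]:
    """Pairwise kernel values for a list of feature vectors."""
    return [[combined_kernel(a, b, w, **kernel_params) for b in samples]
            for a in samples]


if __name__ == "__main__":
    # Made-up two-dimensional "feature vectors" and weights, for illustration only.
    toy_samples = [[0.2, 1.0], [0.4, 0.9], [1.5, 0.1]]
    toy_weights = [0.7, 0.3]
    for row in gram_matrix(toy_samples, toy_weights, mix=0.6, gamma=1.0):
        print([round(value, 4) for value in row])
```

A Gram matrix built this way could be handed to any SVM implementation that accepts precomputed kernels, for example scikit-learn's `SVC(kernel="precomputed")`. The search for the best `mix`, `gamma`, `degree`, and weights that the abstract attributes to an optimization algorithm is not shown, since the abstract does not say which optimizer is used.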
[Degree-granting institution]: Southeast University (东南大学)
[Degree level]: Master's
[Year degree awarded]: 2015
[Classification number]: TN912.34

[Similar literature]