Research on Emotion Recognition Algorithms Based on Speech and Facial Expression Information
Published: 2018-02-14 16:04
Keywords: speech features; facial expression features; fusion algorithm; support vector machine; parameter optimization. Source: East China University of Science and Technology, 2014 master's thesis. Type: degree thesis
[Abstract]: Constrained by the emotion features of a single modality, single-modal emotion recognition has seen little improvement in recognition rate. In recent years, multimodal emotion recognition has broken through this limitation: by fusing emotion features from several modalities during recognition, it has raised recognition rates substantially.

The main approaches to multimodal emotion recognition are decision-level fusion and feature-level fusion. This thesis adopts feature-level fusion: facial expression features and speech emotion features are extracted, optimized according to the characteristics of the two modalities, and finally fed to a classifier designed for emotion classification. The experiments use the emotion database built by our research group, which contains data in three modalities (speech, facial expression, and EEG) and covers seven emotion categories: anger, disgust, fear, happiness, neutral, sadness, and surprise.
The main research work of this thesis is as follows:
(1) Speech emotion feature extraction. Feature sets of two different dimensionalities (14-dimensional and 74-dimensional) are extracted, covering short-time energy, pitch frequency, the first formant, Mel-frequency cepstral coefficients (MFCC), and speech duration; statistical parameters over these feature categories are also computed. The resulting features serve as the speech emotion feature data for recognition.
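As an illustration of the kind of frame-level feature plus utterance-level statistics described above (not the thesis's actual implementation; the frame length, hop size, and statistic set are assumed values), a minimal numpy sketch of short-time energy extraction might look like this:

```python
import numpy as np

def short_time_energy(signal, frame_len=400, hop=160):
    """Frame the signal and compute per-frame short-time energy."""
    n_frames = 1 + max(0, (len(signal) - frame_len) // hop)
    frames = np.stack([signal[i * hop : i * hop + frame_len] for i in range(n_frames)])
    return np.sum(frames.astype(np.float64) ** 2, axis=1)

def energy_statistics(energy):
    """Statistical parameters over the energy contour, one value each."""
    return {
        "mean": float(np.mean(energy)),
        "max": float(np.max(energy)),
        "min": float(np.min(energy)),
        "std": float(np.std(energy)),
        "range": float(np.max(energy) - np.min(energy)),
    }

# Example on a synthetic 1-second, 16 kHz tone standing in for an utterance
sr = 16000
t = np.arange(sr) / sr
signal = np.sin(2 * np.pi * 220 * t)
stats = energy_statistics(short_time_energy(signal))
```

The same frame-then-summarize pattern would apply to the other contours (pitch, first formant, MFCC), yielding a fixed-length vector per utterance regardless of its duration.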
(2) Facial expression feature extraction. An improved local binary pattern (LBP) algorithm is proposed that extracts texture features mainly from two facial regions, the eyes and the mouth. The aim is to preserve the expression recognition rate while reducing the feature dimensionality, and hence the computational cost, as far as possible.
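For context, the standard (unimproved) LBP assigns each pixel an 8-bit code by thresholding its 3x3 neighbourhood against the centre pixel; a histogram of these codes over a region such as an eye or mouth patch is the texture feature. A minimal numpy sketch of this baseline, not the thesis's improved algorithm:

```python
import numpy as np

def lbp_codes(img):
    """8-bit LBP code for each interior pixel of a 2-D grayscale image."""
    h, w = img.shape
    c = img[1:-1, 1:-1]
    # Neighbour offsets, clockwise from top-left; each contributes one bit.
    shifts = [(-1, -1), (-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1)]
    codes = np.zeros_like(c, dtype=np.uint8)
    for bit, (dy, dx) in enumerate(shifts):
        nb = img[1 + dy : h - 1 + dy, 1 + dx : w - 1 + dx]
        codes |= (nb >= c).astype(np.uint8) << bit
    return codes

def region_histogram(patch):
    """Normalized 256-bin LBP histogram of one face region."""
    hist = np.bincount(lbp_codes(patch).ravel(), minlength=256).astype(np.float64)
    return hist / hist.sum()

rng = np.random.default_rng(0)
patch = rng.integers(0, 256, size=(32, 32), dtype=np.uint8)  # stand-in eye/mouth patch
feature = region_histogram(patch)  # 256-dimensional texture feature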
(3) Fusion of speech and expression features. Based on the characteristics of the two modalities' emotion features, a direct fusion algorithm and a fusion optimization algorithm are proposed. The direct fusion algorithm mainly resolves the dimensionality mismatch between the two modalities; the fusion optimization algorithm accounts for both the links and the differences between them by first fusing the features, then applying principal component analysis (PCA) for dimensionality reduction, and finally performing emotion classification.
(4) Bimodal emotion recognition. Support vector machines (SVM) are used for the emotion recognition experiments, since SVMs classify small-sample, nonlinear problems well. For SVM parameter optimization, an improved grid search algorithm is proposed: a basic grid search first performs a coarse search to bracket the parameter range, and a fine search within that range then finds the parameter combination with the best recognition rate. Simulation experiments verify the effectiveness of the algorithms above.
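The coarse-to-fine search can be sketched generically. Here `score_fn` stands in for cross-validated SVM recognition accuracy as a function of the penalty C and kernel width gamma; the toy objective, grid ranges, and grid densities are all assumptions for illustration:

```python
import numpy as np

def grid_search(score_fn, c_grid, g_grid):
    """Evaluate score_fn on every (C, gamma) pair; return the best pair and score."""
    best = (None, None, -np.inf)
    for c in c_grid:
        for g in g_grid:
            s = score_fn(c, g)
            if s > best[2]:
                best = (c, g, s)
    return best

def coarse_to_fine(score_fn):
    """Coarse log-spaced search, then a fine search around the coarse optimum."""
    coarse = np.logspace(-3, 3, 7)                   # coarse pass: powers of 10
    c0, g0, _ = grid_search(score_fn, coarse, coarse)
    fine_c = np.logspace(np.log10(c0) - 1, np.log10(c0) + 1, 21)
    fine_g = np.logspace(np.log10(g0) - 1, np.log10(g0) + 1, 21)
    return grid_search(score_fn, fine_c, fine_g)     # fine pass around (c0, g0)

# Stand-in objective peaked at C=10, gamma=0.1; in practice this would be
# cross-validated SVM accuracy on the training data.
def toy_score(c, g):
    return -((np.log10(c) - 1) ** 2 + (np.log10(g) + 1) ** 2)

c_best, g_best, s_best = coarse_to_fine(toy_score)
```

Compared with a single dense grid over the full range, the two-stage search evaluates far fewer parameter combinations (here 49 + 441 instead of thousands) while still locating the optimum to fine resolution.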
[Degree-granting institution]: East China University of Science and Technology
[Degree level]: Master
[Year conferred]: 2014
[CLC number]: TN912.34;TP391.41