Research on Multi-Channel Plotting Interaction Technology Fusing Gesture and Speech
Published: 2018-09-08 14:04
【Abstract】: With the development of multimedia and virtual reality technology, information in human-machine environments is presented in richer forms, while the interaction objects and content that users face grow more complex; traditional interaction methods cannot meet the requirements of harmonious, natural, human-centered interaction. In military applications, computer-aided plotting is a typical task, and new interaction methods are urgently needed to make plotting interaction more natural. This thesis fuses gesture and speech recognition: it defines and recognizes air-writing gesture commands, constructs a state-transition matrix over the vocabulary of speech interaction tasks, integrates information from different channels in a task-guided manner, proposes a multi-channel integration model based on a task-slot structure, analyzes and designs the interaction tasks and operations, and finally evaluates the interaction tasks in comprehensive experiments. The main work and contributions are:

1. An air-writing gesture recognition algorithm based on directional chain codes, realizing spatial gesture recognition. Custom gestures are captured and matched with a Leap Motion sensor, extending its limited set of built-in gesture commands. To suppress the noise caused by unstable gesture input, the trajectory is divided into segments; the dominant movement directions, determined by each segment's proportion of the trajectory, describe the input gesture, which is then matched against template gestures segment by segment with a sequential matching algorithm.

2. A speech task organization method based on command transition probabilities, built on top of speech command recognition, to assist in recognizing and organizing spoken commands. The vocabulary of speech interaction tasks is classified by grammatical rules and semantics, and rare words for task actions are removed. Semantic analysis of the scene context determines the interaction objects and tasks in the current scene; a Markov state-transition probability matrix models the connections between words and filters out anomalously recognized keywords, so that the system correctly understands the user's spoken interaction intent.

3. A multi-channel task-slot integration model based on object attributes. The interaction tasks are analyzed and designed to determine the information each task's slots require. The user performs elementary operations through the sensors; hierarchical semantic extraction converts the interaction data into attribute information the system can recognize. According to attribute type, this information is filled into the corresponding modules of the task slot, forming system-recognizable interaction semantics; the complete interaction task is thus identified and handed to the computer for execution, realizing the system's interactive functions.
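The chain-code idea in contribution 1 can be sketched briefly. The following is a minimal illustration, not the thesis's actual algorithm: each trajectory step is quantized into one of 8 directions, runs of equal codes are collapsed into segments, short segments are discarded as noise by their proportion of the trajectory, and the surviving direction sequence is matched against a template in order. All function names and the `min_ratio` threshold are assumptions for illustration.

```python
import math

def chain_code(points):
    """Quantize each consecutive point pair into an 8-direction code (0 = east, counterclockwise)."""
    codes = []
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        angle = math.atan2(y1 - y0, x1 - x0)            # -pi .. pi
        codes.append(int(round(angle / (math.pi / 4))) % 8)
    return codes

def dominant_segments(codes, min_ratio=0.15):
    """Collapse runs of equal codes into segments; drop segments whose share of the trajectory is below min_ratio (treated as input noise)."""
    runs = []
    for c in codes:
        if runs and runs[-1][0] == c:
            runs[-1][1] += 1
        else:
            runs.append([c, 1])
    total = sum(n for _, n in runs) or 1
    return [c for c, n in runs if n / total >= min_ratio]

def matches(input_points, template_dirs):
    """Sequentially compare the input's dominant direction sequence with a template gesture."""
    return dominant_segments(chain_code(input_points)) == template_dirs

# Example: a rough "L" stroke -- four steps down (code 6), then three steps right (code 0).
stroke = [(0, 4), (0, 3), (0, 2), (0, 1), (0, 0), (1, 0), (2, 0), (3, 0)]
print(matches(stroke, [6, 0]))  # True
```

A real implementation would take 3D Leap Motion frames and likely allow tolerance between neighboring direction codes; this sketch only shows how segmentation by proportion makes the match robust to a few jittery samples.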
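The Markov-matrix filtering in contribution 2 can likewise be sketched. In this hypothetical example (the command vocabulary and threshold are invented for illustration), transition probabilities between command words are estimated from valid phrases, and a recognized word whose incoming transition probability is too low is discarded as a misrecognized keyword.

```python
from collections import defaultdict

def train_transitions(phrases):
    """Estimate first-order transition probabilities P(next word | word) from valid command phrases."""
    counts = defaultdict(lambda: defaultdict(int))
    for phrase in phrases:
        for a, b in zip(phrase, phrase[1:]):
            counts[a][b] += 1
    return {a: {b: n / sum(nxt.values()) for b, n in nxt.items()}
            for a, nxt in counts.items()}

def filter_sequence(words, trans, threshold=0.05):
    """Keep the first word; drop any later word whose transition from the last kept word is improbable."""
    if not words:
        return []
    kept = [words[0]]
    for w in words[1:]:
        if trans.get(kept[-1], {}).get(w, 0.0) >= threshold:
            kept.append(w)          # plausible continuation of the command
        # else: treat w as an anomalous recognition result and skip it
    return kept

# Hypothetical command corpus: action -> object -> attribute.
phrases = [["draw", "circle", "red"],
           ["draw", "line", "blue"],
           ["move", "circle", "left"]]
trans = train_transitions(phrases)
print(filter_sequence(["draw", "banana", "circle", "red"], trans))
# ['draw', 'circle', 'red'] -- the out-of-vocabulary word is rejected
```

The thesis additionally conditions the matrix on the scene context (which objects and tasks are currently available); that would amount to training or selecting a separate transition matrix per scene.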
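Finally, the task-slot integration of contribution 3 reduces to typed slot filling across channels. The sketch below is an assumption-laden miniature (task names, slot names, and the event format are all invented): each task declares the attribute slots it needs, and attribute information extracted from any channel, gesture or speech, fills the matching slot; the task executes only once no slot is missing.

```python
# Each task declares the attribute slots it requires (names are hypothetical).
TASK_SLOTS = {
    "plot_symbol": ["action", "symbol_type", "position"],
    "move_symbol": ["action", "target", "position"],
}

def integrate(task, events):
    """Fill the task's slots from typed channel events; return the slots and any still-missing ones."""
    slots = {name: None for name in TASK_SLOTS[task]}
    for ev in events:   # ev: {"type": slot name, "value": extracted attribute, "channel": source}
        if ev["type"] in slots and slots[ev["type"]] is None:
            slots[ev["type"]] = ev["value"]
    missing = [k for k, v in slots.items() if v is None]
    return slots, missing

# Speech supplies the action and symbol type; a pointing gesture supplies the position.
events = [
    {"type": "action", "value": "plot", "channel": "speech"},
    {"type": "symbol_type", "value": "tank", "channel": "speech"},
    {"type": "position", "value": (120, 45), "channel": "gesture"},
]
slots, missing = integrate("plot_symbol", events)
print(slots, missing)   # all three slots filled; missing is empty
```

The appeal of the slot structure is that the channels stay decoupled: the integrator only sees typed attributes, so a slot can be filled by whichever channel produced the information first.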
【Degree-granting institution】: National University of Defense Technology
【Degree level】: Master
【Year conferred】: 2014
【CLC number】: TN912.3
Article No.: 2230742
Link: https://www.wllwen.com/kejilunwen/wltx/2230742.html