Research on Video-Based Human Action Analysis and Recognition

Published: 2018-04-22 13:44

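  Topics: digital image processing + human action recognition; Source: University of Electronic Science and Technology of China, 2015 doctoral dissertation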


[Abstract]: The analysis and representation of human actions in video is an active research topic in computer vision. Its main task is to detect, extract, and represent human motion information from video. It draws on image processing, machine learning, applied physics, mathematics, and other disciplines, and has significant theoretical and practical value. Because human motion is complex and diverse, video-based human action recognition remains difficult to deploy in real environments despite more than a decade of research. Action representation and recognition, the core of human action recognition, still pose many open problems. The dissertation opens by presenting the research background, significance, main tasks, and typical models of video human action recognition, and briefly discusses motion detection, feature extraction and description, and coding techniques from the perspectives of the current state of research and existing problems. Building on a summary and analysis of prior work, the dissertation makes four main contributions:

1) Human motion through space and time forms a three-dimensional spatio-temporal volume whose shape carries important motion information. This shape information can be described by the positional relationships among local neighborhood features. To describe these relationships accurately, we propose two algorithms for constructing local neighborhood features: local spatio-temporal neighborhood features based on regular polyhedra, and multi-scale spatio-temporal oriented neighborhood features. The former uses the multiple spatial axes of a regular polyhedron as a reference system for locating features, which precisely describes the relative positions of local features. The latter introduces spatio-temporal scale parameters into the construction of the local neighborhood, which makes the neighborhood features direction-selective.

2) Covariance features are powerful local features. We represent local human motion information as covariance features and study the action recognition rate in two settings. In the first, we use the matrix logarithm to map covariance matrices from Riemannian space to Log-Euclidean space, and then perform clustering and coding in Log-Euclidean space. In the second, to preserve the geometric structure of covariance features on the Riemannian manifold, we cluster the covariance matrices directly on the manifold to generate a dictionary of Riemannian matrices, and then encode features with the proposed local Riemannian manifold coding algorithm. We also study in depth batch mean updating and sequential mean updating in covariance clustering under different matrix distance metrics.

3) Human action recognition based on Grassmann random manifold forests. Traditional local spatio-temporal features partition a local spatio-temporal volume with a spatio-temporal grid, compute feature statistics for each grid cell, and concatenate the statistics of all cells to obtain the local feature descriptor. This partitioning breaks the temporal correlation between frames, and there is no unified standard for the grid scale, which must be set by experience and experiment. To address this, we reshape each frame directly into a column vector, so that a local spatio-temporal cube is represented as a matrix of column vectors. We measure the similarity of these matrices with the Grassmann manifold distance, and then use Grassmann random manifold trees to describe the probability distribution of the data on the Grassmann manifold and to classify human actions.

4) Feature coding plays an important role in action recognition and has long been a research focus. From a study of the classic Locality-constrained Linear Coding (LLC) algorithm, we propose a weighted version, the WLLC coding algorithm. LLC is a recently proposed and effective sparse coding method: its codes are sparse, encoding is fast, and the reconstruction error is small. Its main drawback is that dictionary generation completely discards the probability distribution of the samples near the cluster centers, so every selected codeword contributes equally during encoding. The basic idea of WLLC is that codewords (cluster centers) differ in reliability because training samples are distributed differently around them, and more reliable codewords should contribute more to the code. Experiments show that WLLC effectively improves the action recognition rate. In addition, feature position information is important for action recognition. We therefore propose a hybrid feature which, together with the proposed multi-scale spatial position coding algorithm, accurately describes the probability distribution of human actions in space and time.

The dissertation concludes with an outlook on the analysis and representation of human actions in video and outlines the main directions for future work.
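
The Log-Euclidean mapping mentioned in item 2 can be illustrated with a minimal sketch. It is not the dissertation's implementation: the function names, the regularization constant `eps`, and the upper-triangle vectorization are assumptions chosen for illustration. The sketch builds a covariance descriptor from the feature vectors of one local volume, applies the matrix logarithm through an eigendecomposition, and flattens the result so that ordinary Euclidean clustering can be applied.

```python
import numpy as np

def covariance_descriptor(features, eps=1e-6):
    """Covariance of per-sample feature vectors, shape (n_samples, d) -> (d, d).

    eps is an assumed regularizer that keeps the matrix strictly positive definite.
    """
    cov = np.cov(features, rowvar=False)
    return cov + eps * np.eye(cov.shape[0])

def log_euclidean_map(spd):
    """Matrix logarithm of a symmetric positive definite matrix."""
    eigvals, eigvecs = np.linalg.eigh(spd)
    # Scale each eigenvector column by the log of its eigenvalue, then recombine.
    return (eigvecs * np.log(eigvals)) @ eigvecs.T

def vectorize_symmetric(sym):
    """Flatten the upper triangle; off-diagonal entries are scaled by sqrt(2)
    so the Euclidean norm of the vector equals the Frobenius norm of the matrix."""
    rows, cols = np.triu_indices(sym.shape[0])
    scale = np.where(rows == cols, 1.0, np.sqrt(2.0))
    return sym[rows, cols] * scale

def log_euclidean_distance(a, b):
    """Frobenius distance between the matrix logarithms of two SPD matrices."""
    return np.linalg.norm(log_euclidean_map(a) - log_euclidean_map(b), ord="fro")
```

Once vectorized this way, the descriptors can be passed to any standard Euclidean clustering routine such as k-means, which is what makes the Log-Euclidean setting simpler than clustering directly on the manifold.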

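The baseline LLC step that item 4 builds on can be sketched as follows. This is the commonly used approximated form of LLC (k nearest codewords plus a sum-to-one least-squares fit), not the WLLC variant: the abstract does not specify how WLLC computes its reliability weights, so they are left out. The function name and the parameters `k` and `beta` are assumptions for illustration.

```python
import numpy as np

def llc_encode(x, codebook, k=5, beta=1e-4):
    """Approximated LLC code for one descriptor.

    x: (d,) descriptor; codebook: (m, d) cluster centers.
    Returns an (m,) code with k nonzero entries that sum to 1.
    """
    # Pick the k codewords closest to x (the locality constraint).
    dists = np.sum((codebook - x) ** 2, axis=1)
    nearest = np.argsort(dists)[:k]

    # Shift the selected codewords so that x is the origin.
    z = codebook[nearest] - x
    gram = z @ z.T

    # Regularize for numerical stability; the tiny constant guards the
    # degenerate case where the trace is zero.
    gram += (beta * np.trace(gram) + 1e-10) * np.eye(k)

    # Solve for the weights, then enforce the sum-to-one constraint.
    w = np.linalg.solve(gram, np.ones(k))
    w /= w.sum()

    code = np.zeros(codebook.shape[0])
    code[nearest] = w
    return code
```

In this baseline every selected codeword is treated alike apart from its distance to `x`, which is the behavior the abstract identifies as the drawback that WLLC addresses.
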
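[Degree-granting institution]: University of Electronic Science and Technology of China
[Degree level]: Doctoral
[Year degree granted]: 2015
[Classification number]: TP391.41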

Article ID: 1787483

Article link: https://www.wllwen.com/shoufeilunwen/xxkjbs/1787483.html

