Human Action Recognition Based on Skeleton Information

Published: 2018-04-15 14:03

Topics: human action recognition + human skeleton; Source: University of Science and Technology of China, master's thesis, 2017

【Abstract】: Human action recognition is an active research direction in computer vision. Its main goal is to correctly classify the human actions that appear in a video. The technology can be applied to intelligent video surveillance, natural human-computer interaction, sports video analysis, autonomous driving, and other areas. However, constructing effective features to describe human actions in video has long been a very challenging problem. By mining human skeleton information in depth, this thesis proposes a set of dynamics and relation features based on joint positions. The set consists of 4 major categories comprising 36 sub-features.

1. Joint dynamics features: this category consists of 13 sub-features: velocity, acceleration, angular velocity, angular acceleration, speed (速率), scalar acceleration (加速率), kinetic energy, change in kinetic energy, gravitational potential energy, change in gravitational potential energy, total energy, change in total energy, and normalized position. Starting from joint motion and energy change, these features thoroughly mine the dynamics information of the human skeleton.

2. Correlation features: this category consists of 5 sub-features: velocity correlation, acceleration correlation, angular velocity correlation, angular acceleration correlation, and energy-change correlation. These features describe the motion correlation and the energy-change correlation between any pair of joints.

3. Distance relation features: this category consists of 12 sub-features: the horizontal distance relation and its trajectory, the vertical distance relation and its trajectory, the directional sine distance relation and its trajectory, the directional cosine distance relation and its trajectory, the eigenvector-direction distance relation (特征向量方向距离关系) and its trajectory, and the connectivity distance relation and its trajectory. These features describe the distance relation between any pair of joints along a specific direction.

4. Geometric relation features: this category consists of 6 sub-features: the joint-vector inner product and its trajectory, the joint-vector cosine similarity and its trajectory, and the joint-triangle area-to-perimeter ratio and its trajectory. These features describe the geometric relation among any three joints.

Merging these features yields the dynamics and relation features based on joint positions. The thesis compares all of the sub-features comprehensively. The feature set achieves good results on the JHMDB, sub-JHMDB, and Penn Action datasets. In addition, because every stage of an action recognition system affects the final recognition result, the thesis explores an action recognition framework suited to these features. The most suitable bag-of-words model is the one based on K-means clustering and vector quantization, and the most effective classifier is a support vector machine with a multi-channel RBF-χ² kernel. In summary, by fully mining skeleton information, the thesis proposes a set of dynamics and relation features based on joint positions and explores the bag-of-words model and classification model that suit them. Extensive experiments verify the effectiveness of the features and provide suggestions for further research on skeleton-based human action recognition.
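The pipeline described in the abstract (joint dynamics features, a bag of words built with K-means and vector quantization, and a multi-channel RBF-χ² kernel) can be sketched roughly as follows. This is a minimal NumPy illustration, not the thesis's implementation. The abstract does not give exact definitions, so the function names (`joint_dynamics`, `kmeans`, `bow_histogram`, `multichannel_rbf_chi2`, and the helpers), the frame rate `fps`, the unit joint mass, the choice of only three of the 36 sub-features, and the per-channel normalization by the mean χ² distance are all assumptions.

```python
import numpy as np

def joint_dynamics(joints, fps=30.0):
    """joints: float array (T, J, 2) of 2D joint positions per frame.
    Returns three illustrative feature channels, one row per frame."""
    vel = np.diff(joints, axis=0) * fps        # finite-difference velocity, (T-1, J, 2)
    acc = np.diff(vel, axis=0) * fps           # finite-difference acceleration, (T-2, J, 2)
    vel = vel[1:]                              # drop first row so both have T-2 rows
    speed = np.linalg.norm(vel, axis=-1)       # per-joint speed, (T-2, J)
    kinetic = 0.5 * speed ** 2                 # kinetic energy, unit mass assumed
    return {
        "velocity": vel.reshape(len(vel), -1),
        "acceleration": acc.reshape(len(acc), -1),
        "kinetic": kinetic,
    }

def quantize(X, centers):
    """Vector quantization: index of the nearest codeword for each row of X."""
    d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=-1)
    return d2.argmin(axis=1)

def kmeans(X, k, iters=20, seed=0):
    """Plain Lloyd's K-means; returns a (k, D) codebook."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)].copy()
    for _ in range(iters):
        labels = quantize(X, centers)
        for j in range(k):
            members = X[labels == j]
            if len(members):                   # keep the old center if a cluster is empty
                centers[j] = members.mean(axis=0)
    return centers

def bow_histogram(X, centers):
    """L1-normalized histogram of codeword counts for one video."""
    counts = np.bincount(quantize(X, centers), minlength=len(centers)).astype(float)
    return counts / counts.sum()

def chi2_distance(H1, H2, eps=1e-10):
    """Pairwise chi-square distances between rows of H1 (n, k) and H2 (m, k)."""
    diff = H1[:, None, :] - H2[None, :, :]
    total = H1[:, None, :] + H2[None, :, :] + eps
    return 0.5 * (diff ** 2 / total).sum(axis=-1)

def channel_scale(H):
    """Assumed normalizer: mean chi-square distance among training histograms."""
    return max(chi2_distance(H, H).mean(), 1e-12)

def multichannel_rbf_chi2(channels_a, channels_b, scales):
    """K = exp(-sum_c D_c / A_c), with one histogram matrix per channel c."""
    exponent = sum(
        chi2_distance(Ha, Hb) / A
        for Ha, Hb, A in zip(channels_a, channels_b, scales)
    )
    return np.exp(-exponent)
```

Under these assumptions, each feature channel gets its own codebook and histogram, and the channels are combined only inside the kernel. The resulting Gram matrix could then be handed to an SVM that accepts precomputed kernels, for example scikit-learn's `SVC(kernel="precomputed")`.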
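【Degree-granting institution】: University of Science and Technology of China
【Degree level】: Master's
【Year degree granted】: 2017
【Classification number】: TP391.41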

【Similar literature】