Character Behavior Analysis and Social Relationship Recognition Based on Video Deep Learning

Published: 2018-03-04 10:08

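Topic: character behavior semantics; Focus: social relationships; Source: Nanjing University of Posts and Telecommunications, 2017 master's thesis; Type: degree thesis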

[Abstract]: Recognizing character behavior and social relationships in video is an important task in understanding video semantics. The main difficulty lies in how to use deep learning and related algorithms to analyze and integrate the semantic cues in a video that relate to character behavior. In recent years, conventional deep learning algorithms have achieved outstanding results in recognizing simple static images, but they still cannot meet the requirements of recognizing complex character behavior and social relationships in video. This thesis takes the recognition of character behavior semantics and social relationships in video as its research goal. It first proposes a semantic recognition algorithm based on the long short-term memory (LSTM) model to recognize character behavior in video, then uses a node clustering algorithm based on an undirected weighted graph to group the characters in a video socially, and finally infers the social relationships among the characters with a partially labeled factor graph model (SPLP-FGM). Character behavior semantics recognition experiments are carried out on two datasets, the Microsoft Video Description Corpus and a movie description corpus, and social relationship recognition experiments are carried out on a dataset built from the TV series Friends. The experimental results show that the proposed LSTM-based semantic recognition algorithm recognizes the behavior semantics of characters in video efficiently and comprehensively, and that the partially labeled factor graph model effectively recognizes the social relationships between characters in video. The contributions of this thesis are threefold: (1) convolutional neural networks extract, in parallel, three kinds of mid-level semantic features from each video scene (character identity, character action, and context), and a two-layer recurrent neural network fuses these three kinds of semantic information to recognize character behavior semantics in video; (2) the social interactions of the characters in a video are mapped onto an undirected weighted graph, and a node clustering algorithm on that graph produces the social grouping of the characters; (3) building on the social grouping and the recognized behavior semantics, a partially labeled factor graph model is constructed and learned to infer all unknown social relationships among the characters in the video.
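As a rough illustration of the social grouping step, the sketch below maps characters onto an undirected weighted graph and clusters its nodes. The abstract does not specify the thesis's clustering algorithm, so everything here is an assumption: edge weights are taken to be scene co-occurrence counts, and groups are taken to be the connected components that remain after dropping edges lighter than a threshold. The function names, the `min_weight` parameter, and the scene data are hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def build_cooccurrence_graph(scenes):
    """Return {frozenset({a, b}): weight}; weight = number of shared scenes.

    Assumption: co-occurrence counts stand in for the thesis's edge weights.
    """
    weights = defaultdict(int)
    for cast in scenes:
        for a, b in combinations(sorted(set(cast)), 2):
            weights[frozenset((a, b))] += 1
    return dict(weights)


def social_groups(scenes, min_weight=2):
    """Cluster characters joined by edges of weight >= min_weight.

    Assumption: thresholded connected components (via union-find) stand in
    for the node clustering algorithm, which the abstract does not describe.
    """
    parent = {name: name for cast in scenes for name in cast}

    def find(x):
        # Follow parent links to the root, shortening the path as we go.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for edge, weight in build_cooccurrence_graph(scenes).items():
        if weight >= min_weight:
            a, b = tuple(edge)
            parent[find(a)] = find(b)

    groups = defaultdict(set)
    for name in parent:
        groups[find(name)].add(name)
    return sorted((sorted(g) for g in groups.values()), key=lambda g: g[0])


# Hypothetical scene casts, invented purely for illustration.
scenes = [
    ["Ann", "Bob"],
    ["Ann", "Bob", "Cat"],
    ["Cat", "Dan"],
    ["Cat", "Dan"],
]
groups = social_groups(scenes, min_weight=2)
print(groups)  # [['Ann', 'Bob'], ['Cat', 'Dan']]
```

Lowering `min_weight` to 1 keeps the single-scene links through Cat, so all four characters merge into one group. This shows how sensitive such a grouping is to the weighting rule.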
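[Degree-granting institution]: Nanjing University of Posts and Telecommunications
[Degree level]: Master's
[Year conferred]: 2017
[Classification number]: TP391.41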
