Research on a Hash-Based Multi-Object Tracking Algorithm

Published: 2018-01-06 00:34

Keywords: Research on a Hash-Based Multi-Object Tracking Algorithm; Source: Anhui University, 2017 master's thesis; Type: degree thesis

【Abstract】: With the rapid development of image processing, multi-object tracking, one of its important research directions, has made great progress, and it is now applied successfully to real-time video scene analysis in areas such as autonomous driving and unmanned aerial vehicles. Multi-object tracking algorithms nevertheless still face several difficulties: occlusion, an uncertain number of targets, data association, and real-time requirements.

To handle the uncertain number of targets, a high-performing object detector must process the video sequence to obtain the positions and number of targets in each frame. This thesis therefore first performs pedestrian detection on video sequences using a convolutional neural network combined with the selective search algorithm. Traditional object detection algorithms generally extract hand-crafted features of the target, train a classifier on those features, and then use a sliding window to generate candidate regions and classify them. This approach has two drawbacks. First, hand-crafted feature extraction is complex and requires the designer to have prior knowledge in order to obtain features that describe the target well. Second, feature extraction and classification are separate stages, so if the extracted features are not descriptive enough, the classifier cannot perform well either. By contrast, a convolutional neural network needs no complex hand-crafted features: it takes sample images as input directly and learns more natural and more general features through convolution, and the learned features have a degree of invariance to deformation. For these reasons convolutional neural networks are widely used in object detection. This thesis modifies the classic LeNet-5 convolutional neural network model and builds the network with the Caffe framework. A dataset is constructed from samples selected from commonly used pedestrian detection datasets, and comparative experiments on it show that applying a convolutional neural network to pedestrian detection achieves good results.

To handle occlusion, the thesis divides the lifetime of a tracked target, from its appearance to its departure from the camera's field of view, into four states: initial, tracking, lost, and ended. Targets in different states are processed differently. To address data association and real-time requirements, the thesis uses a hashing algorithm to encode the image features of each detection into a hash code. The inner product of hash codes then measures the similarity between the detections in the current frame and the targets tracked at the previous time step, and the pairing with the highest similarity is selected to complete the association. This reduces the complexity of the algorithm while still accomplishing the association. To improve association accuracy, and based on the prior knowledge that a tracked target must be spatially continuous across consecutive frames, the centroid distance between a current-frame detection and a previous-frame tracked target is also used as a similarity measure.

Finally, experiments on the MOT Benchmark dataset compare the method with other multi-object tracking algorithms to verify its effectiveness. The thesis closes by summarizing the research and, based on the experimental results, proposing next steps to address the shortcomings of the proposed hash-based multi-object tracking algorithm.
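The four-state target lifecycle described in the abstract can be sketched as a small state machine. The abstract names only the states (initial, tracking, lost, ended), so the transition rules below are assumptions chosen for illustration: the `confirm_hits` and `max_misses` thresholds, the rule that an unconfirmed target ends on its first miss, and the names `State` and `Track` are all hypothetical and are not taken from the thesis.

```python
from dataclasses import dataclass
from enum import Enum


class State(Enum):
    INITIAL = "initial"    # newly detected, not yet confirmed
    TRACKING = "tracking"  # confirmed and currently matched
    LOST = "lost"          # temporarily unmatched, e.g. occluded
    ENDED = "ended"        # left the scene; no further updates


@dataclass
class Track:
    track_id: int
    state: State = State.INITIAL
    hits: int = 1    # consecutive frames with a matched detection
    misses: int = 0  # consecutive frames without a matched detection

    def update(self, matched: bool, confirm_hits: int = 3,
               max_misses: int = 10) -> State:
        """Advance the state by one frame (assumed transition rules)."""
        if self.state is State.ENDED:
            return self.state
        if matched:
            self.hits += 1
            self.misses = 0
            # A lost target recovers at once; a new one needs confirmation.
            if self.state is State.LOST or self.hits >= confirm_hits:
                self.state = State.TRACKING
        else:
            self.hits = 0
            self.misses += 1
            if self.state is State.INITIAL or self.misses > max_misses:
                # Unconfirmed targets and long-lost targets are dropped.
                self.state = State.ENDED
            else:
                self.state = State.LOST
        return self.state
```

Keeping a target in the lost state for a bounded number of frames, instead of deleting it on the first miss, is what lets it keep its identity when it reappears after a short occlusion.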
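The association step, hash-code inner product combined with centroid distance, can be illustrated with a minimal sketch. The abstract does not say which hashing algorithm is used or how the two measures are combined, so several parts here are assumptions: random-projection sign hashing stands in for the thesis's hash function, the score subtracts a weighted, normalized centroid distance from the scaled inner product, and matching is greedy by descending score. All function names and the parameters `max_dist`, `dist_weight`, and `min_score` are hypothetical.

```python
import math
import random
from typing import List, Sequence, Tuple

Code = List[int]             # hash code with entries in {-1, +1}
Point = Tuple[float, float]  # (x, y) centroid in pixels


def make_projections(dim: int, bits: int, seed: int = 0) -> List[List[float]]:
    """Random hyperplanes for sign hashing (an assumed stand-in hash)."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(bits)]


def hash_code(feature: Sequence[float],
              projections: Sequence[Sequence[float]]) -> Code:
    """Encode a feature vector as a +/-1 code, one bit per hyperplane."""
    return [1 if sum(w * x for w, x in zip(row, feature)) >= 0 else -1
            for row in projections]


def code_similarity(a: Code, b: Code) -> float:
    """Inner product of two codes, scaled to the range [-1, 1]."""
    return sum(x * y for x, y in zip(a, b)) / len(a)


def associate(tracks: Sequence[Tuple[Code, Point]],
              detections: Sequence[Tuple[Code, Point]],
              max_dist: float = 50.0, dist_weight: float = 0.5,
              min_score: float = 0.0) -> List[Tuple[int, int]]:
    """Greedily pair tracks with detections by descending combined score.

    Returns (track_index, detection_index) pairs.
    """
    scored = []
    for ti, (t_code, t_centroid) in enumerate(tracks):
        for di, (d_code, d_centroid) in enumerate(detections):
            dist = math.dist(t_centroid, d_centroid)
            if dist > max_dist:
                continue  # spatial-continuity gate: too far to be the same target
            score = code_similarity(t_code, d_code) - dist_weight * dist / max_dist
            if score >= min_score:
                scored.append((score, ti, di))
    scored.sort(reverse=True)
    used_tracks, used_dets, pairs = set(), set(), []
    for _, ti, di in scored:
        if ti not in used_tracks and di not in used_dets:
            pairs.append((ti, di))
            used_tracks.add(ti)
            used_dets.add(di)
    return pairs
```

Comparing short binary codes costs one multiply-add per bit, which is cheaper than comparing the full image features and is one way to read the abstract's claim about reduced complexity. A globally optimal assignment (for example, the Hungarian algorithm) could replace the greedy loop at higher cost.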

【Degree-granting institution】: Anhui University
【Degree level】: Master's
【Year degree granted】: 2017
【Classification number】: TP391.41

【References】