Research on Video Summarization Based on Background Modeling and Attribute Learning
Published: 2018-10-16 16:33
[Abstract]: With the spread of high-definition cameras, the rise of the Internet of Things, and the launch of Safe City and Smart City initiatives, surveillance cameras have been deployed in every corner of the city. Surveillance equipment plays an important role in fighting crime and maintaining long-term social stability. However, the massive volume of video data poses serious challenges for storage and archiving and for browsing and retrieval. Traditional direct storage and manual retrieval can no longer meet the demands of large-scale video processing, and how to store and retrieve massive video collections has become a hot research topic among scholars at home and abroad. This thesis therefore investigates these two problems.
After surveying a large body of domestic and international literature on video storage and retrieval, the thesis analyzes the state of research in depth. It identifies the main difficulties of current work: how to detect foreground objects in surveillance video accurately and without omission; how to perform multi-concept detection on the foreground once it has been detected; and how to bridge the semantic gap when classifying and describing multi-concept objects. On this basis, the thesis proposes a video summarization method based on background detection and attribute learning. An improved ViBe algorithm models the background of the video sequence; frames that contain no foreground objects are removed and the remaining frames are kept to produce a condensed video, which reduces the storage burden of video files. Once foreground objects are obtained, attribute classifiers are built and attribute learning is used to detect concepts in the foreground objects. Each detected object is then described with attribute labels, and a video summary is generated on top of the condensed video. The main contents of the thesis are as follows:
(1) Video background modeling and condensation based on an improved ViBe. After comparing video background modeling algorithms, ViBe is chosen because it is faster and uses less memory than other mainstream methods. The original ViBe still produces noise points and flickering points in real surveillance scenes and introduces ghosts during initialization. To address these problems, the thesis proposes a flicker-point removal method based on a count threshold, a morphology-based noise removal method, and an improved algorithm for ghost-region detection and suppression. After the improvements are implemented and experimentally validated, the improved ViBe is applied to foreground extraction and video condensation: the video background is modeled to obtain foreground objects, and useless frames without foreground objects are then dropped to remove redundancy along the time dimension, condensing the video.
(2) Foreground multi-concept detection and summarization based on multi-kernel attribute learning. Multi-kernel learning is first introduced into the direct attribute prediction (DAP) framework, and a method for optimizing the kernel weight vector is given. The proposed model is then applied to video object classification. Next, the model's multi-concept classification and attribute description capabilities are used to detect multiple concepts in the surveillance video foreground and to attach attribute labels to the detected objects, generating the video summary. Finally, comparative experiments verify the effectiveness of the proposed method.
(3) Building on the two research points above, a prototype video summarization system based on background modeling and attribute learning is built using an object-oriented software engineering approach. The system comprises a video condensation module, an attribute prediction model training module, and a video summarization module. It runs well and meets the expected goals of the research.
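To make the condensation step in item (1) concrete, here is a minimal Python sketch. It assumes that per-frame binary foreground masks are already available from some background model; ViBe itself, the flicker-point removal, and the ghost suppression are not implemented here. The function names `open_mask` and `condense`, the 3x3 structuring element, and the `min_foreground_pixels` threshold are illustrative assumptions, not details taken from the thesis.

```python
from typing import List, Sequence

Mask = List[List[int]]  # binary mask: 1 = foreground, 0 = background


def _apply_3x3(mask: Mask, require_all: bool) -> Mask:
    """Erode (require_all=True) or dilate (require_all=False) with a 3x3 square.

    Pixels outside the image are treated as background.
    """
    h, w = len(mask), len(mask[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            window = [
                mask[ny][nx] if 0 <= ny < h and 0 <= nx < w else 0
                for ny in (y - 1, y, y + 1)
                for nx in (x - 1, x, x + 1)
            ]
            out[y][x] = int(all(window)) if require_all else int(any(window))
    return out


def open_mask(mask: Mask) -> Mask:
    """Morphological opening (erosion, then dilation) to drop isolated noise."""
    eroded = _apply_3x3(mask, require_all=True)
    return _apply_3x3(eroded, require_all=False)


def condense(masks: Sequence[Mask], min_foreground_pixels: int = 1) -> List[int]:
    """Return the indices of frames that still contain foreground after opening."""
    kept = []
    for index, mask in enumerate(masks):
        cleaned = open_mask(mask)
        if sum(map(sum, cleaned)) >= min_foreground_pixels:
            kept.append(index)
    return kept
```

In this sketch a frame whose mask contains only an isolated noise pixel is treated as empty and dropped, while a frame with a solid foreground blob is kept. A real pipeline would write the kept frames to a new video file in their original order.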
[Degree-granting institution]: Jiangsu University
[Degree level]: Master's
[Year degree awarded]: 2017
[Classification number]: TP391.41
Article ID: 2274982
Link to this article: https://www.wllwen.com/shoufeilunwen/xixikjs/2274982.html