Surveillance Video Structuring and Retrieval in Cross-View Camera Networks

Published: 2018-11-27 10:12
[Abstract]: Video surveillance is an important monitoring tool in urban public safety. As the number of surveillance cameras and the volume of surveillance video grow rapidly, traditional monitoring based on manual operation can no longer keep up with demand, and video surveillance technology based on intelligent algorithms is urgently needed. The key problems in intelligent video surveillance are "surveillance video content structuring" and "surveillance object retrieval". Around these two problems, this thesis (1) studies crowd target tracking to address the acquisition of target metadata in surveillance video content structuring; (2) studies image multi-attribute recognition to address target understanding and description in surveillance video content structuring; and (3) studies cross-view pedestrian group re-identification to address image-based retrieval in surveillance object retrieval. Crowd target tracking yields the motion video clip and trajectory of each pedestrian, providing important material for subsequent analysis and processing. Image multi-attribute recognition generates a high-level semantic description for each surveillance object, which on the one hand supplies high-level semantic features for image-based retrieval and on the other hand makes retrieval based on natural language possible. Cross-view pedestrian group re-identification is an important complement to single-pedestrian re-identification and provides an important technical basis for cross-view pedestrian retrieval based on pedestrian appearance features (non-face) in video surveillance.

The main research work and contributions of this thesis are as follows:

(1) A crowd target tracking algorithm based on group relationship evolution is proposed. The algorithm integrates low-level keypoint tracking, mid-level image patch detection and tracking, and high-level group relationship evolution into a unified framework. Unlike previous approaches that compute optical flow, track keypoints, or detect pedestrian targets, this thesis proposes representing the crowd as a set of image patches that are distinctive and stable in appearance. At the low level, keypoint tracking provides very accurate local trajectory information, which can be used to detect image patches and to infer the group relationships within the crowd. At the mid level, the proposed hierarchical tree structure is used to model and learn the spatial relationships among image patches. At the high level, the evolution of group relationships allows the hierarchical tree structure to be updated dynamically through operations such as splitting and merging. Experimental results show that the proposed image patch detection method provides important auxiliary information for tracking a given target; that the proposed dynamic hierarchical tree structure effectively learns the spatial relationships among targets; and that the proposed algorithm significantly improves the accuracy of crowd target tracking.

(2) An image multi-attribute recognition algorithm based on spatial geometric relationships is proposed. The algorithm uses a deep convolutional neural network that can be trained end to end to learn the spatial and semantic relationships among attributes simultaneously, using only the image's attribute labels as the training supervision signal. Specifically, for an input image, the proposed Spatial Regularization Network (SRN) generates an attention map for each possible attribute label and learns the spatial and semantic relationships among attributes from these attention maps. Finally, the per-attribute confidence scores produced by the SRN are summed with the confidence scores produced by a base convolutional neural network (for example, the residual network ResNet-101) to correct the attribute confidence scores. Experimental results on several public datasets of different types show that the SRN effectively learns the spatial geometric relationships among attributes in an image, and that these relationships significantly improve the accuracy of image multi-attribute recognition.

(3) A pedestrian group re-identification algorithm based on patch matching is proposed. Compared with single-pedestrian re-identification, pedestrian group re-identification faces additional problems, such as severe mutual occlusion among pedestrians within a group and changes in the relative positions of group members across camera views. To address these problems, this thesis models pedestrian group re-identification as a matching problem between two sets of image patches. First, the proposed saliency channel filters out patch matches that have low appearance similarity or lack discriminative power. Then, for the resulting candidate matches, the proposed spatial consistency matching performs a further screening that filters out patch matches whose spatial relationships are inconsistent, which finally yields the similarity between the two images. Experimental results show that the proposed algorithm significantly outperforms current mainstream object re-identification algorithms, and that its two components (the saliency channel and spatial consistency matching) reinforce each other in improving pedestrian group re-identification performance.
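The score correction described in contribution (2) of the abstract can be illustrated with a small sketch: each attribute label gets an attention map, an attention-weighted score is computed per label, and that score is summed with the base network's score. This is not the thesis's implementation. The function names, the flattened spatial layout, the softmax normalization of the attention maps, and the toy numbers are all assumptions made for illustration; the abstract only states that per-label attention maps are generated and that the two sets of confidence scores are summed.

```python
import math


def softmax(values):
    """Numerically stable softmax over a flat list of floats."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pooled_scores(attention_logits, score_maps):
    """Return one score per label as an attention-weighted average.

    attention_logits[k][p] and score_maps[k][p] are indexed by label k and
    flattened spatial position p (a hypothetical layout chosen for this sketch).
    """
    pooled = []
    for logits, scores in zip(attention_logits, score_maps):
        weights = softmax(logits)  # assumed normalization over positions
        pooled.append(sum(w * s for w, s in zip(weights, scores)))
    return pooled


def fuse_confidences(base_scores, attention_scores):
    """Sum the base-network score and the attention-branch score per label."""
    return [b + a for b, a in zip(base_scores, attention_scores)]


# Toy input: two labels on a 2x2 feature grid flattened to four positions.
attention_logits = [[0.0, 0.0, 0.0, 0.0],   # label 0: uniform attention
                    [5.0, 0.0, 0.0, 0.0]]   # label 1: focused on position 0
score_maps = [[1.0, 2.0, 3.0, 4.0],
              [2.0, -1.0, -1.0, -1.0]]
base_scores = [0.5, -0.2]  # stand-in for the base CNN's per-label scores

fused = fuse_confidences(base_scores,
                         attention_pooled_scores(attention_logits, score_maps))
print(fused)
```

In the toy input, label 1 has a weak base score, but its attention map concentrates on the one position with a high score, so the summed score rises. That is the kind of correction the abstract attributes to the attention branch.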
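[Degree-granting institution]: University of Science and Technology of China
[Degree level]: Doctorate
[Year degree granted]: 2017
[Classification number]: TP391.41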

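[Citation]: Zhu Feng (朱烽). Surveillance Video Structuring and Retrieval in Cross-View Camera Networks [D]. University of Science and Technology of China, 2017.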

Article ID: 2360371


Article link: https://www.wllwen.com/kejilunwen/ruanjiangongchenglunwen/2360371.html

