基于镜头及场景上下文的短视频标注方法研究 (Research on Short-Video Annotation Methods Based on Shot and Scene Context)
[Abstract]: With the rapid development of digital media, communication, and network technologies, the volume of digital media information, of which video is the prime example, is growing rapidly. Short videos are a class of video data with rich and varied content, and finding useful information in large collections of them is a long-standing user concern that has given rise to applications such as video indexing and video retrieval. Video annotation is the core of these applications and has become a hot research topic in digital media applications and computer vision. From a semantic point of view, a video can be divided into several semantic units, each with its own semantic connotation, so annotation can be performed at each semantic level. Building on an in-depth analysis of video structure, this thesis segments video into different semantic units and annotates short videos at both the shot semantic layer and the scene semantic layer. The main contributions and innovations are as follows:

(1) Combining global and local features of video frames, a novel shot boundary detection method is proposed that fuses video dynamic texture with SIFT features. Two adjacent frames are partitioned into uniform blocks, the average gradient of each block is computed in RGB color space, and the average gradients of all blocks together form the video dynamic texture. A shot change is detected by comparing the dynamic textures of adjacent frames and by matching their SIFT features (a hedged code sketch follows the abstract). The method detects shot boundaries in different types of video data with high accuracy.

(2) A video semantic annotation model based on shot events is proposed. Building on the analysis of video structure, the moving object of a shot and the background color features of its key frame are extracted to express the shot's event, which extends to the expression of scene events; ultimately, the collection of all events constitutes the topic of a video clip. The model takes the event group, composed of the shot's moving object and its environment background, as the annotation result; it captures the semantic connotation of a shot and improves the accuracy of video semantic expression.

(3) A new video annotation method based on semi-supervised clustering is proposed. With the shot event as the unit, videos are annotated with event groups. To reduce the dependence of annotation on labeled samples, a semi-supervised K-means clustering algorithm is constructed following the idea of semi-supervised learning, and its objective function is optimized so that the final clusters exhibit not only low inter-class coupling and high intra-class cohesion but also reflect the local data-density distribution within each class (a seeded-K-means sketch follows the abstract). The algorithm clusters multi-attribute heterogeneous data such as video and improves annotation accuracy.

(4) A new context-based multiple-kernel-learning method for video classification is proposed. Extending the traditional bag-of-words model, a video scene classification model is built from the correlation between the key frames of adjacent shots. First, the video is segmented into shots, key frames are extracted, and the key-frame images are normalized.
The key-frame images are then used as image blocks to synthesize a new image that preserves their temporal relation; SIFT features and HSV color features are extracted from this new image and mapped into a Hilbert space. Through multiple-kernel learning, suitable groups of kernel functions are selected to train on each image, yielding a classification model with improved classification performance (a kernel-combination sketch follows the abstract).

These results can be applied widely to video classification, video indexing, video retrieval, video content understanding, video data management, and related fields, and they have both important theoretical significance and high application value.
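To make contribution (1) concrete, below is a minimal Python sketch of the two cues, assuming OpenCV and NumPy. The block count, the thresholds, and the rule for fusing the two cues are illustrative assumptions, not the thesis's actual parameters.

```python
# Sketch of shot-boundary detection: block-gradient "dynamic texture" + SIFT matching.
import cv2
import numpy as np

def dynamic_texture(frame_bgr, blocks=8):
    """Average gradient magnitude of each image block, per color channel."""
    h, w, _ = frame_bgr.shape
    bh, bw = h // blocks, w // blocks
    texture = np.zeros((blocks, blocks, 3), dtype=np.float32)
    for c in range(3):
        gx = cv2.Sobel(frame_bgr[:, :, c], cv2.CV_32F, 1, 0)
        gy = cv2.Sobel(frame_bgr[:, :, c], cv2.CV_32F, 0, 1)
        mag = np.sqrt(gx ** 2 + gy ** 2)
        for i in range(blocks):
            for j in range(blocks):
                texture[i, j, c] = mag[i * bh:(i + 1) * bh, j * bw:(j + 1) * bw].mean()
    return texture

def sift_match_ratio(frame_a, frame_b):
    """Fraction of frame_a's SIFT keypoints matched in frame_b (Lowe ratio test)."""
    sift = cv2.SIFT_create()
    _, da = sift.detectAndCompute(cv2.cvtColor(frame_a, cv2.COLOR_BGR2GRAY), None)
    _, db = sift.detectAndCompute(cv2.cvtColor(frame_b, cv2.COLOR_BGR2GRAY), None)
    if da is None or db is None:
        return 0.0
    pairs = cv2.BFMatcher().knnMatch(da, db, k=2)
    good = [p[0] for p in pairs if len(p) == 2 and p[0].distance < 0.75 * p[1].distance]
    return len(good) / max(len(da), 1)

def is_shot_boundary(prev, cur, tex_thresh=25.0, match_thresh=0.1):
    """Declare a cut only when the global and the local cue agree (assumed rule)."""
    tex_dist = np.abs(dynamic_texture(prev) - dynamic_texture(cur)).mean()
    return tex_dist > tex_thresh and sift_match_ratio(prev, cur) < match_thresh
```

Requiring both cues to agree follows the abstract's idea that the global texture cue and the local SIFT cue complement each other; in practice the two thresholds would need to be tuned per video type.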
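For contribution (3), the thesis optimizes a density-aware semi-supervised objective; the sketch below, assuming only NumPy, shows just the basic seeded variant of semi-supervised K-means, in which labeled samples initialize the centroids and remain pinned to their own class's cluster. All names and parameters are illustrative.

```python
# Seeded semi-supervised K-means: labeled samples seed and constrain the clusters.
import numpy as np

def seeded_kmeans(X, seed_idx, seed_labels, k, iters=100):
    """X: (n, d) features; seed_idx / seed_labels: indices of labeled samples
    and their classes in {0..k-1}. Assumes at least one seed per class.
    Returns a cluster assignment for every sample."""
    # Initialize each centroid as the mean of its class's labeled seeds.
    centers = np.stack([X[seed_idx[seed_labels == c]].mean(axis=0) for c in range(k)])
    labels = np.zeros(len(X), dtype=int)
    for _ in range(iters):
        # Assign every point to its nearest centroid ...
        d = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = d.argmin(axis=1)
        # ... but labeled seeds always stay in their own class's cluster.
        labels[seed_idx] = seed_labels
        new_centers = np.stack([
            X[labels == c].mean(axis=0) if np.any(labels == c) else centers[c]
            for c in range(k)])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels
```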
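For contribution (4), the sketch below illustrates the kernel-combination idea with scikit-learn: a kernel on SIFT bag-of-words histograms is combined with a kernel on HSV color histograms and fed to a precomputed-kernel SVM. Genuine multiple-kernel learning would optimize the kernel weights jointly with the classifier; here the weight alpha is fixed by hand, and the kernel choices (chi-square and RBF) are assumptions rather than the thesis's actual configuration.

```python
# Two-kernel combination over SIFT BoW and HSV histogram features (simplified MKL).
import numpy as np
from sklearn.svm import SVC
from sklearn.metrics.pairwise import chi2_kernel, rbf_kernel

def combined_kernel(sift_a, hsv_a, sift_b=None, hsv_b=None, alpha=0.5):
    """K(A, B) = alpha * chi2(SIFT BoW) + (1 - alpha) * RBF(HSV histogram)."""
    return alpha * chi2_kernel(sift_a, sift_b) + (1 - alpha) * rbf_kernel(hsv_a, hsv_b)

# Hypothetical data: one synthesized key-frame image per video, represented by
# a 200-bin SIFT bag-of-words histogram and a 64-bin HSV color histogram.
rng = np.random.default_rng(0)
sift_bow = rng.random((40, 200))
hsv_hist = rng.random((40, 64))
y = rng.integers(0, 2, size=40)

clf = SVC(kernel="precomputed")
clf.fit(combined_kernel(sift_bow, hsv_hist), y)
# Prediction needs the kernel between test and training samples, K(test, train);
# here the training set doubles as a test set for a sanity check.
preds = clf.predict(combined_kernel(sift_bow, hsv_hist, sift_bow, hsv_hist))
```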
【Degree-granting institution】: 上海大学 (Shanghai University)
【Degree level】: Doctoral (博士)
【Year conferred】: 2016
【CLC classification number】: TP391.41
【Similar Literature】
Related journal articles (top 10):
1 陆懿; 陈光梦; 毕宏杰; 董栋. An improved synthesis algorithm for natural dynamic textures [J]. 计算机工程与设计, 2008(14).
2 姚伟光; 王赢; 许存禄. A new method applying local binary patterns to dynamic texture recognition [J]. 微计算机信息, 2010(09).
3 陈昌红; 赵恒; 胡海虹; 梁继民. Human motion analysis based on an improved dynamic texture model [J]. 模式识别与人工智能, 2010(02).
4 陈青; 朱俊宇; 唐朝晖; 刘金平; 桂卫华. Dynamic texture modeling for condition recognition and analysis in sulfur flotation [J]. 计算机与应用化学, 2013(10).
5 邵婧; 王冠香; 郭蔚. Fire detection based on video dynamic textures [J]. 中国图象图形学报, 2013(06).
6 陈红倩; 陈谊; 曹健; 刘鹂. Real-time forest rendering based on dynamic texture techniques [J]. 计算机仿真, 2012(06).
7 何莎; 费树岷. Modeling of dynamic texture backgrounds [J]. 计算机应用, 2009(S2).
8 邹运兰; 王仁芳. Real-time water-surface simulation based on multi-texture and dynamic texture techniques [J]. 浙江万里学院学报, 2010(06).
9 陈红倩; 李凤霞; 黄天羽; 战守义. A dynamic-texture-based visualization method for motion scenes [J]. 北京理工大学学报, 2009(06).
10 于鑫; 韩勇; 陈戈. Flame effect simulation based on dynamic textures and particle systems [J]. 信息与电脑(理论版), 2009(11).
Related conference papers (top 1):
1 陆懿; 陈光梦. An improved synthesis algorithm for color dynamic textures [A]. 中国仪器仪表学会第九届青年学术会议论文集 [C]. 2007.
Related doctoral dissertations (top 3):
1 王勇. Dynamic texture recognition based on chaotic feature vectors [D]. 上海交通大学, 2014.
2 彭太乐. Research on short-video annotation methods based on shot and scene context [D]. 上海大学, 2016.
3 周丙寅. Tensor decomposition and its applications to dynamic textures [D]. 河北师范大学, 2012.
Related master's theses (top 10):
1 陆懿. An improved dynamic texture recognition algorithm based on nonlinear models [D]. 复旦大学, 2008.
2 徐磊磊. Research on dynamic texture properties and their simulation algorithms [D]. 华中科技大学, 2007.
3 姚伟光. A new dynamic texture description method based on local binary motion patterns [D]. 兰州大学, 2009.
4 周文玲. Research on recognition and reconstruction of dynamic textures in augmented reality [D]. 华东师范大学, 2011.
5 刘霞. Research and implementation of dynamic textures for natural scenery simulation [D]. 国防科学技术大学, 2005.
6 丁悦. Dynamic texture analysis with data-driven Markov chain Monte Carlo models [D]. 南京理工大学, 2007.
7 曹寿刚. Research on video classification techniques based on Lie group theory and dynamic textures [D]. 华中科技大学, 2013.
8 高平. Research on dynamic texture recognition based on extended statistical landscape features [D]. 兰州大学, 2009.
9 施濵. Research on dynamic textures based on spatiotemporal oriented energy [D]. 上海交通大学, 2012.
10 张茜. Research on synthesis techniques for flowing-water effects based on dynamic textures [D]. 山东大学, 2006.