Social Multimedia Content Analysis and Summarization

Published: 2018-05-09 06:54

Topics: social multimedia + content analysis; source: Tsinghua University doctoral dissertation, 2013

[Abstract]: Semantic understanding of multimedia information is a classic problem in computer science and technology. Traditional multimedia content analysis and summarization techniques focus on pure content analysis, and their data come mainly from professional websites and are generally of high quality. With the rise of social networks, social multimedia content is now generated mainly by users; it is large in scale, low in quality, and subject to strong demands for socialization and personalization. Users cannot efficiently extract the information they care about from such massive data, so automatic semantic analysis and summarization of social multimedia is increasingly important. Social multimedia analysis and summarization face three challenges. First, user-generated content (UGC) has caused the volume of multimedia information to surge, making fast and effective organization and representation of data a major problem. Second, the scattered way in which information is published and propagated makes discovering hot topics at different granularities from a global perspective a major obstacle for automatic search engines. Finally, a "fast-food" attitude toward information consumption means that lengthy content is easily ignored, and finding the most valuable, personalized information in massive data is a difficulty every Internet user faces. This dissertation addresses each of these challenges in turn. The main contributions are:

1. A feature fusion algorithm that exploits the complementary strengths of different image features, grounded in a data-driven comparison of those features. Image feature representation is the basis of image semantic understanding. To address the problem of data organization and representation, the dissertation selects several of the most representative current image features, compares and analyzes them in detail on large-scale datasets, and obtains a series of observations about image feature extraction. Building on these observations, it designs a fusion algorithm in which different features compensate for one another's weaknesses. Experiments show that the fusion algorithm performs significantly better than a single feature.

2. A bidirectional semantic association model for social multimedia data. The associations among multimedia information in social networks are complex and diverse. To better fit the data distribution and to address hot topic discovery, the dissertation proposes a bidirectional semantic association model for multimodal microblog data. The model adapts flexibly to the varied multimodal associations present in the data, and experiments show that it performs well in related applications.

3. A social-attribute-aware video summarization method. Compared with traditional summarization of films or sports videos, video summarization in social networks must cope with uncertain content and strong demands for personalization. The dissertation proposes a social-attribute-aware method that combines content importance with personalized interest. The model outputs a video storyboard containing the personalized video information the user is interested in. Experiments show that the algorithm both captures the important content of a video and satisfies the user's personalized interests.
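The abstract does not say how content importance and personalized interest are combined in the video summarization method. As a rough illustration only, the sketch below assumes each video segment already has two precomputed scores in [0, 1] and ranks segments by a simple weighted sum. The `Segment` class, the `build_storyboard` function, the `alpha` weight, and the sample scores are all hypothetical and are not taken from the dissertation.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    """One candidate video segment with two hypothetical, precomputed scores."""
    index: int          # position of the segment in the video
    importance: float   # content importance in [0, 1] (assumed given)
    interest: float     # match with the user's interests in [0, 1] (assumed given)


def build_storyboard(segments, k, alpha=0.5):
    """Pick the k highest-scoring segments and return them in temporal order.

    The score is an assumed weighted sum:
        alpha * importance + (1 - alpha) * interest
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must be in [0, 1]")
    ranked = sorted(
        segments,
        key=lambda s: alpha * s.importance + (1.0 - alpha) * s.interest,
        reverse=True,
    )
    # Restore temporal order so the storyboard reads like the original video.
    return sorted(ranked[:k], key=lambda s: s.index)


# Made-up scores, for illustration only.
segments = [
    Segment(0, importance=0.9, interest=0.1),
    Segment(1, importance=0.2, interest=0.7),
    Segment(2, importance=0.6, interest=0.6),
    Segment(3, importance=0.1, interest=0.2),
]

storyboard = build_storyboard(segments, k=2, alpha=0.5)
print([s.index for s in storyboard])
```

Setting `alpha` to 1.0 reduces this to a purely content-driven summary, while 0.0 keeps only what the user is assumed to care about; a learned or adaptive combination would be needed for anything beyond a toy example.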
[Degree-granting institution]: Tsinghua University
[Degree level]: Doctoral
[Year conferred]: 2013
[Classification number]: TP391.41

[References]