Research on User Preference Modeling in Social Media

Published: 2018-08-01 14:44

[Abstract]: With the development of Web 2.0, people have grown used to posting their own views online and to finding the information they need in what others publish, which has produced an Internet model led by its users. Under this model people depend on the Internet more and more: first for looking up information, later for chat communities of all kinds, and now for reading other people's reviews before deciding what to eat, wear, rent or book. Social media is the medium for this behavior. It covers virtual communities and online platforms where people create, share and exchange opinions, views and experiences, and mainly includes microblogs, blogs, forums, online communities and review sites. The views people post on social media usually carry a sentiment preference, and they fall roughly into two categories: textual information, such as the content of a microblog post, and rating information, such as a movie score. User preference refers to the feelings a user holds toward an event or item, such as liking or disliking it. User preference research studies this sentiment-rich information in order to understand the preferences users want to express. This thesis studies user preference in social media from two angles: aspect rating prediction and sentiment analysis of Tang dynasty poetry.

An aspect rating scores one fine-grained aspect of a product, whereas the overall rating is a composite score across all aspects. Most existing work that uses overall ratings assumes that the overall rating is the average of the aspect ratings, or at least very close to them. Analysis of real datasets shows, however, that there is a rating bias between the overall rating and the aspect ratings, and existing work does not account for it. This thesis studies the aspect rating prediction problem with rating bias for the first time and proposes RCMB, a new sentiment-topic mixture model. RCMB places the overall rating at the center of the probabilistic graph and incorporates prior information about the rating bias through a latent aspect rating variable. Experiments on real datasets (Dianping and TripAdvisor) show that RCMB achieves higher prediction accuracy than existing methods and better preserves the relative order of the ratings.

Existing sentiment analysis work generally focuses on modern text, such as product reviews and microblogs, and rarely addresses ancient literary works. Yet poetry served much the same role for ancient writers as microblogs do today and was an important medium for expressing their feelings. This thesis proposes TL-PCO, a transfer-learning-based sentiment classification model for Chinese Tang dynasty poetry; analyzing the sentiment of the poems helps reveal the social and cultural developments of the time. TL-PCO derives two kinds of features through two transfer learning functions, adds the features of the ancient poems themselves, builds three classifiers and takes a vote among them to produce the final result. Experiments on Chinese Tang poems demonstrate the effectiveness of the method. The thesis also analyzes in detail the sentiment of each period and each major school of the Tang dynasty and, combined with an analysis of the social history, obtains good results.
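The two quantitative ideas in the abstract, the gap between an overall rating and the average of its aspect ratings, and the vote among three classifiers, can be illustrated with a minimal Python sketch. This is not the thesis's RCMB or TL-PCO implementation: the function names `rating_bias` and `majority_vote`, the definition of bias as "overall minus mean aspect rating", the tie-breaking rule and all sample values are assumptions made for illustration only.

```python
from collections import Counter
from statistics import mean


def rating_bias(overall, aspect_ratings):
    """Overall rating minus the mean of the aspect ratings.

    Assumed definition for illustration; a value of 0 corresponds to the
    common assumption that the overall rating is the aspect average.
    """
    return overall - mean(aspect_ratings)


def majority_vote(labels):
    """Return the label chosen by the most classifiers.

    Assumption: on a tie, the label that appears first in `labels` wins
    (Counter.most_common keeps first-encountered order for equal counts).
    """
    return Counter(labels).most_common(1)[0][0]


# Hypothetical review: overall 5 stars, three aspect ratings of 4 stars each.
bias = rating_bias(5, [4, 4, 4])
print("rating bias:", bias)

# Hypothetical outputs of three sentiment classifiers for one poem.
label = majority_vote(["positive", "negative", "positive"])
print("voted label:", label)
```

A nonzero bias on real reviews is the kind of evidence the abstract cites against the averaging assumption; how RCMB actually models that bias as a prior is not specified in the abstract and is not reproduced here.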
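[Degree-granting institution]: Beijing University of Posts and Telecommunications
[Degree level]: Master's
[Year degree conferred]: 2016
[Classification number]: TP391.3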
