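Research on a Model for Computing User Relationship Strength in Social Networks
Published: 2018-02-13 00:18
Keywords: social network; user relationship strength; activity domain classification; direct relationship; indirect relationship. Source: Zhejiang Gongshang University, 2017 master's thesis. Document type: degree thesis.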
[Abstract]: With the development of the Internet, social networks have rapidly become popular and now play a vital role in daily life, opening important channels for information dissemination, experience sharing, and everyday communication. The strength of relationships between social network users has therefore drawn considerable attention from researchers. Social networks are increasingly important in many areas, such as friend recommendation, product recommendation, and link prediction. For personalized service recommendation, the relationship strength between users is an important basis for recommending: a target user's preferences tend to be closer to those of users with whom he or she has a strong relationship, and recommendations from close contacts tend to be accepted more readily. The importance of relationship strength between users is therefore self-evident. However, existing methods for computing relationship strength are rather one-sided. Many studies compute relationship strength in a general way without addressing specific situations, and many consider only the direct relationships between users while ignoring indirect relationships, which carry substantial weight, so the accuracy of the results leaves room for improvement. To address these problems, this thesis proposes a model for computing social network user relationship strength that combines activity domain classification with indirect relationships. The main research content covers the following aspects. First, relevant social network data is collected with a crawler and preprocessed (including Chinese word segmentation and stop-word removal), converted into a corresponding document dataset, and cleaned of junk data, which helps improve the accuracy of the results.
Second, the interaction activities of social network users are classified by activity domain. The LDA algorithm is used to cluster the user interaction activity documents, and the normalized Google distance is used to compute the relatedness between each resulting cluster and the activity domain names (work, food, shopping, travel, sports, entertainment), which determines the activity domain of each cluster. A further relatedness computation then determines the activity domain of each interaction activity document. Computing relationship strength together with activity domain classification allows the results to be applied in a more targeted way in other areas; in personalized recommendation, for example, recommendations can be made per domain, which raises the recommendation success rate. Third, multiple influencing factors are considered when computing direct relationship strength. Individual similarity, temporal factors, and interactivity are combined to compute the direct relationship strength between users within each interaction activity domain, so that the factors affecting relationship strength are covered from several angles and direct relationship strength can be computed accurately. Fourth, indirect relationships are fused into the relationship strength computation. Because indirect relationships hold an important place in social relationship networks, the final relationship strength incorporates them. This makes it possible to compute relationship strength for users who have only an indirect relationship and no direct one, and it also improves the accuracy of the computation. Fifth, evaluation metrics for measuring the accuracy of the relationship strength results are proposed.
The proposed activity domain classification method is compared with classification methods based on the document level, the cluster level, and microblog sessions to evaluate its efficiency. Using precision, recall, and normalized discounted cumulative gain (NDCG), a standard measure of search engine quality, as evaluation metrics, the proposed relationship strength computation method is compared with the linear combination method and the general framework model method. The experimental results show that the proposed method, which combines activity domain classification with indirect relationships, performs better.
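Two of the measures named in the abstract have standard textbook definitions, and the following is a minimal Python sketch of them, not the thesis's implementation. It assumes the usual normalized Google distance formula over page hit counts and the linear-gain form of NDCG; the function names, the hit counts, and the index size are hypothetical values chosen only for illustration.

```python
import math


def normalized_google_distance(fx, fy, fxy, n):
    """Normalized Google distance between two terms.

    fx, fy: number of pages containing each term
    fxy:    number of pages containing both terms
    n:      total number of indexed pages (must exceed fx and fy)
    """
    if fx <= 0 or fy <= 0 or fxy <= 0:
        # Terms that never co-occur are treated as maximally distant.
        return float("inf")
    log_fx, log_fy, log_fxy = math.log(fx), math.log(fy), math.log(fxy)
    denominator = math.log(n) - min(log_fx, log_fy)
    if denominator <= 0:
        raise ValueError("n must be larger than the term hit counts")
    return (max(log_fx, log_fy) - log_fxy) / denominator


def dcg(relevances, k):
    """Discounted cumulative gain of the first k items (linear gain)."""
    return sum(rel / math.log2(i + 2) for i, rel in enumerate(relevances[:k]))


def ndcg(relevances, k):
    """DCG of the given ranking divided by the DCG of the ideal ranking."""
    ideal = dcg(sorted(relevances, reverse=True), k)
    return dcg(relevances, k) / ideal if ideal > 0 else 0.0


if __name__ == "__main__":
    # Hypothetical hit counts for one cluster keyword against two domain names:
    # (hits for keyword, hits for domain name, hits for both).
    index_size = 1_000_000
    hits = {"travel": (1000, 100, 10), "work": (1000, 5000, 2)}
    distances = {
        domain: normalized_google_distance(fx, fy, fxy, index_size)
        for domain, (fx, fy, fxy) in hits.items()
    }
    # The smallest distance marks the most closely related domain.
    print(min(distances, key=distances.get), distances)

    # Hypothetical graded relevance of a ranked list of related users.
    print(ndcg([1, 3, 2], k=3))
```

A smaller distance means a cluster keyword is more closely related to a domain name, so a cluster would be assigned to the domain with the lowest distance. How the thesis aggregates distances over the keywords of a cluster is not stated in the abstract and is left out here.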
[Degree-granting institution]: Zhejiang Gongshang University
[Degree level]: Master's
[Year degree awarded]: 2017
[Classification number]: TP393.09
Article ID: 1506894
Article link: https://www.wllwen.com/kejilunwen/sousuoyinqinglunwen/1506894.html