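Research on a Domain-Based Method for Evaluating Microblog User Influence
Published: 2018-06-04 12:43
Topic: microblog + domain classification; Source: master's thesis, Southwest University (西南大学), 2014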
[Abstract]: Since its emergence, the microblog has won over a large number of Internet users through its strong interactivity, fast propagation, and concise content, and it is now a popular form of social network. As a widely used information carrier and transmission medium, it already hosts a large volume of circulating information and many active users. Users publish a great deal of content spanning many industries and domains, and large numbers of followers comment on and repost that content, which gives it considerable influence in each industry. Evaluating microblog user influence soundly and effectively can yield substantial social benefits: information diffusion, product recommendation, and publicity all become far more efficient, which matters greatly for commercial marketing. It is therefore important to consider users' participation in each domain comprehensively and to compute their influence per domain. Many researchers in China and abroad have already studied microblog user influence.
Microblogging originated abroad with Twitter, but Twitter differs from Chinese microblog services in that it has no comment function. Traditional methods for evaluating microblog user influence mainly target Twitter: although they consider parameters such as a user's follower count, post count, follower quality, repost count, and mention count, they ignore the comment function and are therefore limited. Social influence, as commonly understood, is influence within a particular domain, and each user's influence differs from one domain to another, so evaluating influence per domain is also important. Traditional research, however, evaluates user influence only in general terms. It overlooks both the cross-domain nature of microblog users and the domain overlap of microblog posts, and it does not evaluate a user's influence in different domains.
To address these problems, this thesis proposes a domain-based method for evaluating microblog user influence. The method consists mainly of a KNN-based domain classification algorithm and a microblog user influence algorithm. The main work and contributions are as follows:
First, because traditional research ignores the cross-domain nature of microblog users and the domain overlap of microblog posts, this thesis applies a KNN-based domain classification algorithm. A user usually has interests in several domains, so the posts that user publishes span different domains. In addition, the domain boundary of a single post is often unclear: a post may belong to both domain A and domain B. These two phenomena are, respectively, the cross-domain nature of users and the domain overlap of posts. To take both into account, the algorithm refers to the class labels of a microblog text corpus and classifies posts into 21 domains according to the text of each post, which yields each user's posts in each domain and the user's total post count.
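The classification step described above can be illustrated with a minimal sketch. It is not the thesis's implementation: the whitespace tokenizer, the cosine similarity measure, the value of `k`, the two-domain toy corpus, and all function names are assumptions made for illustration (the thesis uses 21 domains, and Chinese text would need a word segmenter).

```python
import math
from collections import Counter

def vectorize(text):
    """Bag-of-words term counts; whitespace tokenization is a simplification."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two term-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0

def knn_domain(text, corpus, k=3):
    """Return the majority label among the k labeled posts most similar to text."""
    query = vectorize(text)
    ranked = sorted(corpus,
                    key=lambda item: cosine(query, vectorize(item[0])),
                    reverse=True)
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]

def posts_per_domain(posts, corpus, k=3):
    """Count one user's posts in each domain after classifying every post."""
    return Counter(knn_domain(p, corpus, k) for p in posts)

# Hypothetical labeled corpus with two domains instead of the thesis's 21.
CORPUS = [
    ("stock market shares rise", "finance"),
    ("bank interest rate cut", "finance"),
    ("central bank stock policy", "finance"),
    ("football match final score", "sports"),
    ("basketball team wins match", "sports"),
    ("olympic team gold medal", "sports"),
]

user_posts = ["stock shares fall", "team wins football final"]
counts = posts_per_domain(user_posts, CORPUS)
```

Here `counts` holds the user's per-domain post counts, and `sum(counts.values())` gives the total post count that the influence calculation would consume.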
Second, because the influence indicators used in traditional research are too simple, this thesis adds further indicators and proposes an algorithm for computing microblog user influence. Traditional research measures influence mainly by post count, follower count, repost count, and mention count. Microblog user influence is, in essence, interaction between users, and that interaction is reflected not only in the traditional parameters but also in the number of comments a user receives, the user's total online time, and the user's registration time. The proposed algorithm therefore takes comments, online time, registration time, and similar parameters into account.
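The abstract does not give the influence formula, so the following is only a sketch of how the extended indicator set could be combined. The log-scaled weighted sum, the equal weights, and every name in the code are hypothetical and are not the thesis's algorithm.

```python
import math
from dataclasses import dataclass

@dataclass
class DomainStats:
    """One user's indicators in a single domain (field names are hypothetical)."""
    posts: int
    followers: int
    reposts: int
    mentions: int
    comments: int          # added indicator: comments received
    online_hours: float    # added indicator: total online time
    days_registered: int   # added indicator: time since registration

# Indicators used by traditional methods, per the abstract.
TRADITIONAL = ("posts", "followers", "reposts", "mentions")
# Indicators the thesis adds.
ADDED = ("comments", "online_hours", "days_registered")

def score(stats, fields):
    """Equal-weight sum of log-scaled indicators (an assumed, illustrative form)."""
    return sum(math.log1p(getattr(stats, name)) for name in fields)

def traditional_influence(stats):
    return score(stats, TRADITIONAL)

def extended_influence(stats):
    return score(stats, TRADITIONAL + ADDED)

# Two users identical on the traditional indicators but not on comments received.
quiet = DomainStats(100, 5000, 300, 40, 10, 200.0, 900)
discussed = DomainStats(100, 5000, 300, 40, 2500, 200.0, 900)
```

Under these assumptions `traditional_influence` cannot separate the two users, while `extended_influence` ranks `discussed` higher, which is the kind of difference the added indicators are meant to capture.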
Third, the thesis presents an experimental analysis. The traditional method and the proposed method are each used to compute microblog users' influence in each domain, and the two sets of results are compared and analyzed. The experiments show that the proposed domain-based evaluation method is more practical and more reasonable.
This research can effectively evaluate users' influence in each domain. It benefits commercial publicity and is significant for the application and development of microblogging.
[Degree-granting institution]: Southwest University (西南大学)
[Degree level]: Master's
[Year degree granted]: 2014
[Classification number]: TP393.092
Article ID: 1977395
Article link: https://www.wllwen.com/guanlilunwen/ydhl/1977395.html