A Logistic-Regression-Based Method for Evaluating the Credibility of Weibo Users

Published: 2018-08-24 16:07

[Abstract]: Research on the credibility of Weibo users has gradually become one of the hot topics in current Weibo research. Its aim is to evaluate the identity category of Weibo users objectively and reasonably, and to identify fake users on Weibo effectively. However, most existing identification methods stop at detecting traditional fake users ("zombie fans"). They are simple and single-purpose, distinguish new intelligent fake users poorly, and cannot evaluate the identity of Weibo users reasonably. Logistic regression is a commonly used statistical analysis method that can be applied to classification; it yields probabilistic predictions and is therefore suitable for predicting the identity category of Weibo users. To address the shortcomings of existing methods, this thesis first analyzes the behavioral characteristics of fake Weibo users. Working from online duration, posting time, motivation for using Weibo, post source, and interaction behavior, it combines the inherent attributes of Weibo users logically and extracts feature variables for distinguishing user categories. It then applies the logistic regression algorithm to propose WUREM, a logistic-regression-based model for evaluating the credibility of Weibo users. Experiments use Sina Weibo as the research platform and verify the effectiveness and reasonableness of the model. The results show that the proposed model can evaluate and classify a user's identity fairly objectively and reasonably according to the user's confidence value CM, which overcomes the limitations and unreasonableness of the traditional approach of directly sorting users into two classes. The model not only identifies traditional low-level fake users ("zombie fans") but also achieves a relatively high recognition rate for new intelligent fake users. In addition, the results can serve as a necessary reference for research on Weibo user influence, Weibo information credibility, Weibo topic popularity, and related areas.
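
The idea of scoring a user with a probability rather than a hard two-class label can be illustrated with a minimal sketch. This is not the WUREM model itself: the three features (`interaction_ratio`, `original_post_ratio`, `night_post_ratio`), the toy training data, and the grading thresholds are all hypothetical, and treating the sigmoid output as the confidence value CM is an assumption made purely for illustration.

```python
import math


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def train_logistic(X, y, lr=0.5, epochs=2000):
    # Plain batch gradient descent on the mean log-loss.
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            p = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b)
            err = p - yi
            for j in range(d):
                grad_w[j] += err * xi[j]
            grad_b += err
        w = [wj - lr * g / n for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def confidence(w, b, x):
    # Assumption: the predicted probability of "genuine" plays the role of CM.
    return sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b)


def grade(cm, low=0.3, high=0.7):
    # Hypothetical thresholds; a graded verdict instead of a hard 0/1 label.
    if cm >= high:
        return "likely genuine"
    if cm <= low:
        return "likely fake"
    return "uncertain"


# Hypothetical features per user, each scaled to [0, 1]:
# [interaction_ratio, original_post_ratio, night_post_ratio]
X_train = [
    [0.80, 0.70, 0.10], [0.60, 0.90, 0.20], [0.70, 0.60, 0.15],  # genuine
    [0.10, 0.05, 0.60], [0.05, 0.10, 0.70], [0.20, 0.00, 0.50],  # fake
]
y_train = [1, 1, 1, 0, 0, 0]

if __name__ == "__main__":
    w, b = train_logistic(X_train, y_train)
    for user in ([0.75, 0.80, 0.10], [0.10, 0.05, 0.65]):
        cm = confidence(w, b, user)
        print(user, round(cm, 3), grade(cm))
```

Keeping the continuous score and grading it afterwards leaves room for an "uncertain" band, which is one way to read the abstract's point that a direct two-class split is too coarse.
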
[Degree-granting institution]: Hebei University
[Degree level]: Master's
[Year degree conferred]: 2015
[Classification number]: TP393.092
