Identifying Weibo Advertisement Publishers Based on Cluster Analysis

Published: 2018-10-14 18:40
[Abstract]: Weibo hosts a large volume of advertising content, which seriously degrades the experience of ordinary users and interferes with related research. Existing studies mostly use classification algorithms such as support vector machines (SVM) or random forests to handle advertising microblogs; however, classification methods require manually labeling large training sets, which is difficult. This paper therefore proposes a method for identifying Weibo advertisement publishers based on cluster analysis. At the user level, to address the phenomenon of advertisement publishers diluting their advertising content by posting large numbers of ordinary microblogs, the concept of the core microblog is introduced. The method extracts the core microblog topic and its corresponding microblog sequence, computes the user features and the text features of the corresponding microblogs, and clusters these features with a clustering algorithm to identify advertisement publishers. Experimental results show that the proposed method achieves an accuracy of 92%, a recall of 97%, and an F-value of 95%, demonstrating that it can accurately identify Weibo advertisement publishers even when the advertising content has been deliberately diluted, and that it can provide theoretical support and a practical method for tasks such as identifying and cleaning up Weibo spam.
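The pipeline outlined in the abstract (extract core microblogs, compute features, cluster users) can be illustrated with a minimal Python sketch. Everything concrete here is an assumption made for illustration and not taken from the paper: the pre-assigned topic labels, the "most frequent topic" stand-in for core-microblog extraction, the `AD_WORDS` keyword list, the two features, and the choice of a small 2-means clustering are all hypothetical.

```python
import math
from collections import Counter
from typing import Dict, List, Tuple

# Hypothetical promotional keywords; the paper's real text features are not given.
AD_WORDS = ("discount", "buy now", "free shipping")

Post = Tuple[str, str]  # (topic label, text); topic labels are assumed to be given


def core_posts(posts: List[Post]) -> List[str]:
    """Stand-in for core-microblog extraction: keep the most frequent topic."""
    top_topic, _ = Counter(topic for topic, _ in posts).most_common(1)[0]
    return [text for topic, text in posts if topic == top_topic]


def features(posts: List[Post]) -> Tuple[float, float]:
    """Return (share of core posts with ad keywords, share with a link)."""
    core = core_posts(posts)
    ad = sum(any(w in t.lower() for w in AD_WORDS) for t in core) / len(core)
    link = sum("http" in t for t in core) / len(core)
    return (ad, link)


def kmeans(points: List[Tuple[float, ...]], k: int = 2, iters: int = 20):
    """Plain k-means with deterministic farthest-point initialisation."""
    centers = [points[0]]
    while len(centers) < k:
        centers.append(
            max(points, key=lambda p: min(math.dist(p, c) for c in centers))
        )
    labels = [0] * len(points)
    for _ in range(iters):
        # Assign each point to its nearest center.
        labels = [
            min(range(k), key=lambda j: math.dist(p, centers[j])) for p in points
        ]
        # Recompute each center as the mean of its members.
        new_centers = []
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                new_centers.append(
                    tuple(sum(col) / len(members) for col in zip(*members))
                )
            else:
                new_centers.append(centers[j])
        if new_centers == centers:
            break
        centers = new_centers
    return labels, centers


def find_advertisers(users: Dict[str, List[Post]]) -> List[str]:
    """Cluster users into two groups; report the group with more ad keywords."""
    names = list(users)
    labels, centers = kmeans([features(users[n]) for n in names], k=2)
    ad_cluster = max(range(2), key=lambda j: centers[j][0])
    return [n for n, lab in zip(names, labels) if lab == ad_cluster]


# Toy data: the two shops dilute their ads with a few ordinary posts.
USERS: Dict[str, List[Post]] = {
    "shop_a": [("sale", "Big discount, buy now http://x.example")] * 3
    + [("life", "Nice weather today"), ("food", "Had noodles")],
    "shop_b": [("promo", "Free shipping on all shoes http://y.example")] * 4
    + [("life", "Good morning"), ("travel", "Visited the lake")],
    "alice": [("life", "Good morning")] * 2 + [("food", "Had noodles")],
    "bob": [("travel", "Visited the lake")] * 3 + [("life", "Weekend plans")],
}
```

On this toy data, `find_advertisers(USERS)` returns `["shop_a", "shop_b"]`. Because the features are computed only over each user's dominant topic, the ordinary posts that the shops mix in do not lower their ad-keyword share, which is the intuition behind focusing on core microblogs. No labeled training set is needed, though deciding which cluster is the advertising one still relies on a heuristic (here, the higher mean keyword share).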
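[Author affiliation]: Software Institute, Nanjing University
[Funding]: Jiangsu Province Industry-University-Research Prospective Joint Research Project (BY2015069-03)
[CLC number]: TP391.1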


