
Research on Spam Processing Algorithms in Social Networks

Published: 2018-03-11 18:09

  Topic: social networks; Focus: spam detection; Source: Shandong Normal University, master's thesis, 2017; Document type: degree thesis


[Abstract]: With the development of Web 2.0, social networks play an increasingly important role in people's lives. Today's mainstream social platforms include Sina Weibo, Baidu Zhidao (Baidu Knows), WeChat, QQ, Yizhibo, Zhihu, and Douban. At the same time, the spread of mobile phones and other communication devices has made it convenient for people to read, share information, and interact online anytime and anywhere. However, this very convenience has given rise to a large number of spam users, who post malicious links, promote false advertisements, slander others, and spread rumors on these platforms. This seriously degrades the user experience and disrupts people's lives, and the negative impact is increasingly pronounced. How to identify and detect these malicious users and block spam content has therefore become a hot research topic. This thesis selects a popular social platform, Sina Weibo, and a knowledge-sharing platform, Baidu Zhidao, and applies machine learning techniques and ranking ideas to handle spam on each, designing a spam detection algorithm for Weibo and a sinking algorithm for implicit spam answers on Baidu Zhidao. The main contents are as follows. First, the thesis introduces the definition and development of social networks and the common spam problems in them, and gives an overview of spam on microblogs and on question-answering sites, including spam categories and processing techniques. Second, for spam on Weibo, it proposes color-based visual extraction of spam behavior features and term-blacklist-based extraction of spam content features, and, on top of these two feature sets, a spam detection algorithm based on Bayesian networks. Experiments show that the Bayesian-network-based algorithm outperforms the naive Bayes algorithm, and also outperforms algorithms that detect only spam behavior or only spam content. Finally, for spam on question-answering sites, spam answers are first divided into explicit and implicit ones; for implicit spam answers, which are hard to classify by technical means, a sinking algorithm is proposed that borrows the idea of falling objects from physics. The results show that the algorithm effectively sinks spam answers to the bottom of the answer sequence.
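As a rough illustration of the term-blacklist idea mentioned in the abstract, the sketch below computes a few blacklist-based content features for a single post. It is not the thesis's actual feature set: the function name `blacklist_features`, the three features, the sample blacklist, and the sample posts are all assumptions made for illustration, and the post is assumed to be already segmented into word terms.

```python
from typing import Dict, Iterable, Set


def blacklist_features(tokens: Iterable[str], blacklist: Set[str]) -> Dict[str, float]:
    """Return simple blacklist-based content features for one post.

    tokens    -- the post, already split into word terms (word segmentation
                 is assumed to happen elsewhere)
    blacklist -- hypothetical set of spam-indicative terms
    """
    tokens = list(tokens)
    hits = [t for t in tokens if t in blacklist]
    total = len(tokens)
    return {
        "hit_count": len(hits),                           # blacklisted terms, with repeats
        "hit_ratio": len(hits) / total if total else 0.0,  # share of the post that is blacklisted
        "distinct_hits": len(set(hits)),                  # different blacklisted terms seen
    }


# Hypothetical blacklist and posts, for illustration only.
BLACKLIST = {"free", "winner", "click", "prize"}
spam_post = ["click", "here", "free", "prize", "free"]
normal_post = ["lunch", "with", "friends", "today"]

spam_features = blacklist_features(spam_post, BLACKLIST)
normal_features = blacklist_features(normal_post, BLACKLIST)
print(spam_features)
print(normal_features)
```

Features of this kind could be combined with behavior features and passed to a classifier; how the thesis builds its Bayesian network over them is not described on this page.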
[Degree-granting institution]: Shandong Normal University
[Degree level]: Master's
[Year degree awarded]: 2017
[Classification numbers]: TP393.092; TP301.6


Article ID: 1599309



Article link: https://www.wllwen.com/kejilunwen/ruanjiangongchenglunwen/1599309.html

