Detection Methods for Malicious User Groups in Online Financial Forums and Their Application
Published: 2019-06-14 07:03
[Abstract]: In recent years, the rapid growth of the Internet has driven advances in information technology and network communication technology. As social life has become highly information-based, networks now carry valuable data, and social media platforms with massive user bases are widely used by organizations and individuals to support decision-making. Online financial forums have huge user bases and considerable commercial potential, which leads to false opinions and spam being widely produced and spread; the source of this harm is malicious user groups. To address this problem, we use web information extraction, data storage, sentiment analysis, network relationship modeling, and overlapping community detection to collect user behavior data from online financial forums, build a user relationship network, partition that network into communities, detect malicious user groups, and evaluate the detection results. The main work of this thesis is as follows:
1. By studying the pages of online financial forum websites, we analyze forum user behavior, collect forum page information with web information extraction techniques, match the user behavior data required for the experiments, and store it in a local MySQL relational database.
2. Using machine learning, we perform word segmentation and feature selection on the training set, choose a suitable sentiment classifier, and predict the sentiment class of user comments. Based on the predicted classes, we build a network relationship model of user behavior and describe the global statistical properties of the users' similar-sentiment network, finding that it exhibits both the "small-world" property and the scale-free property.
3. Considering the influence of node attributes on the data structure, we combine node topology with node attribute information and propose an overlapping community detection algorithm based on both. We apply it to the user relationship network of an online financial forum and to three social network datasets from Stanford University, and compare it with common community detection algorithms, which verifies the feasibility and effectiveness of the proposed algorithm.
4. We propose corresponding external indicators for community detection, combine these indicators to detect malicious user groups in stock forums, and analyze specific cases.
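The "small-world" and scale-free claims in the abstract rest on a few global statistics of the similar-sentiment network. The following is a minimal sketch of how such statistics could be computed, not the thesis's actual procedure: the edge rule (link two users who gave the same sentiment label to at least `min_shared` common threads), the toy data, and all function names are assumptions made for illustration. A network is usually called small-world when its average clustering coefficient is high while its average shortest-path length stays short, and scale-free when its degree distribution is heavy-tailed.

```python
from collections import deque
from itertools import combinations


def build_similar_sentiment_network(labels, min_shared=2):
    """Hypothetical edge rule: connect two users when they assigned the same
    sentiment label to at least `min_shared` common threads.
    `labels` maps user -> {thread_id: sentiment_label}."""
    graph = {user: set() for user in labels}
    for u, v in combinations(labels, 2):
        shared = sum(
            1 for thread, label in labels[u].items()
            if labels[v].get(thread) == label
        )
        if shared >= min_shared:
            graph[u].add(v)
            graph[v].add(u)
    return graph


def average_clustering(graph):
    """Mean local clustering coefficient; nodes with degree < 2 count as 0."""
    total = 0.0
    for nbrs in graph.values():
        k = len(nbrs)
        if k < 2:
            continue
        links = sum(1 for a, b in combinations(nbrs, 2) if b in graph[a])
        total += 2.0 * links / (k * (k - 1))
    return total / len(graph) if graph else 0.0


def average_path_length(graph):
    """Mean shortest-path length over all connected ordered pairs (BFS)."""
    total, pairs = 0, 0
    for source in graph:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            node = queue.popleft()
            for nbr in graph[node]:
                if nbr not in dist:
                    dist[nbr] = dist[node] + 1
                    queue.append(nbr)
        total += sum(dist.values())
        pairs += len(dist) - 1
    return total / pairs if pairs else 0.0


def degree_histogram(graph):
    """Map degree -> number of nodes with that degree."""
    hist = {}
    for nbrs in graph.values():
        hist[len(nbrs)] = hist.get(len(nbrs), 0) + 1
    return hist


# Toy, made-up predictions: user -> {thread: predicted sentiment}.
labels = {
    "u1": {"t1": "pos", "t2": "pos", "t3": "neg"},
    "u2": {"t1": "pos", "t2": "pos"},
    "u3": {"t1": "pos", "t2": "pos", "t3": "neg"},
    "u4": {"t2": "pos", "t3": "neg", "t4": "neg"},
    "u5": {"t3": "neg", "t4": "neg"},
}
graph = build_similar_sentiment_network(labels, min_shared=2)

print("average clustering:", round(average_clustering(graph), 3))
print("average path length:", average_path_length(graph))
print("degree histogram:", degree_histogram(graph))
```

On a real forum network, these values would normally be compared against a random graph with the same number of nodes and edges, and the degree histogram inspected on a log-log scale, before drawing either conclusion.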
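[Degree-granting institution]: Nanjing University of Finance and Economics
[Degree level]: Master's
[Year degree awarded]: 2016
[Classification number]: TP393.092;TP391.1
Article ID: 2499210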
Article link: https://www.wllwen.com/guanlilunwen/ydhl/2499210.html