A Core-Tag-Based Method for Partitioning Overlapping Weibo Network Communities
Published: 2018-12-26 07:51
[Abstract]: Traditional Weibo community discovery algorithms suffer from low cohesion and uncontrollable overlap. To address this, the paper proposes Tag Cut, a top-down overlapping community discovery strategy for Weibo based on core tags. Tags are first weighted using the co-occurrence relationships among user tags and the inverse user frequency, and each user's tag set is expanded based on the internal and external links between tags. Next, the users who carry a given tag are extracted from the whole community as a temporary group, and an evaluation function scores the quality of the resulting partition. Finally, the most suitable core tag is selected, and the distance between its group and the other groups determines whether that group becomes a new group or is merged into an existing one. This procedure is iterated until the requirements are met. The resulting partition consists of several groups, each with a core tag, and the results are refined by jointly considering Weibo users' declared and implicit interests, the following patterns among users, and the practical usefulness of the result. Experiments on real data show that the method yields high cohesion and controllable community overlap, and that the resulting communities are practically meaningful.
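The abstract does not give the weighting formula or the evaluation function, so the following is only a minimal sketch of two of the steps it names: weighting tags by inverse user frequency and extracting the users who carry a tag as a temporary group. The IDF-style formula log(N / n_t), the function names `inverse_user_frequency` and `temporary_group`, and the toy data are all assumptions made for illustration, not the authors' definitions. Co-occurrence weighting, tag expansion, the evaluation function, and the split-or-merge decision are omitted.

```python
import math
from collections import Counter


def inverse_user_frequency(user_tags):
    """Weight each tag by log(N / n_t).

    N is the number of users and n_t the number of users declaring tag t.
    This IDF-style form is an assumption; the paper's formula may differ.
    """
    n_users = len(user_tags)
    counts = Counter(tag for tags in user_tags.values() for tag in set(tags))
    return {tag: math.log(n_users / n) for tag, n in counts.items()}


def temporary_group(user_tags, tag):
    """Return the users whose tag set contains `tag` (a candidate group)."""
    return {user for user, tags in user_tags.items() if tag in tags}


# Toy data, invented for illustration only.
user_tags = {
    "u1": {"music", "travel"},
    "u2": {"music", "food"},
    "u3": {"music"},
    "u4": {"travel", "photography"},
}

weights = inverse_user_frequency(user_tags)
group = temporary_group(user_tags, "travel")
```

In this toy data, user `u1` belongs to the candidate groups for both `music` and `travel`, which is how tag-based grouping can produce overlapping communities: a user with several tags can appear in several groups.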
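[Affiliations]: College of Computer Science and Engineering, Northwest Normal University; Key Laboratory of Intelligent Information Processing, Institute of Computing Technology, Chinese Academy of Sciences; College of Information Science and Technology, Beijing Normal University
[Funding]: National Natural Science Foundation of China (No. 61363058, No. 61163039); Gansu Province Youth Science and Technology Fund (No. 145RJYA259, No. 1606RJYA269); Gansu Province Natural Science Research Fund (No. 145RJZA232); Open Fund of the Key Laboratory of Intelligent Information Processing, Institute of Computing Technology, Chinese Academy of Sciences (No. IIP2014-4)
[Classification number]: TP393.092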
Article ID: 2391763
Article link: https://www.wllwen.com/guanlilunwen/ydhl/2391763.html