
Research on LBSN Link Prediction Fusing Multi-Dimensional Check-in Information

Published: 2018-10-19 13:58
[Abstract]: With the rapid development of mobile Internet technology, location-based services keep multiplying, and more and more people share geotagged pictures, videos, and text through online social networks, giving rise to location-based social networks (LBSNs). Data mining on social networks is also known as link mining. The prediction of friendship links in LBSNs studied in this thesis is a branch of link mining and a current research focus. Mining the large volume of spatio-temporal check-in information that LBSNs provide opens a new direction for link prediction research. However, LBSN users' check-ins are sparsely distributed and existing analyses consider only a single dimension, which makes it difficult to improve prediction performance. To address these problems, this thesis mines the user similarity features contained in check-in information along four dimensions (user, time, location, and location semantics) and combines these features through supervised learning to predict links. Simulation experiments on real network datasets show that the proposed method significantly improves link prediction performance. The research was supported by the National Natural Science Foundation of China (Nos. 61172072 and 61271308), the Beijing Natural Science Foundation (No. 4112045), and the Specialized Research Fund for the Doctoral Program of Higher Education (No. 20100009110002). The main work and contributions of the thesis are as follows:
(1) The distribution of check-in behavior in LBSN datasets is analyzed along three dimensions: user, location, and time. The analysis shows that LBSN users' check-ins are sparsely distributed, which makes it difficult to exploit check-in information fully.
(2) To address the sparse distribution of check-in locations, a hierarchical clustering algorithm is used to cluster the check-in locations, the concept of a generalized location is introduced, and a generalized location relationship network is built on it. This greatly reduces the number of isolated nodes in the network and retains as many of the network's users as possible. To address the sparsity of users' check-ins in the time dimension, the similarity of a single user's check-in behavior at different times is used to correct the similarity of two users' check-in behavior at different times, so that check-in time information is fully used.
(3) A UTP model is proposed to mine user similarity features in the spatio-temporal dimension, and two features are derived from it: a similarity feature that combines user and location, and a similarity feature based on check-in time. Validation on real network datasets shows that both features effectively distinguish friend pairs from non-friend pairs.
(4) User similarity features based on location semantics are mined from the location-semantic dimension. Following the idea of LDA document topic modeling, location topics are modeled from the semantic POI information of all users' check-ins, and a user similarity feature based on check-in location semantics is proposed. Validation on real network datasets shows that this feature effectively distinguishes friend pairs from non-friend pairs.
(5) LBSN network structure information, check-in location information, and location semantic information are fused into a multi-dimensional similarity feature vector, and a supervised strategy is used for link prediction. Experiments on real network datasets show that, compared with traditional link prediction algorithms, the proposed algorithm based on multi-dimensional information significantly improves LBSN link prediction performance.
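Contribution (2) can be illustrated with a small sketch. The abstract does not state the linkage criterion, the distance measure, or the cut-off used in the thesis, so everything below is an assumption chosen for illustration: single-linkage agglomerative clustering cut at a hypothetical `threshold`, planar Euclidean distance, and a Jaccard overlap over generalized locations. The names `cluster_locations` and `generalized_jaccard` and the toy data are hypothetical.

```python
import math
from itertools import combinations


def cluster_locations(coords, threshold):
    """Group locations by single-linkage clustering cut at `threshold`.

    coords: dict mapping location id -> (x, y) in planar units (assumed).
    Returns a dict mapping location id -> generalized-location label.
    """
    parent = {loc: loc for loc in coords}

    def find(loc):
        # Follow parent pointers to the cluster root, compressing the path.
        while parent[loc] != loc:
            parent[loc] = parent[parent[loc]]
            loc = parent[loc]
        return loc

    # Merging every pair closer than the threshold yields the same clusters
    # as cutting a single-linkage dendrogram at that height.
    for a, b in combinations(coords, 2):
        if math.dist(coords[a], coords[b]) <= threshold:
            parent[find(a)] = find(b)
    return {loc: find(loc) for loc in coords}


def generalized_jaccard(checkins, labels, u, v):
    """Jaccard overlap of the generalized locations visited by users u and v."""
    gu = {labels[loc] for loc in checkins[u]}
    gv = {labels[loc] for loc in checkins[v]}
    union = gu | gv
    return len(gu & gv) / len(union) if union else 0.0


# Toy data (hypothetical): two nearby cafes and one distant gym.
coords = {"cafe_a": (0.0, 0.0), "cafe_b": (0.0, 0.3), "gym": (5.0, 5.0)}
checkins = {"alice": ["cafe_a"], "bob": ["cafe_b"], "carol": ["gym"]}

labels = cluster_locations(coords, threshold=0.5)
sim_alice_bob = generalized_jaccard(checkins, labels, "alice", "bob")
sim_alice_carol = generalized_jaccard(checkins, labels, "alice", "carol")
```

In this toy case alice and bob never check in at the same exact place, so a similarity over raw locations would leave them unconnected. Once the two cafes fall into one generalized location, the pair gains a shared location, which is the effect on isolated nodes that the abstract describes.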
[Degree-granting institution]: Beijing Jiaotong University
[Degree level]: Master's
[Year degree conferred]: 2017
[Classification number]: TP393.09; TP311.13
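The supervised fusion step in contribution (5) of the abstract can be sketched as follows. The abstract does not name the classifier, so logistic regression trained by stochastic gradient descent is an assumption made here for illustration only. The three-element feature layout (structural, location, and semantic similarity), the function names, and all numeric values are likewise hypothetical.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train_logistic(features, labels, lr=0.5, epochs=2000):
    """Fit logistic regression by stochastic gradient descent.

    features: list of similarity vectors, one per user pair.
    labels: 1 if the pair are friends, 0 otherwise.
    """
    weights = [0.0] * len(features[0])
    bias = 0.0
    for _ in range(epochs):
        for x, y in zip(features, labels):
            p = sigmoid(bias + sum(w * xi for w, xi in zip(weights, x)))
            err = p - y  # gradient of the log loss w.r.t. the linear score
            weights = [w - lr * err * xi for w, xi in zip(weights, x)]
            bias -= lr * err
    return weights, bias


def predict_link(weights, bias, x):
    """Estimated probability that the pair described by x is linked."""
    return sigmoid(bias + sum(w * xi for w, xi in zip(weights, x)))


# Hypothetical training pairs: [structural, location, semantic] similarity.
train_x = [
    [0.8, 0.7, 0.9],  # friends
    [0.6, 0.9, 0.7],  # friends
    [0.1, 0.2, 0.1],  # non-friends
    [0.2, 0.0, 0.3],  # non-friends
]
train_y = [1, 1, 0, 0]

weights, bias = train_logistic(train_x, train_y)
p_similar = predict_link(weights, bias, [0.7, 0.8, 0.8])
p_dissimilar = predict_link(weights, bias, [0.15, 0.1, 0.2])
```

Each similarity dimension becomes one coordinate of the feature vector, so the classifier learns how much weight each dimension deserves instead of relying on a single hand-picked index.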



Article ID: 2281341


Article link: https://www.wllwen.com/guanlilunwen/ydhl/2281341.html

