社交网络中基于内积空间坐标系的距离预测算法研究
发布时间:2019-07-04 08:25
【摘要】:随着社交软件的普及,与之相关的社交网络也逐渐成为学术界研究的热点。在对社交网络进行拓扑分析时,计算距离(定义为组成点与点之间最短路径的边的条数)是第一步。目前存在一些经典的最短路径距离求解算法,例如广度优先遍历(BFS)、Dijkstra和Floyd等。这些算法时间复杂度较高,适用于普通网络,但是不适用于社交网络这类数据规模较大的网络。基于图形坐标系的距离预测算法是目前社交网络距离预测中比较常用的算法,它通过预处理获得部分真实距离,然后借助坐标系和部分真实距离来对其余的距离信息进行预测,大大减少了计算过程的时间花销。但是现有方法只考虑了将社交网络建模为无向网络的情况,都只适用于无向网络,在有向网络中会产生较大的误差。本文针对现有方法的不足,开展了进一步的研究,提出了基于内积空间坐标系的社交网络距离预测算法。该算法在整个计算流程中使用坐标系,将社交网络嵌入到坐标系中,社交网络中的点与坐标系中的点一一对应,通过坐标唯一标识,那么求社交网络中任意点对间的距离就等价于求坐标系内对应点对间的距离。网络中节点在坐标系中的坐标计算作为整个算法最重要的部分,采用的是基于离散矩阵分解的坐标计算算法。为克服已有方法无法适用于有向网络的不足,本文提出的坐标计算方法采用奇异值矩阵分解和非负矩阵分解两种矩阵分解技术。每个节点在坐标系中由出坐标和入坐标组成的坐标对来唯一标识,计算点对之间的距离时,由起始节点的出坐标和终止节点入坐标的内积计算得来,这样就克服了距离的不对称性约束。为提高已有算法的精确度、缩短运行时间,本文提出的坐标计算方法借鉴了鲁棒性主成分分析降维去噪的思想,从原距离矩阵中还原出低秩的主要部分来消除误差和离群点,达到降维去噪的效果。除此之外,该算法将离散矩阵分解和坐标计算融合成单优化问题,两个过程是同时完成的,在一定程度上减小了误差。本文在真实的社交软件Facebook、Wiki、LiveJournal、Orkut和Gulps等数据集上对算法进行了仿真,包括功能仿真、影响算法因素仿真和扩展应用仿真,并与已有的方法进行了对比。实验结果表明,本文提出的方法与已有方法相比不仅提高了计算结果的精确性,减小了时间开销,也改善了其无法适用于有向图的不足。
[Abstract]:With the popularity of social software, social networks related to it have gradually become the focus of academic research. When analyzing the topology of social networks, calculating the distance (defined as the number of edges that make up the shortest path between points) is the first step. At present, there are some classical shortest path distance solving algorithms, such as breadth first traversing (BFS), Dijkstra and Floyd. These algorithms have high time complexity and are suitable for ordinary networks, but not for networks with large data scale such as social networks. The distance prediction algorithm based on graphic coordinate system is a common algorithm in the distance prediction of social networks at present. It obtains part of the real distance through preprocessing, and then forecasts the other distance information with the help of coordinate system and part of the real distance, which greatly reduces the time cost of the calculation process. However, the existing methods only consider the modeling of social networks as undirected networks, which are only suitable for undirected networks, and there will be large errors in directed networks. In this paper, aiming at the shortcomings of the existing methods, further research is carried out, and a distance prediction algorithm for social networks based on inner product spatial coordinate system is proposed. In this algorithm, the coordinate system is used in the whole calculation process, and the social network is embedded in the coordinate system. The points in the social network correspond to the points in the coordinate system one by one. Through the unique identification of the coordinates, the distance between any point pairs in the social network is equivalent to the distance between the corresponding points in the coordinate system. As the most important part of the whole algorithm, the coordinate calculation of nodes in the network is based on discrete matrix decomposition. In order to overcome the shortcomings that the existing methods can not be applied to directed networks, two matrix decomposition techniques, singular value matrix decomposition and non-negative matrix decomposition, are used in the coordinate calculation method proposed in this paper. Each node is uniquely identified in the coordinate system by coordinate pairs composed of outgoing coordinates and incoming coordinates. When calculating the distance between point pairs, it is calculated by the inner product of the outgoing coordinates of the starting node and the incoming coordinates of the terminating nodes, thus overcoming the asymmetry constraint of the distance. In order to improve the accuracy of the existing algorithms and shorten the running time, the coordinate calculation method proposed in this paper draws lessons from the idea of robust principal component analysis to reduce dimension and Denoise, and reduces the main parts of low rank from the original distance matrix to eliminate errors and outliers, so as to achieve the effect of dimension reduction and denoising. In addition, the algorithm merges discrete matrix decomposition and coordinate calculation into a single optimization problem. The two processes are completed at the same time, and the error is reduced to a certain extent. In this paper, the algorithm is simulated on the real social software Facebook,Wiki,LiveJournal,Orkut and Gulps data sets, including functional simulation, simulation of influencing factors and extended application simulation, and compared with the existing methods. The experimental results show that compared with the existing methods, the proposed method not only improves the accuracy of the calculation results, reduces the time cost, but also improves the inapplicability of the proposed method to directed graphs.
【学位授予单位】:哈尔滨工业大学
【学位级别】:硕士
【学位授予年份】:2017
【分类号】:TP393.09
本文编号:2509778
[Abstract]:With the popularity of social software, social networks related to it have gradually become the focus of academic research. When analyzing the topology of social networks, calculating the distance (defined as the number of edges that make up the shortest path between points) is the first step. At present, there are some classical shortest path distance solving algorithms, such as breadth first traversing (BFS), Dijkstra and Floyd. These algorithms have high time complexity and are suitable for ordinary networks, but not for networks with large data scale such as social networks. The distance prediction algorithm based on graphic coordinate system is a common algorithm in the distance prediction of social networks at present. It obtains part of the real distance through preprocessing, and then forecasts the other distance information with the help of coordinate system and part of the real distance, which greatly reduces the time cost of the calculation process. However, the existing methods only consider the modeling of social networks as undirected networks, which are only suitable for undirected networks, and there will be large errors in directed networks. In this paper, aiming at the shortcomings of the existing methods, further research is carried out, and a distance prediction algorithm for social networks based on inner product spatial coordinate system is proposed. In this algorithm, the coordinate system is used in the whole calculation process, and the social network is embedded in the coordinate system. The points in the social network correspond to the points in the coordinate system one by one. Through the unique identification of the coordinates, the distance between any point pairs in the social network is equivalent to the distance between the corresponding points in the coordinate system. As the most important part of the whole algorithm, the coordinate calculation of nodes in the network is based on discrete matrix decomposition. In order to overcome the shortcomings that the existing methods can not be applied to directed networks, two matrix decomposition techniques, singular value matrix decomposition and non-negative matrix decomposition, are used in the coordinate calculation method proposed in this paper. Each node is uniquely identified in the coordinate system by coordinate pairs composed of outgoing coordinates and incoming coordinates. When calculating the distance between point pairs, it is calculated by the inner product of the outgoing coordinates of the starting node and the incoming coordinates of the terminating nodes, thus overcoming the asymmetry constraint of the distance. In order to improve the accuracy of the existing algorithms and shorten the running time, the coordinate calculation method proposed in this paper draws lessons from the idea of robust principal component analysis to reduce dimension and Denoise, and reduces the main parts of low rank from the original distance matrix to eliminate errors and outliers, so as to achieve the effect of dimension reduction and denoising. In addition, the algorithm merges discrete matrix decomposition and coordinate calculation into a single optimization problem. The two processes are completed at the same time, and the error is reduced to a certain extent. In this paper, the algorithm is simulated on the real social software Facebook,Wiki,LiveJournal,Orkut and Gulps data sets, including functional simulation, simulation of influencing factors and extended application simulation, and compared with the existing methods. The experimental results show that compared with the existing methods, the proposed method not only improves the accuracy of the calculation results, reduces the time cost, but also improves the inapplicability of the proposed method to directed graphs.
【学位授予单位】:哈尔滨工业大学
【学位级别】:硕士
【学位授予年份】:2017
【分类号】:TP393.09
【参考文献】
相关期刊论文 前2条
1 邢长友;陈鸣;;网络距离预测技术[J];软件学报;2009年09期
2 王意洁;李小勇;;网络距离预测技术研究[J];软件学报;2009年06期
,本文编号:2509778
本文链接:https://www.wllwen.com/guanlilunwen/ydhl/2509778.html