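Representation Learning and Link Prediction for Attributed Network Graphs
Published: 2018-05-18 21:08
Topics: attributed network graph + representation learning; Source: East China Normal University, 2017 master's thesis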
[Abstract]: With the explosive growth of data, the analysis and mining of large-scale network graphs such as social networks and biological information networks has attracted increasing attention. Problems such as graph partitioning, clustering, link prediction, and community search have each become relatively independent research directions. Network Representation Learning aims to represent the nodes of a graph as continuous low-dimensional vectors, so that traditional clustering, classification, and similar methods can then be applied in the vector space. It is therefore regarded as a foundation for many graph mining tasks. However, most existing work trains the vectors from the graph's topology alone and ignores the rich content information carried by the nodes themselves. This thesis improves on previous work and proposes a representation learning model for attributed network graphs based on random walks and a word vector model. Besides node vectors, the model also yields low-dimensional vector representations of the attributes. Building on the resulting vectors, the thesis further proposes a fast link prediction algorithm based on vector similarity. The work covers the following four areas:
· Representation learning model for attributed network graphs. For attributed network graphs, the thesis proposes a representation learning model based on random walks and a word vector model. The learned node and attribute vectors preserve the structural integrity and the attribute integrity of the original network graph.
· Link prediction algorithm for attributed network graphs. Building on the representation learning model, a link prediction algorithm is proposed. To improve efficiency, a BMH (Balanced Min-Hash) method is designed to replace traditional Min-Hash. It combines attributes with topology to generate the min-hash signature matrix, and locality-sensitive hashing (LSH) is then introduced to reduce the number of candidate node pairs.
· Validation of the representation learning model. In multi-label classification experiments, the proposed model is compared with several content-based or topology-based models. The experiments show that the proposed model achieves high classification accuracy, converges quickly, and is robust across different parameter settings.
· Validation of link prediction accuracy. Experiments verify the accuracy of the proposed representation-learning-based link prediction method, as well as the speed-up obtained by combining LSH with BMH.
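The abstract mentions min-hash signatures and LSH for pruning candidate node pairs but does not spell out the BMH procedure. The sketch below is therefore not the thesis's BMH method. It only illustrates standard MinHash with LSH banding, under the assumption that each node is described by a single set of integer feature ids (for example, neighbour ids and attribute ids merged together). All function names, parameters, and the toy data are hypothetical.

```python
import random
from collections import defaultdict
from itertools import combinations

PRIME = 2_147_483_647  # Mersenne prime 2^31 - 1, used as the hash modulus


def make_hash_params(num_hashes, seed=0):
    """Draw (a, b) pairs for hash functions h(x) = (a * x + b) mod PRIME."""
    rng = random.Random(seed)
    return [(rng.randrange(1, PRIME), rng.randrange(0, PRIME))
            for _ in range(num_hashes)]


def minhash_signature(items, params):
    """Return the MinHash signature of a non-empty set of non-negative ints."""
    return [min((a * x + b) % PRIME for x in items) for a, b in params]


def lsh_candidate_pairs(signatures, bands, rows):
    """Split each signature into `bands` bands of `rows` values each.

    Two nodes become a candidate pair if they fall into the same bucket
    in at least one band.
    """
    candidates = set()
    for band in range(bands):
        buckets = defaultdict(list)
        for node, sig in signatures.items():
            key = tuple(sig[band * rows:(band + 1) * rows])
            buckets[key].append(node)
        for members in buckets.values():
            for u, v in combinations(sorted(members), 2):
                candidates.add((u, v))
    return candidates


# Toy data (hypothetical): one feature set per node. How topology and
# attributes are actually combined in BMH is not stated in the abstract.
features = {
    "a": {1, 2, 3, 4, 5},
    "b": {1, 2, 3, 4, 5},
    "c": {100, 200, 300},
}
params = make_hash_params(num_hashes=20, seed=42)
signatures = {node: minhash_signature(f, params) for node, f in features.items()}
pairs = lsh_candidate_pairs(signatures, bands=5, rows=4)
```

Only the pairs returned by `lsh_candidate_pairs` would then need a vector-similarity check, which is the kind of saving the abstract attributes to LSH. Nodes with identical feature sets, such as `a` and `b` here, always share every bucket, while dissimilar nodes collide only with low probability.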
[Degree-granting institution]: East China Normal University
[Degree level]: Master's
[Year degree conferred]: 2017
[Classification number]: O157.5
Article ID: 1907215
Article link: https://www.wllwen.com/kejilunwen/yysx/1907215.html