Research and Application of Community Detection Algorithms in Heterogeneous Networks
Published: 2019-04-04 11:57
[Abstract]: A heterogeneous information network is a complex network with multiple types of nodes and links. These nodes and links carry rich semantic information, which brings new research opportunities and challenges to data mining. In recent years, researchers have produced many results on heterogeneous information networks in similarity measurement, graph clustering, link prediction, and recommendation. This thesis takes heterogeneous information networks as its research object and focuses on two problems: community detection and recommender systems. Traditional community detection methods for heterogeneous information networks fall into three types: ranking-based, path-based, and multi-view learning. The first two mostly solve their models with probabilistic graphical models, while the third applies multi-view learning methods to problems in heterogeneous networks. Recommender systems based on heterogeneous networks can be viewed as recommendation after multi-source information fusion, and they improve recommendation performance mainly through the fusion strategy and the information being fused. In contrast, this thesis starts from a new perspective, namely converting heterogeneous information network mining into homogeneous information network mining. Relying on the effective propagation of information along meta-paths, it proposes a decomposition technique that decomposes the original heterogeneous information network into a series of homogeneous information networks without information loss. Based on this decomposition strategy, the thesis proposes HomClus, a community detection algorithm for heterogeneous information networks, and CSR, a recommendation method that fuses user and item information. These three parts form the core of the thesis, whose main contributions are as follows.
First, a meta-path-based decomposition strategy for heterogeneous information networks is proposed. The strategy exploits the fact that meta-paths reflect different relationships between entities: for entities of the target type, simple matrix operations yield a relation weight matrix among those entities under each meta-path, and each such matrix is a homogeneous information network. The process loses no information with respect to the target type. Research problems on heterogeneous networks can therefore be reduced to problems on homogeneous networks of the target type, which are easier to solve.
Second, HomClus, a community detection algorithm based on this decomposition strategy, is proposed. Building on the first contribution, the method first converts the heterogeneous information network into a series of homogeneous information networks and integrates them into a unified network structure. It then uses non-negative matrix factorization to convert nodes into vectors quickly, that is, to project the whole network into a low-dimensional subspace. Finally, it clusters the "nodes" in that subspace with an efficient clustering method, such as synchronization-based clustering, to detect the latent community structure of the original network. Experiments show that HomClus has clear advantages over state-of-the-art algorithms in the field: the algorithm is simple and intuitive and is insensitive to its parameters. The experiments also confirm the effectiveness and practicality of the decomposition strategy.
Third, CSR, a recommendation algorithm based on this decomposition strategy, is proposed. For the two typical entity types in recommender systems, users and items, the method uses the decomposition strategy to convert the heterogeneous information of users and of items into homogeneous information at the same time. Inspired by the currently popular recommendation methods based on the asymptotic factor model and by similarity regularization, it fuses user information, item information, and rating information in the form of a consistent approximation with collective similarity regularization, and finally produces high-quality recommendations.
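The decomposition and embedding steps described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the thesis's implementation: the toy author/paper/venue data, the helper names `metapath_matrix`, `normalize`, and `nmf`, the choice to integrate the per-path networks by summing Frobenius-normalized matrices, and the argmax labeling (used here in place of synchronization-based clustering) are all hypothetical.

```python
import numpy as np

# Toy bibliographic network (hypothetical data): 4 authors, 4 papers, 2 venues.
# A_ap[i, j] = 1 if author i wrote paper j.
# A_pv[j, k] = 1 if paper j appeared in venue k.
A_ap = np.array([
    [1, 1, 0, 0],
    [0, 1, 0, 0],
    [0, 0, 1, 1],
    [0, 0, 0, 1],
], dtype=float)
A_pv = np.array([
    [1, 0],
    [1, 0],
    [0, 1],
    [0, 1],
], dtype=float)


def metapath_matrix(*hops):
    """Weight matrix among target-type nodes for a symmetric meta-path.

    `hops` are the adjacency matrices along the first half of the path;
    the second half is the mirror image, so the result is half @ half.T.
    """
    half = hops[0]
    for hop in hops[1:]:
        half = half @ hop
    return half @ half.T


def normalize(W):
    """Scale a matrix to unit Frobenius norm (an assumed integration step)."""
    norm = np.linalg.norm(W)
    return W / norm if norm > 0 else W


def nmf(W, rank, n_iter=200, seed=0, eps=1e-9):
    """Plain multiplicative-update NMF, W ~= U @ V.T with U, V >= 0."""
    rng = np.random.default_rng(seed)
    n, m = W.shape
    U = rng.random((n, rank)) + eps
    V = rng.random((m, rank)) + eps
    for _ in range(n_iter):
        U *= (W @ V) / (U @ (V.T @ V) + eps)
        V *= (W.T @ U) / (V @ (U.T @ U) + eps)
    return U, V


# One homogeneous author-author network per meta-path.
W_apa = metapath_matrix(A_ap)          # Author-Paper-Author
W_apvpa = metapath_matrix(A_ap, A_pv)  # Author-Paper-Venue-Paper-Author

# Integrate the networks (assumed scheme), embed the authors, and label them.
W = normalize(W_apa) + normalize(W_apvpa)
U, V = nmf(W, rank=2)
labels = U.argmax(axis=1)  # stand-in for the clustering step
```

Each row of `U` is a low-dimensional vector for one author. Any clustering method could be applied to those rows; the argmax here is only the simplest placeholder.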
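[Degree-granting institution]: University of Electronic Science and Technology of China
[Degree level]: Master's
[Year degree conferred]: 2017
[Classification number]: O157.5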
Article ID: 2453785
Article link: https://www.wllwen.com/kejilunwen/yysx/2453785.html