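Research on Relationship Reasoning and Prediction Methods Based on Heterogeneous Networks
Published: 2018-04-28 00:41
Topic: heterogeneous networks + prediction; Source: Taiyuan University of Technology, 2017 master's thesis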
[Abstract]: With the rapid development of Internet technology, social networks are quickly becoming part of everyday life. Relationships between individuals are a core component on which social network platforms depend for their survival and growth, and the rich information contained in social networks provides strong support for public opinion monitoring, research on information dissemination, advertising, and similar applications. However, fast-developing network technology brings not only massive data but also noise such as false and missing information. How to predict and restore missing information from observable data, and how to mine implicit information, has become an important problem. Information mining on social networks is an important part of social network analysis and a current research focus.

In real life, people are connected through different types of relationships and thereby form a social network. Such a network contains nodes of a single type, but the relationships between nodes are of diverse types, and a pair of nodes may even be linked by several relationships at once. A network of this kind is a heterogeneous network. Current social network analysis focuses mainly on homogeneous networks, yet in-depth analysis of heterogeneous networks more readily uncovers more precise implicit knowledge. This thesis therefore focuses on the heterogeneous social network formed by individuals and the multiple relationships among them. It divides relationships broadly into two classes, kinship and social relationships, and applies a different strategy to each class, according to its characteristics, to mine and predict relationships in the network. The main research content covers three aspects:

1. Given the huge volume and heavy noise of Internet big data, a crawler was used to collect and process a large amount of basic information about celebrities on Baidu Baike, together with their relationships to other people. To store and use these data more efficiently, the graph database Neo4j was adopted, since it can fully express the semantics of, and connections in, relationships between people.

2. Existing research on kinship reasoning mostly centers on expert systems or on Chinese language and literature, and cannot meet the application needs of large data volumes in the big data era. Based on common sense and sociological knowledge, this thesis defines the commonly used kinship relations and their representation, and formulates kinship inference rules modeled on first-order predicate logic. Because inference rules in predicate-logic form cannot be applied directly to a graph database, the rules are converted into the graph database's query language, which greatly simplifies kinship inference and completion. Three kinship reasoning modes are provided to meet the needs of different application scenarios.

3. For social relationship prediction, the research problem is first described and defined to make the scope of the work clear. Second, the thesis analyzes how this problem differs from existing link prediction in heterogeneous networks, and proposes a method that automatically discovers relationship paths without predefined path patterns and obtains the maximum reachable path after measuring the importance of relationship paths from multiple perspectives. Third, a social relationship prediction algorithm for heterogeneous networks is built on this method, enabling prediction of multiple types of social relationships between people. Finally, relationship prediction experiments are run with the proposed algorithm and with related traditional algorithms, and comparison and analysis of the results further confirm the effectiveness and accuracy of the proposed algorithm.
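The kinship inference idea in item 2 can be illustrated with a minimal sketch. It assumes composition rules of the form r1(x, y) AND r2(y, z) -> r3(x, z) and applies them to an in-memory set of triples in plain Python, as a stand-in for the graph database queries used in the thesis. The rule table, the relation names, and the function `infer_kinship` are hypothetical and are not taken from the thesis, whose actual rules are not given in the abstract.

```python
from typing import Dict, Set, Tuple

# A fact (a, relation, b) reads "a is the <relation> of b".
Triple = Tuple[str, str, str]

# Hypothetical composition rules in the spirit of first-order logic:
#   r1(x, y) AND r2(y, z) -> r3(x, z)
# These four rules are illustrative assumptions, not the thesis's rule set.
RULES: Dict[Tuple[str, str], str] = {
    ("father", "father"): "paternal_grandfather",
    ("father", "mother"): "maternal_grandfather",
    ("mother", "father"): "paternal_grandmother",
    ("mother", "mother"): "maternal_grandmother",
}


def infer_kinship(facts: Set[Triple]) -> Set[Triple]:
    """Return the kinship triples derivable from `facts` but not in it.

    Rules are applied repeatedly until no new triple appears
    (simple forward chaining to a fixed point).
    """
    known = set(facts)
    changed = True
    while changed:
        changed = False
        snapshot = list(known)
        for x, r1, y in snapshot:
            for y2, r2, z in snapshot:
                # The two facts must share the middle person, and a
                # person is never inferred to be related to themselves.
                if y != y2 or x == z:
                    continue
                r3 = RULES.get((r1, r2))
                if r3 is not None and (x, r3, z) not in known:
                    known.add((x, r3, z))
                    changed = True
    return known - set(facts)


if __name__ == "__main__":
    # Placeholder people A, B, C, D; not data from the thesis.
    facts = {("A", "father", "B"), ("B", "father", "C"), ("D", "mother", "B")}
    for triple in sorted(infer_kinship(facts)):
        print(triple)
```

With the placeholder facts above, the sketch derives that A is the paternal grandfather of C and that D is the paternal grandmother of C. In a graph database the same rule would be written as a pattern match over two consecutive edges followed by creation of the inferred edge.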
[Degree-granting institution]: Taiyuan University of Technology
[Degree level]: Master's
[Year degree conferred]: 2017
[Classification number]: TP393.09; TP311.13
Article ID: 1813038
Article link: https://www.wllwen.com/shoufeilunwen/xixikjs/1813038.html