
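Research on Pattern Mining and Evolution Analysis of Complex Network Data

Published: 2018-06-26 01:02

Topic: network data + link prediction; Source: University of Electronic Science and Technology of China, 2017 doctoral dissertation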


[Abstract]: In the era of big data, data forms a data world by "quantifying everything". Because data is an objective reflection of the world, analyzing and mining data can guide people in understanding and transforming the world. With the development and spread of information technology, society and enterprises have generated massive data resources that need to be analyzed and exploited. At the same time, networking is a universal feature and inherent law of the real world: natural elements, species, populations and other objects influence and depend on one another, forming networked systems. Because data generation is objective and universal, the data resources in the data world are essentially networked data that describe the characteristics and regularities of the networked real world. In addition, because data generation is weakly constrained and has broad coverage, the collected data, while describing the real world objectively and accurately, is multi-source, multi-modal, complex and heterogeneous. The main object of current data processing is therefore massive, complex, heterogeneous network data, which poses a major challenge to traditional data processing techniques. To analyze and mine this new kind of data, this dissertation explores new data processing theories and models that are based on data characteristics and oriented toward real-world needs. Complex heterogeneous network data mainly comprises network structure data, network behavior data and network content data. The dissertation studies pattern mining and evolution analysis of complex network data from different perspectives, for different needs and with different methods. It distills research paradigms and computational frameworks for complex network data processing, explores the scientific problems contained in complex network data, the characteristics and regularities of the data related to those problems, and their solutions, and builds a technical system for complex network data processing. The specific research content and contributions are as follows.

1. A holistic detection and analysis algorithm for network structure patterns based on label propagation. For complex heterogeneous network topologies, the work mines multi-scale, multi-level network structure patterns, taking community structure as the main subject while also considering the different roles of network nodes, and proposes LINSIA, a network structure pattern discovery algorithm based on the label propagation process. LINSIA identifies hub nodes and overlapping communities by allowing a node to hold several network labels at once. It discovers multi-level, multi-scale structure patterns by building a multi-level network structure tree and performing an optimal hierarchical partition. Through new label selection and label update strategies, it provides a label propagation process that adapts to the degree of network heterogeneity, which allows it to find outlier nodes and avoid excessively large communities. Experimental results show that LINSIA performs well, and its comprehensive solution to network structure pattern mining has important theoretical significance for the analysis of network structure data.

2. Node centrality measures for optimal network splitting. Addressing the optimal network splitting problem, the dissertation explores the structural and functional characteristics of networks from a microscopic perspective and proposes the ECI node centrality, which is based on the information entropy of neighbor node degrees and the local structural clustering density. Experimental results show that ECI clearly outperforms the traditional CI centrality in the network splitting process. Moreover, ECI, which uses only local structural information, achieves splitting results comparable to those of global methods. By analyzing the relationship between the performance of ECI and network structural features, the dissertation discusses the scope of applicability of ECI and provides guidance for choosing a node centrality in the optimal network splitting problem. In addition, drawing on the physical processes of mass diffusion and heat conduction, the dissertation defines a nonlinear hybrid update mechanism within an iterative update framework and thereby proposes the PIRank node centrality. PIRank integrates the different preferences that mass diffusion and heat conduction have for important nodes, so it can discover important nodes with different characteristics. Experimental results show that PIRank performs well on the optimal network splitting problem.

3. A dynamic network evolution prediction algorithm based on a node position drift model. For dynamically evolving networks, the dissertation proposes a network evolution prediction algorithm that combines a node position drift model with link prediction. The work first proposes WSD, a similarity measure guided by the average shortest distance of the network. Then, based on the clustering and temporal properties of dynamically evolving networks, it defines the spatio-temporal influence of neighbor nodes on a central node. From the perspective of a gravitational field, it compares the strength of the neighbors' spatio-temporal influence with the inherent structural strength of the local network, which yields a spatio-temporal drift model that updates the position of the central node in the network. The algorithm uses this drift model to infer the future structural state of the dynamic network and predicts future links from that state. Experimental results show that WSD outperforms other classical similarity measures and that, combined with the position drift model, it can accurately predict network evolution.

4. An information diffusion prediction method for online social networks based on modeling individual forwarding behavior. For the information diffusion process, the dissertation proposes MScaleDP, a multi-scale information diffusion prediction method based on estimating the forwarding behavior of individual users. MScaleDP applies to diffusion processes of different scales and does not depend on any global information. It decomposes the diffusion process into a set of individual forwarding behaviors and the network topology that carries them. For individual forwarding behavior, MScaleDP constructs forwarding features along multiple dimensions and models the behavior with a binary classifier. Considering the intrinsic consistency between information cascade propagation and the label propagation algorithm LPA, MScaleDP replaces the label update mechanism of LPA with the individual forwarding model and, by constraining the LPA propagation process, proposes the AULPA cascade propagation prediction algorithm. Experimental results show that, by combining the individual forwarding behavior estimation model with AULPA, MScaleDP predicts information diffusion comprehensively and accurately and outperforms traditional methods. The dissertation also analyzes the main driving mechanisms of information diffusion and finds that temporal features and content features are the main influencing factors.

In summary, the dissertation studies pattern mining and evolution analysis of complex network data, proposes solutions to problems in four areas, and validates them with extensive experiments. The experimental results show that the characteristics and regularities it identifies and the models and algorithms it proposes are accurate, effective and perform well. The results have both important theoretical significance and broad practical value.
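The abstract names the label propagation algorithm (LPA) as the starting point for both LINSIA and AULPA. As a rough illustration of that baseline only, the sketch below implements plain asynchronous label propagation in Python. It is not LINSIA or AULPA, whose label selection and update rules are not given in the abstract. The function name `label_propagation`, the adjacency-dictionary input, the fixed node order and the "largest label wins" tie-break are all assumptions chosen to keep the example deterministic; standard LPA usually visits nodes in random order and breaks ties at random.

```python
from collections import Counter


def label_propagation(adj, max_iters=100):
    """Plain asynchronous label propagation on an undirected graph.

    adj maps each node to a list of its neighbours.
    Returns a dict mapping each node to its final community label.
    """
    # Every node starts in its own community, labelled by its own id.
    labels = {node: node for node in adj}
    for _ in range(max_iters):
        changed = False
        # Fixed visiting order (assumption, for reproducibility).
        for node in sorted(adj):
            if not adj[node]:
                continue  # isolated nodes keep their own label
            counts = Counter(labels[nbr] for nbr in adj[node])
            best = max(counts.values())
            # Tie-break (assumption): take the largest most-frequent label.
            new_label = max(lab for lab, c in counts.items() if c == best)
            if new_label != labels[node]:
                labels[node] = new_label
                changed = True
        if not changed:
            break  # no label changed in a full pass, so we have converged
    return labels


# Hypothetical toy graph: two 4-node cliques joined by the single edge 3-4.
example_adj = {
    0: [1, 2, 3], 1: [0, 2, 3], 2: [0, 1, 3], 3: [0, 1, 2, 4],
    4: [3, 5, 6, 7], 5: [4, 6, 7], 6: [4, 5, 7], 7: [4, 5, 6],
}
print(label_propagation(example_adj))
```

On this toy graph each clique settles on a single label of its own. The outcome depends on the tie-breaking rule: breaking ties toward the smallest label instead would let one label cross the bridge and absorb the whole graph, which is the kind of oversized community the abstract says LINSIA is designed to avoid.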
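[Degree-granting institution]: University of Electronic Science and Technology of China
[Degree level]: Doctorate
[Year conferred]: 2017
[Classification number]: O157.5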





Link to this article: https://www.wllwen.com/kejilunwen/yysx/2068334.html

