稀疏数据的贝叶斯网络结构学习
本文选题:稀疏数据 切入点:贝叶斯网络 出处:《山东师范大学》2017年硕士论文
【摘要】:图模型被广泛用于表示和分析随机变量之间的因果关系以及条件独立性.图模型中主要包括有向无环图,无向图和链图.其中有向无环图,也被称作贝叶斯网络,图中的边都是有向边,并且不能构成有向环.贝叶斯网络用来描述随机变量的因果关系.本文主要提出对贝叶斯网络结构进行学习的算法.贝叶斯网络结构学习算法主要有三类:1基于独立性检验的约束算法;2基于评分搜索的算法;3将独立性检验和评分搜索综合利用起来的算法.2008年,Xie和Geng针对贝叶斯网络结构学习提出贝叶斯网络结构学习的递归分解算法.这个算法是将大规模的贝叶斯网络结构学习问题递归地分割为较小规模的结构学习的问题.这个算法主要应用无向独立图的构建,这就导致了它有两个困难:一是在数据稀疏,变量较多的情况下无向独立图的构建不够准确;二是当变量较多时,无向图的构建也是比较复杂的.2013年Cai等人提出了可测因果分割算法(Scalable cAusation Discovery Al-gorithm, SADA) .这个算法将变量集 V 分割成三个集合 (V1,C,V2),其 中只要保证在给定C的情况下,V1与V2之间没有边直接相连即可,并不需要它们条件独立,所以能够解决数据稀疏的困难.但是,Cai的算法在合并的过程中有可能出现假边,针对这个问题,本论文提出一种再学习的检查学习算法,这种算法的提出,结合了Cai和Xie算法中的优势,解决了稀疏数据贝叶斯网络结构学习的问题.本文提出的算法,首先将一个贝叶斯网络的变量集合不断地调用可测因果分割算法进行分割;然后在每一组因果分割上,先进行局部结构学习再进行合并得到可能有假边存在整体结构;最后寻找因果分割集及其邻居集合,在这之上调用再学习检查算法,进行修正学习以得到正确贝叶斯网络的骨架图,最后利用Meek准则确定出等价类来.
[Abstract]:Graph models are widely used to express and analyze causality and conditional independence between random variables.The graph model mainly includes directed acyclic graph, undirected graph and chain graph.The directed acyclic graph is also called Bayesian network. The edges of the graph are directed edges and cannot form directed rings.Bayesian networks are used to describe the causality of random variables.In this paper, a learning algorithm for Bayesian network structure is proposed.There are three kinds of Bayesian network structure learning algorithms: one is a constraint algorithm based on independence test; the other is an algorithm based on score search. In 2008, Xie and Geng aimed at BayesianA recursive decomposition algorithm for learning Bayesian network structure is proposed.This algorithm recursively divides the large-scale Bayesian network structure learning problem into smaller scale structural learning problems.This algorithm mainly applies the construction of undirected independent graph, which leads to two difficulties: one is that the construction of undirected independent graph is not accurate enough when the data is sparse, and the other is that when there are more variables, the construction of undirected independent graph is not accurate enough.The construction of undirected graph is also complicated. In 2013, Cai et al proposed scalable cAusation Discovery algorithm (SADAA).In this algorithm, the variable set V is divided into three sets: v _ 1 / C _ 1 / V _ 2, which ensures that there is no direct connection between V _ 1 and V _ 2 under a given C condition, and they do not need to be conditional independent, so the difficulty of data sparsity can be solved.However, false edges may appear in the merging process of Cai algorithm. In view of this problem, this paper proposes a relearning check learning algorithm, which combines the advantages of Cai and Xie algorithms.The problem of sparse data Bayesian network structure learning is solved.The algorithm proposed in this paper firstly segments a set of Bayesian network variables by calling the testable causality segmentation algorithm continuously, and then on each set of causal segmentation,The local structure learning is carried out first, then the global structure with false edges is obtained. Finally, the causal partition set and its neighbor set are found, and then the learning and re-checking algorithm is called.The correct skeleton diagram of Bayesian network is obtained by modified learning, and the equivalent class is determined by using Meek criterion.
【学位授予单位】:山东师范大学
【学位级别】:硕士
【学位授予年份】:2017
【分类号】:F224
【相似文献】
相关期刊论文 前1条
1 祝琴;戴爱明;;高维稀疏数据对象—属性的非关联子空间分析[J];中国管理信息化;2011年09期
相关会议论文 前2条
1 李博多;李建中;高宏;彭丽萍;;一种支持大规模稀疏数据表上相似性查询的索引设计[A];第二十五届中国数据库学术会议论文集(一)[C];2008年
2 陈郁馨;程序;赵鹏;孟必平;李红燕;王腾蛟;;云环境中一种面向海量稀疏数据存储的缺失值处理方法[A];第29届中国数据库学术会议论文集(B辑)(NDBC2012)[C];2012年
相关博士学位论文 前2条
1 曾玉华;稀疏数据恢复的结构优化模型及其算法研究[D];湖南大学;2016年
2 袁宇;稀疏数据条件下河流入海污染物通量的估算[D];东北大学;2009年
相关硕士学位论文 前10条
1 杨玉;稀疏数据的贝叶斯网络结构学习[D];山东师范大学;2017年
2 陈婷婷;面向稀疏数据的微分隐私数据发布[D];福州大学;2014年
3 刘帅;高维稀疏数据的相关性度量方法研究[D];首都经济贸易大学;2014年
4 尹松;高属性维稀疏数据动态抽象聚类方法研究[D];广西大学;2005年
5 张珍;贝叶斯Meta-分析[D];山东师范大学;2017年
6 任健;基于压缩感知的非均匀空间立体阵SAR三维层析成像[D];哈尔滨工业大学;2014年
7 李晓萍;有约束条件优化问题的MM算法[D];兰州大学;2017年
8 田正东;基于子空间分析的DOA估计算法研究[D];南京邮电大学;2017年
9 赵程檐;花授粉算法的研究及应用[D];广西民族大学;2017年
10 叶晓平;高阶多模型状态估计算法及应用[D];哈尔滨工业大学;2017年
,本文编号:1728547
本文链接:https://www.wllwen.com/jingjifazhanlunwen/1728547.html