Research on Key Technologies of Cloud Data Processing Based on Uncertainty Theory

Published: 2019-06-18 18:05
[Abstract]: In January 2016, RightScale surveyed more than 1,000 enterprise users worldwide on their use of public, private, and hybrid clouds; the report showed that 95% of respondents were using the cloud. Uncertainty is pervasive in real-world phenomena. In a cloud computing environment, problems such as the migration and scheduling of cloud data and virtual machines in cloud data centers all involve uncertainty. Much work exists on uncertain data processing, but most of it concentrates on uncertainty in entity data and does not adequately cover some practical problems. For uncertainty in the relationships between entities, existing literature applies probability theory and fuzzy theory to nearest-neighbor query processing. Relationships between entities, however, sometimes exhibit subjective uncertainty, which is neither random nor fuzzy. In practice, many problems have no historical data, so the frequency of an event cannot be estimated with probability theory. The belief degree that the event will occur must then be assessed from expert experience, and belief degrees obtained this way have a far larger variance than frequencies. To handle the subjective uncertainty of cloud data, this thesis applies uncertainty theory to cloud data processing techniques.

The thesis focuses on key technologies for cloud data query processing and query optimization. Because of heterogeneity, privacy, privacy protection, incomplete data, and imprecise data, the data in cloud data centers is uncertain. Drawing on research in uncertainty theory, the cloud data center is abstracted as an uncertain graph, and query processing and query optimization for cloud data are examined in depth on the basis of path query algorithms for uncertain graphs. The main work and contributions can be summarized as follows:

(1) A cloud data security protection framework is proposed. The framework mainly comprises layered modules for physical security, virtual network security, cloud operating system security, virtual cluster security, data security, SaaS/PaaS/IaaS security, and security management and security operations and maintenance. It shares its security objectives, system resource types, and basic security technologies with traditional security, but it also faces security problems of its own, chiefly virtualization security and problems related to the multi-tenant service model of cloud computing. The framework offers better security and protection in virtualization security, data security, and privacy protection.

(2) An uncertain random fault tree risk analysis method based on the cloud data security protection framework is proposed. The method constructs and analyzes fault trees using uncertainty theory and chance theory. A fault tree is built from the logical relationships among basic events. If the failure rate of a basic event is obtained from historical data, the event is represented as a random variable; if no historical data exists but the rate can be obtained from expert judgment, the event is represented as an uncertain variable. The occurrence of the top event is therefore governed by uncertain random variables, so a hybrid simulation algorithm is constructed to compute the chance that the top event occurs. The proposed cloud data security protection framework is then analyzed for risk with this uncertain random fault tree method.

(3) A conditional trusted nearest-neighbor query method for uncertain networks is proposed. The method includes a trusted distance computation (CMCD) algorithm, a reachable path length computation (CMFP) algorithm, a reachable path expected length computation (CMDLFP) algorithm, and a conditional trusted k-nearest-neighbor query (QMCCK) algorithm. The uncertain network is modeled as an uncertain weighted graph; the sample graph, sample graph index, base network, reachable path length, and expected reachable path length of an uncertain graph are defined; and an efficient conditional trusted nearest-neighbor query algorithm based on uncertainty theory is given. Nearest-neighbor queries on the uncertain network are equivalently transformed into nearest-neighbor queries on the base network. The algorithm addresses nearest-neighbor queries in uncertain network environments from the standpoint of uncertainty.

(4) A Top-k query algorithm for uncertain data based on uncertainty theory is proposed. The tuples of an uncertain data set are modeled as an uncertain network, Top-k queries over ordered tuples are equivalently transformed into relations among the uncertain measures of edges in the corresponding sample graphs, and the sample graphs are classified by the rank positions of the edges they contain. The algorithm avoids computing the rank uncertain measure of every tuple in the sample graphs, which improves the efficiency of Top-k queries on uncertain data. Top-k queries on uncertain data that use parameterized ranking functions are equivalently converted into a finite set of queries that differ by Top-k value, and a system implementation is completed with the Spark Map-Reduce programming framework.
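As an illustration of contribution (2), the following is a minimal sketch of how the chance of a top event might be estimated when some basic events are random and others are uncertain. It is not the thesis's hybrid simulation algorithm, whose details the abstract does not give. It assumes independent basic events, samples the random events by Monte Carlo, and evaluates the uncertain events with the standard uncertainty-theory formula for a Boolean function of independent Boolean uncertain variables. The names `uncertain_measure`, `chance_of_top_event`, and `top` are hypothetical.

```python
import itertools
import random


def uncertain_measure(f, beliefs):
    """Uncertain measure that f(x) is true, for independent Boolean
    uncertain variables with belief degrees `beliefs` (assumed model)."""

    def sup_min(target):
        # Supremum over all states x with f(x) == target of min_i nu_i(x_i),
        # where nu_i(1) = belief and nu_i(0) = 1 - belief.
        best = 0.0
        for x in itertools.product((0, 1), repeat=len(beliefs)):
            if bool(f(x)) == target:
                level = min(
                    (a if xi else 1.0 - a for a, xi in zip(beliefs, x)),
                    default=1.0,
                )
                best = max(best, level)
        return best

    s_true = sup_min(True)
    return s_true if s_true < 0.5 else 1.0 - sup_min(False)


def chance_of_top_event(top, probs, beliefs, n_samples=20000, seed=0):
    """Estimate the chance of the top event.

    top(y, x): Boolean structure function of the fault tree, where y holds
               the states of the random basic events and x the states of
               the uncertain basic events (hypothetical signature).
    probs:     failure probabilities of the random basic events.
    beliefs:   belief degrees of the uncertain basic events.
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_samples):
        # Sample the random basic events, then treat the tree as a Boolean
        # function of the uncertain basic events only.
        y = tuple(1 if rng.random() < p else 0 for p in probs)
        total += uncertain_measure(lambda x: top(y, x), beliefs)
    return total / n_samples


def top(y, x):
    # Illustrative tree: top = R1 OR (U1 AND U2).
    return y[0] or (x[0] and x[1])


if __name__ == "__main__":
    print(chance_of_top_event(top, probs=[0.1], beliefs=[0.3, 0.6]))
```

For this small tree the exact value can be checked by hand: when the random event occurs the top event is certain, and otherwise the measure of `U1 AND U2` is min(0.3, 0.6) = 0.3, so the chance is 0.1 × 1 + 0.9 × 0.3 = 0.37, and the estimate should land close to that.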
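Contribution (3) reduces a nearest-neighbor query on an uncertain network to a query on a deterministic network. The sketch below shows one way such a reduction can look. It is not the thesis's CMCD or QMCCK algorithm, and the thesis's base network may be defined differently. It assumes each edge weight is an independent linear uncertain variable L(a, b), fixes a confidence level alpha, replaces each weight with its inverse uncertainty distribution value (1 - alpha)·a + alpha·b, and runs Dijkstra's algorithm on the resulting network. The names `alpha_weight`, `alpha_knn`, and `EDGES` are hypothetical.

```python
import heapq


def alpha_weight(a, b, alpha):
    """Inverse uncertainty distribution of a linear uncertain variable L(a, b)."""
    return (1.0 - alpha) * a + alpha * b


def alpha_knn(edges, source, k, alpha):
    """Return the k nodes nearest to `source` at confidence level `alpha`.

    edges: dict mapping an undirected edge (u, v) to the parameters (a, b)
           of its linear uncertain weight (assumed input format).
    """
    # Build the deterministic network for the chosen confidence level.
    adj = {}
    for (u, v), (a, b) in edges.items():
        w = alpha_weight(a, b, alpha)
        adj.setdefault(u, []).append((v, w))
        adj.setdefault(v, []).append((u, w))

    # Dijkstra's algorithm, stopping once the source and k neighbors settle.
    dist = {source: 0.0}
    heap = [(0.0, source)]
    visited = set()
    settled = []
    while heap and len(settled) < k + 1:
        d, u = heapq.heappop(heap)
        if u in visited:
            continue
        visited.add(u)
        settled.append((u, d))
        for v, w in adj.get(u, []):
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return settled[1:]  # drop the source itself


# Illustrative network: the edge q-b has a wide uncertain weight.
EDGES = {("q", "a"): (2, 3), ("q", "b"): (0, 10), ("a", "c"): (1, 2)}

if __name__ == "__main__":
    for alpha in (0.1, 0.9):
        print(alpha, [node for node, _ in alpha_knn(EDGES, "q", 2, alpha)])
```

In this example the answer depends on the confidence level: at alpha = 0.1 node b is nearest to q, while at alpha = 0.9 the wide weight on q-b pushes b behind a and c.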
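[Degree-granting institution]: University of Science and Technology Beijing
[Degree level]: Doctoral
[Year degree granted]: 2016
[Classification number]: TP309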

[Thesis citation]: 郭长友; Research on Key Technologies of Cloud Data Processing Based on Uncertainty Theory [D]; University of Science and Technology Beijing; 2016



Article ID: 2501680


Article link: https://www.wllwen.com/shoufeilunwen/xxkjbs/2501680.html

