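An SLO-Oriented Dynamic Resource Balancing Mechanism in Multi-Tenant Environments

Published: 2018-06-06 06:33

Topic: multi-tenancy + performance model; Source: Shandong University, 2017 master's thesis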

[Abstract]: In the cloud computing era, demand for multi-tenant SaaS (Software as a Service) applications keeps growing. Multi-tenant data management is key to the rapid development and efficient operation of SaaS applications. By sharing resources, it reduces the cost of managing and maintaining tenant data; from the tenant's side, however, each tenant signs a Service Level Agreement (SLA) with the provider. This thesis focuses on one SLA-related performance metric, query response time, so multi-tenant data queries must still meet each tenant's Service Level Objectives (SLOs), that is, the benchmarks or targets agreed by the parties for the service the provider delivers to the tenant over a given period.

As multi-tenant SaaS applications spread, more and more providers serve tenants this way. Although consolidating tenants on the same nodes saves resources and cost, resource sharing among tenants also raises a series of problems and challenges. First, to improve availability and fault tolerance, a tenant usually has several replicas. Different nodes host different sets of tenant replicas, and tenants differ in resource demand because their businesses differ: some need a lot of CPU and little I/O, others the reverse. If queries from tenants that consume much CPU and little I/O are all scheduled onto the same node, that node's CPU utilization becomes too high to satisfy the tenants' demands while its I/O utilization stays too low. Unreasonable query scheduling therefore wastes resources and degrades system performance. Second, multi-tenant data access workloads are mixed, highly fluctuating, and variable. Because tenants are co-located and share resources, their performance affects one another. A tenant's bursty workload can drive a node's resource utilization too high and overload it, so that other tenants on that node cannot get the resources they need, query response times grow too long, tenant SLOs are violated, and a performance crisis results. How to balance resource usage across nodes, raise node resource utilization, and eliminate node performance crises is therefore of growing concern to providers.

Starting from users' actual needs and addressing shortcomings of existing work, this thesis studies the overload problem in multi-tenant environments and proposes a dynamic resource balancing mechanism that balances resource usage across data nodes and eliminates node performance crises. The specific work and contributions are summarized as follows:

1. Three load models are proposed, and their construction is described in detail. The query load model, tenant load model, and node load model represent the load of a query, a tenant, and a node, respectively. The query load model is built first from the characteristics of the query, taking the server resources the query consumes, expressed as a percentage of the server's total resources, as the baseline of the load model. The tenant and node load models are then built from the query load model by linear accumulation. Experiments verify the accuracy of the load models.

2. To address the unbalanced resource usage across data nodes caused by unreasonable query scheduling in multi-tenant environments, the thesis proposes a dynamic query scheduling strategy based on the load models and node performance. Because a node's performance is closely tied to its level of resource consumption, the thesis trains node performance labels from the node's resource utilization to mark that level, so that node performance can be monitored in real time. The dynamic query scheduling strategy then detects and collects the performance labels of all data nodes in real time and, based on the query load model and those labels, dynamically assigns each query to a suitable node among the data nodes that hold the tenant's replicas, balancing resource usage across data nodes and improving node resource utilization.

3. To address the performance crises caused by tenants' bursty workloads in multi-tenant environments, the thesis proposes a lightweight load balancing mechanism that eliminates node performance crises. In a multi-tenant database environment, a node in performance crisis causes excessive query response times and a high tenant SLO violation rate. The mechanism eliminates the crisis by swapping the roles of a tenant's primary replica and a secondary replica. It works because each tenant has one primary replica and several secondary replicas; both kinds can serve read-only queries, but write queries can run only on the primary. The primary therefore carries more workload than a secondary. If a node is overloaded, swapping the role of a tenant's primary replica on that node with that tenant's secondary replica on another node moves queries off the overloaded node, lowering its load and eliminating the crisis.

Experiments verify the effectiveness of the dynamic resource balancing mechanism: it balances resource usage across data nodes and eliminates node performance crises quickly and effectively.
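
The three contributions described in the abstract can be illustrated with a minimal Python sketch. It is not the thesis's implementation. All names (`Node`, `Tenant`, `add_loads`, `pick_node`, `swap_primary`) are hypothetical, loads are restricted to the two resources the abstract mentions (CPU and I/O), and a simple greedy rule (choose the replica node with the lowest projected peak utilization) stands in for the trained node performance labels, whose details the abstract does not give.

```python
from dataclasses import dataclass, field

# Assumption: only the two resources named in the abstract are modeled.
RESOURCES = ("cpu", "io")


def add_loads(loads):
    """Linear accumulation: sum per-resource fractions of server capacity."""
    total = {r: 0.0 for r in RESOURCES}
    for load in loads:
        for r in RESOURCES:
            total[r] += load.get(r, 0.0)
    return total


@dataclass
class Node:
    name: str
    # Loads of the queries currently running here, each a dict such as
    # {"cpu": 0.3, "io": 0.05} (fractions of the server's total resources).
    running: list = field(default_factory=list)

    def load(self):
        """Node load model: accumulated load of the node's queries."""
        return add_loads(self.running)


def pick_node(query_load, replica_nodes):
    """Greedy stand-in for label-based scheduling (an assumption):
    pick the replica node whose busiest resource stays lowest
    after the query is added."""
    def peak_after(node):
        projected = add_loads([node.load(), query_load])
        return max(projected.values())
    return min(replica_nodes, key=peak_after)


@dataclass
class Tenant:
    name: str
    primary: str       # name of the node hosting the primary replica
    secondaries: list  # names of the nodes hosting secondary replicas


def swap_primary(tenant, new_primary):
    """Role swap: a secondary becomes primary, the old primary a secondary.
    No data is moved; only the roles change."""
    if new_primary not in tenant.secondaries:
        raise ValueError(f"{new_primary} holds no replica of {tenant.name}")
    idx = tenant.secondaries.index(new_primary)
    tenant.secondaries[idx] = tenant.primary
    tenant.primary = new_primary


# Usage: node A is CPU-bound, node B is I/O-bound.
node_a = Node("A", [{"cpu": 0.6, "io": 0.1}])
node_b = Node("B", [{"cpu": 0.2, "io": 0.5}])
cpu_heavy_query = {"cpu": 0.3, "io": 0.05}
print(pick_node(cpu_heavy_query, [node_a, node_b]).name)  # prints "B"

tenant = Tenant("t1", primary="A", secondaries=["B", "C"])
swap_primary(tenant, "B")  # write queries for t1 now go to node B
print(tenant.primary, tenant.secondaries)
```

In this sketch the CPU-heavy query goes to node B because adding it to node A would push A's CPU share to about 0.9, while B's busiest resource would stay near 0.55. The role swap is lightweight in the sense the abstract describes: it redirects write traffic by changing replica roles rather than migrating tenant data.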
[Degree-granting institution]: Shandong University
[Degree level]: Master's
[Year degree granted]: 2017
[Classification number]: TP393.09

[Similar literature]