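Research on Online Update Technology for Cloud Services Based on a Rotation Mechanism
Published: 2018-04-09 12:19
Topic: online update  Entry point: cloud platform  Source: National University of Defense Technology, 2014 master's thesis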
[Abstract]: As software services move from the enterprise computing model to the cloud computing model, the traditional "test, release, post-release maintenance" approach to software operations can no longer keep up with the rapid changes in user requirements and system resources in the open Internet environment. By contrast, the continuous online "monitor, diagnose, adjust" process for improving the quality of cloud services has gradually gained acceptance in practice. As the main means of carrying out the "adjust" step, software service updates have become increasingly frequent, moving from cycles measured in quarters or years to cycles measured in days or weeks. Given this situation, how a software service can progress from being updatable, to being easy to update, to being hot-updatable has become an important research topic in academia, and online update technology has taken on real practical significance.

In practice, to make services more elastic in the face of unknown load changes, most cloud services are deployed as multi-instance clusters. This deployment architecture opens up a new approach to online updates of cloud services: because the instances are usually independent of one another, and each contains reasonably complete business logic and can handle user requests correctly, the temporary unavailability of some instances does not affect the normal delivery of the overall service. A rotation mechanism can therefore be introduced to update the cluster instances in batches without interrupting the service. Based on this assumption, this thesis studies the online update problem for cloud services from the following three aspects.

First, an online update framework for cloud services based on a rotation mechanism is proposed, and a prototype system of the framework is designed and implemented on the GlassFish platform. By isolating, updating, and re-enabling the instances in the cluster in turn according to a given strategy, the framework ensures that at least one service instance keeps serving requests throughout the update, which meets the requirement for online service update.

Second, to address the version-mixing problem, and the system performance loss incurred in solving it, an online update strategy based on delayed switching is proposed, which divides the update process into two stages. In the first stage, instances are updated one at a time, slowing the decline in overall service performance, until service capacity drops to half of the original configuration. The update then enters the switching stage, which ensures logical isolation between the instances running the old and new versions in the cluster. Experiments on the e-commerce service Rubis verify that this method achieves service availability above 99% during the update and raises the average service response speed during the update to 1.4 times the original.

Third, to address the problem of determining when a single service instance in the cluster can be updated, this thesis proposes a load-balancing-based task-state awareness mechanism: a dedicated task-state awareness list is maintained for each instance in the cluster, and the length of the list helps determine the precise time at which that instance can enter the update. Experimental results show that, compared with the current approach of confirming the update time through a preset waiting period, the methods covered by the rotation-based framework, including Rolling Upgrade, Split Mode Upgrade, and the Delayed Switch Upgrade proposed in this thesis, all achieve higher availability once this mechanism is adopted.
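The rotation idea described in the abstract can be illustrated with a minimal Python sketch. It is not the thesis's GlassFish prototype: the `Instance` class, the `rolling_update` function, and the `finish_task` callback are hypothetical names introduced only for this illustration. The sketch covers the plain one-at-a-time rotation (isolate, update, enable) and uses the length of a per-instance task list to decide when an isolated instance may be updated; it does not model the delayed-switch strategy.

```python
from dataclasses import dataclass, field


@dataclass
class Instance:
    """Hypothetical model of one service instance in the cluster."""
    name: str
    version: str
    enabled: bool = True                          # True: the load balancer routes requests here
    pending: list = field(default_factory=list)   # task-state list: ids of in-flight requests


def rolling_update(cluster, new_version, finish_task):
    """Update instances one at a time: isolate, drain, update, enable.

    finish_task is an assumed callback that completes one in-flight
    task on the given instance (it stands in for real request handling).
    """
    for inst in cluster:
        others = [i for i in cluster if i is not inst and i.enabled]
        if not others:
            # Refuse to proceed if no other instance could keep serving.
            raise RuntimeError("update would leave no instance serving requests")
        inst.enabled = False          # isolate: stop routing new requests
        while inst.pending:           # wait until the task-state list is empty
            finish_task(inst)
        inst.version = new_version    # update
        inst.enabled = True           # enable: rejoin the cluster


# Example: three instances, two of them with in-flight requests.
cluster = [
    Instance("i1", "v1", pending=["r1", "r2"]),
    Instance("i2", "v1"),
    Instance("i3", "v1", pending=["r3"]),
]
rolling_update(cluster, "v2", lambda inst: inst.pending.pop())
```

Waiting on the task list, as opposed to sleeping for a preset period, is the point of the third contribution: the instance enters the update as soon as its in-flight work is done, neither earlier nor later.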
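[Degree-granting institution]: National University of Defense Technology
[Degree level]: Master's
[Year degree granted]: 2014
[Classification number]: TP393.09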
Article ID: 1726418
Article link: https://www.wllwen.com/guanlilunwen/ydhl/1726418.html