Date Published: April 26, 2019
Publisher: Public Library of Science
Author(s): Danqing Feng, Zhibo Wu, DeCheng Zuo, Zhan Zhang, Yang Li.
Elasticity is a key technique for provisioning resources dynamically so as to meet the users’ demand flexibly; that is, elasticity aims to meet the demand at any time. However, existing approaches usually provision virtual machines (VMs) in a coarse-grained manner, based on the CPU utilization alone. In practice, the performance metric needs two or more elements, including the CPU and the memory, and it is challenging to determine a suitable threshold for scaling the resources up or down efficiently. In this paper we present an elastic scaling framework implemented on the cloud layer model. First, we propose the elastic resource provisioning (ERP) approach, which relies on a performance threshold. The proposed threshold is based on the Grey relational analysis (GRA) policy and covers both the CPU and the memory. Second, according to the fixed threshold, we scale up the resources at different granularities, namely at the physical machine level (PM-level) or the virtual machine level (VM-level). Conversely, we scale down the resources and shut down the spare machines. Finally, we evaluate the effectiveness of the proposed approach on real workloads. Extensive experiments show that the ERP algorithm performs the elastic strategy efficiently, reducing the overhead and the response time.
Cloud computing is popular in industry because it can deliver on-demand resources according to a pay-as-you-go model. Cloud computing usually includes three basic service models: Infrastructure as a Service (IaaS), Platform as a Service (PaaS) and Software as a Service (SaaS). SaaS provides access to complete applications as a service. PaaS provides a platform on which other applications can be developed, such as the Google App Engine (GAE) and Azure. IaaS provides an environment in which to deploy managed virtual machines. Technically, when the users submit requests, the providers provide the resources according to the users’ demand [5–6]. As a key technique in cloud computing, elasticity is the ability to acquire and release resources according to the users’ demand.
Usually the elastic solution is implemented by scaling the resources in or out. Based on an analysis of related work, we divide the elastic resource provisioning approaches into two major categories: automatic scaling methods and elastic mechanisms based on predictive techniques.
In this section, we describe the proposed approach in detail. Our approach is designed on the cloud layer model; the policy determines the performance threshold that is used to scale the resources up or down flexibly. The formulation of the performance threshold is presented in detail in the next section, and the ERP framework is explained after that.
In this section, we present a performance threshold based on multiple elements. With this threshold, we can rapidly scale the resources up or down in cloud computing.
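As an illustration of how Grey relational analysis can fold the CPU and the memory into a single score, the following is a minimal sketch of the standard grey relational grade computation. It is not the formulation used in the paper: the function name, the reference vector, the distinguishing coefficient of 0.5 and the sample values are all assumptions made for the example.

```python
def grey_relational_grades(samples, reference, rho=0.5):
    """Return one grey relational grade per sample (textbook GRA).

    samples   -- list of metric vectors, e.g. [cpu_util, mem_util], scaled to [0, 1]
    reference -- the vector that every sample is compared against (assumed)
    rho       -- distinguishing coefficient; 0.5 is a conventional choice
    """
    # Absolute deviation of each element from the reference element.
    deltas = [[abs(r - x) for r, x in zip(reference, s)] for s in samples]
    flat = [d for row in deltas for d in row]
    d_min, d_max = min(flat), max(flat)
    if d_max == 0:
        # Every sample equals the reference exactly.
        return [1.0] * len(samples)
    grades = []
    for row in deltas:
        # Grey relational coefficient for each element, then its mean as the grade.
        coeffs = [(d_min + rho * d_max) / (d + rho * d_max) for d in row]
        grades.append(sum(coeffs) / len(coeffs))
    return grades


# Hypothetical monitoring samples: [CPU utilization, memory utilization].
samples = [[0.95, 0.90], [0.60, 0.55], [0.20, 0.30]]
saturated = [1.0, 1.0]  # assumed reference: a fully loaded machine
grades = grey_relational_grades(samples, saturated)
```

With a fully loaded machine as the reference, a higher grade means that a sample is closer to saturation on both elements at once, so a scaling rule could compare the grade with an upper and a lower bound instead of looking at the CPU alone.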
In this section, we describe the ERP algorithm, which scales the resources at different granularities according to the users’ demand.
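The idea of scaling at different granularities can be sketched as a small decision routine. This is only an illustration under stated assumptions, not the ERP algorithm itself: the PhysicalMachine record, the action names and the preference for VM-level scaling before PM-level scaling are hypothetical choices made for the example.

```python
from dataclasses import dataclass


@dataclass
class PhysicalMachine:
    # Hypothetical record describing one host.
    name: str
    capacity_vms: int
    running_vms: int
    powered_on: bool = True


def scale_decision(score, upper, lower, pms):
    """Pick a scaling action from a performance score and two thresholds."""
    if score > upper:
        # VM-level: start a VM on a running host that still has room.
        for pm in pms:
            if pm.powered_on and pm.running_vms < pm.capacity_vms:
                return ("vm_level_scale_up", pm.name)
        # PM-level: no running host has room, so power on a spare host.
        for pm in pms:
            if not pm.powered_on:
                return ("pm_level_scale_up", pm.name)
        return ("no_capacity", None)
    if score < lower:
        # Shut down a running host that no longer carries any VM.
        for pm in pms:
            if pm.powered_on and pm.running_vms == 0:
                return ("shut_down", pm.name)
        # Otherwise release a VM from the least loaded running host.
        running = [pm for pm in pms if pm.powered_on]
        if running:
            target = min(running, key=lambda pm: pm.running_vms)
            return ("vm_level_scale_down", target.name)
    return ("hold", None)


pms = [
    PhysicalMachine("pm1", capacity_vms=2, running_vms=2),
    PhysicalMachine("pm2", capacity_vms=2, running_vms=1),
    PhysicalMachine("pm3", capacity_vms=2, running_vms=0, powered_on=False),
]
action = scale_decision(0.9, upper=0.8, lower=0.3, pms=pms)
```

In this sketch the cheaper VM-level action is tried first, and a spare physical machine is powered on only when every running host is full.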
In this section, we implement the elastic resource allocation strategy based on the performance criterion. The results indicate that the proposed approach is appropriate for meeting the demand under different kinds of workloads. In addition, the approach aims both to reduce the renting cost and to improve the utilization.
Traditional elasticity is often used as a reactive method, implemented through rule-condition-action policies. However, combining it with prediction is a better strategy. In this paper, we present an elastic strategy that flexibly increases or decreases the resources according to a performance threshold. The ERP approach makes the following contributions. First, we present a performance threshold that depends on the CPU and the memory, with which we can flexibly scale the resources up or down. This solves the issue of deciding a suitable threshold on multiple elements. Second, we propose the SUS algorithm, which implements fine-grained scaling at the PM-level or the VM-level to increase the resources flexibly. This solves the issue of scaling elastically at different granularities so as to reduce the SLA violations and the response time. Third, in combination with the WMA prediction, we propose the SDS algorithm to scale down the servers and then shut down the spare machines to save energy. This solves the issue of effectively reducing the overheads. Finally, we evaluate the proposed ERP approach on simulated and real-world workloads. The results show that the ERP method improves the utilization, minimizes the renting cost, saves energy and gives a quicker response time.
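To make the role of prediction in the scale-down decision concrete, the following is a minimal sketch that assumes WMA denotes a weighted moving average with linearly increasing weights. The window size, the weights, the function names and the scale-down rule are assumptions made for the example and are not taken from the SDS algorithm.

```python
def wma_forecast(history, window=3):
    """Forecast the next value as a weighted moving average.

    The most recent observation receives the largest weight (1, 2, ..., n).
    """
    if not history:
        raise ValueError("history must not be empty")
    if window < 1:
        raise ValueError("window must be at least 1")
    recent = history[-window:]
    weights = range(1, len(recent) + 1)
    return sum(w * x for w, x in zip(weights, recent)) / sum(weights)


def should_scale_down(history, lower, window=3):
    # Hypothetical rule: release resources only if the forecast load
    # also stays below the lower threshold.
    return wma_forecast(history, window) < lower


# Hypothetical utilization history, in percent.
forecast = wma_forecast([10, 20, 30])
```

Checking the forecast rather than only the latest measurement avoids shutting down a machine during a short dip in load that is about to end.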