Introduction
Cloud computing is a prevailing computing paradigm that provides computer hardware and software resources as a service to users over the Internet [1]. Generally, cloud computing can be divided into Infrastructure as a Service (IaaS), Platform as a Service (PaaS) and Software as a Service (SaaS) according to the type of service [2]. Cloud computing technology can optimize computing resources and provide flexible scaling, so users can purchase large amounts of computing resources to meet complex business demands at low cost. However, the number and size of global data centers continue to grow to meet increasing business demands, and energy consumption therefore rises dramatically [3]. For example, the average power consumption of a data center is equivalent to that of 25,000 households [4]. The issue of energy consumption has received widespread attention: it not only causes high operating costs but also inflicts enormous damage on the ecological environment. In addition, cloud service providers must ensure that users receive reliable Quality of Service (QoS); a provider that violates the Service Level Agreement (SLA) specifying QoS objectives will be penalized [5]. Therefore, reducing energy consumption while keeping SLA violations low has become a major challenge for cloud service providers.
Virtualization is one of the key technologies of cloud computing, as it enables multiple virtual machines to share one physical machine [6]. If all the virtual machines are packed onto a few physical machines, the energy consumption of the data center can be significantly reduced [7]. Over the past few years, researchers have therefore focused on the Virtual Machine Placement (VMP) problem, which aims to ensure QoS and decrease energy consumption by building a reasonable mapping between virtual machines and physical machines. The VMP problem can be divided into static placement and dynamic placement [8]. Static placement refers to creating virtual machines on suitable physical machines, while dynamic placement is the process of migrating running virtual machines from one physical machine to another [9]. We focus on dynamic virtual machine placement, also called dynamic virtual machine consolidation, to reduce energy consumption. However, placing an excessive number of virtual machines on the same physical machine results in substantial SLA violations [10] and a poor user experience. Therefore, we must pay close attention to the QoS requirements of users during energy-aware virtual machine placement.
The two main concerns of virtual machine consolidation are determining the source and destination hosts of a migration and deciding how to migrate virtual machines from source hosts to destination hosts. Some existing studies rely on static or dynamic thresholds to determine the host state by comparing current utilization with the thresholds. Other studies use prediction techniques such as linear regression [11] and machine learning [12] to forecast the host state in the next scheduling cycle and thereby decide which hosts need migrations. Because the Extreme Learning Machine (ELM) is accurate, fast and generalizes well [13], we use ELM to predict host utilization. The VMP problem is NP-hard [14], [15]. Some researchers applied linear programming to virtual machine consolidation [16], [17]. Linear programming is simple and can obtain the exact optimal solution, but it scales poorly: when the problem size increases, the computation time grows greatly, so it is not suitable for large instances of this NP-hard problem. Other researchers proposed heuristic algorithms to obtain approximate solutions [18]–[20], but heuristic algorithms easily fall into local optima. Bio-inspired meta-heuristic algorithms, such as the Ant Colony System (ACS), the Genetic Algorithm (GA) and Particle Swarm Optimization (PSO), can avoid local optima and obtain high-quality approximate solutions in a reasonable time for NP-hard problems [21], [22]. However, some meta-heuristics such as GA and PSO were originally designed for continuous problems and require special encodings for combinatorial optimization. We therefore choose ACS, which is designed for discrete problems. Moreover, ACS is a swarm intelligence algorithm and can be parallelized to speed up the search.
Some researchers have used ACS [21], [23] to solve the virtual machine consolidation problem; however, the execution time is long because the search space is large, so the search space needs to be reduced further. To this end, we propose an algorithm based on ELM and ACS, called ELM_MPACS. First, the ELM algorithm is employed to predict underloaded and overloaded hosts. Then a multi-population ant colony system is applied to consolidate virtual machines. Finally, a local search strategy and a pheromone exchange rule are adopted to optimize the solution. The main contributions of this paper are as follows:
We present an ELM-based prediction algorithm. Multiple ELM models are trained to reduce the error caused by random initialization, and the one with the smallest validation error is chosen for prediction.
We propose a virtual machine consolidation algorithm with lower complexity based on ACS. We dynamically choose the destination host for each virtual machine, avoiding the overhead of pre-constructed migration tuples. Multiple populations construct their migration schemes concurrently, and each scheme is refined by the local search strategy. A pheromone exchange rule between populations is used to increase the pheromone of excellent combinations.
Different from the current work dealing with the virtual machines indiscriminately, we propose a new virtual machine migration strategy. The virtual machines on overloaded hosts are migrated to normal hosts, while the virtual machines on underloaded hosts are migrated to other underloaded hosts with higher utilization.
We conduct experiments on a real dataset in the CloudSim platform and compare ELM_MPACS with other algorithms. The experimental results show that our algorithm effectively reduces energy consumption, the number of migrations and SLA violations.
The remainder of the paper is organized as follows. Section 2 reviews and discusses related research on energy consumption in data centers. We detail the power model, the objective function and the ELM prediction algorithm in Section 3. In Section 4, we present our ELM_MPACS algorithm. We describe the experimental settings and analyze the results in Section 5. Section 6 concludes the paper.
Related Work
The first issue in virtual machine consolidation is deciding the source and destination hosts of a migration. Host load changes dynamically in the data center: low utilization wastes resources, while high utilization is prone to SLA violations. Therefore, determining the source hosts for migration is the first step. Double static thresholds are adopted in [2]: the hosts are divided into three types, namely overloaded, underloaded and normal hosts. If the utilization of a host exceeds the upper threshold, it is overloaded; below the lower threshold, it is underloaded. However, static thresholds cannot adapt well to the dynamic load changes in the data center. Beloglazov and Buyya [20] proposed dynamic thresholds based on historical utilization to detect overloaded hosts and migrated virtual machines from overloaded hosts to other hosts to implement consolidation. Yadav and Zhang [24] proposed an algorithm that adjusts the upper threshold dynamically by minimizing the residual, so it is not directly influenced by outliers. Yadav et al. [25] also proposed two adaptive methods based on robust regression to set a dynamic upper threshold: the first uses gradient descent to minimize the cost function and reach the global optimum, and the second builds on the idea that a host with the maximum correlation coefficient among its virtual machines is more likely to become overloaded. Zhou et al. [26] proposed an adaptive three-threshold method that uses K-Means clustering to further subdivide the hosts into four types, which adapts better to the dynamic load changes in the data center.
However, these studies only determine the overloaded and underloaded hosts by comparing the current utilization with thresholds and do not predict the host state in the next period. If we can shut down an underloaded host in advance or migrate the virtual machines on an overloaded host beforehand, energy waste and SLA violations in the data center can be reduced. Accurate prediction can prevent unnecessary virtual machine migrations, and existing prediction algorithms fall mainly into two categories: those based on linear regression [11], [27] and those based on machine learning, such as K-NNR [12] and ANN [28]. Linear regression can only capture linear features, while machine learning methods such as neural networks can model nonlinear behavior flexibly but consume much more time [29], [30]. Compared with traditional neural networks, the ELM [13], [31] randomly initializes the weights and biases between the input and hidden nodes, so it trains quickly, and its generalization ability is also outstanding. A data center contains a large number of physical machines that must be scheduled periodically, so an accurate and efficient prediction algorithm is required. In addition, host load changes dynamically, and migrations also change the load on the source and destination hosts. Linear prediction cannot capture these trends well, so we use an ELM-based algorithm to predict the host state in the next period.
Another critical issue is determining the migration method. Researchers have proposed many approaches to migrating virtual machines from source hosts to appropriate destination hosts. Because virtual machine consolidation is NP-hard, we divide these methods into two categories: non-meta-heuristic approaches, including linear programming and heuristic algorithms, and meta-heuristic algorithms.
Linear programming or integer programming [16], [17] is one of the earliest approaches to the VMP problem. It is simple and yields the optimal solution, but obtaining that solution in a reasonable time becomes challenging as the problem scale grows. First Fit (FF) and Best Fit (BF) are well-known heuristic algorithms. Anand et al. [18] compared Integer Linear Programming (ILP) and First Fit Decreasing (FFD) algorithms considering the energy consumption caused by virtualization and migration. Murtazaev and Oh [19] proposed the SERCON method based on FF and BF, which minimizes the number of active hosts and migrations. However, most of the aforementioned papers consider only CPU or memory and rarely take bandwidth into account. Lago et al. [32] noticed the impact of network bandwidth on performance and considered bandwidth resources in heterogeneous networks when scheduling virtual machines; by allocating bandwidth rationally, they shortened migration time and ultimately reduced energy consumption. Zhu et al. [33] considered three resources, namely CPU, memory and bandwidth, and designed separate algorithms for virtual machine allocation, scheduling and optimization to minimize energy consumption. These two papers considered bandwidth and established more accurate and realistic models, which reduced SLA violations and energy consumption in the data center. Traditional heuristic algorithms easily fall into local optima on NP-hard problems, whereas meta-heuristic algorithms have clear advantages for such problems; hence, many meta-heuristics have been applied to virtual machine consolidation.
Li et al. [34] considered upper and lower thresholds for CPU and hard disk resources and applied an improved PSO to virtual machine consolidation to avoid local optima. However, that work did not employ a prediction method and might therefore cause unnecessary migrations. Chou et al. [35] also put forward a PSO-based resource allocation strategy and employed the least-squares method to predict resource utilization in the next period. GA has also been applied to reduce energy consumption in the data center [36], [37]. Unlike [36], which used the intelligent algorithm for resource allocation, [37] employed GA as a prediction algorithm to forecast the state of the physical machine in the next period. A new meta-heuristic named salp swarm optimization, which imitates the behavior of salp swarms, was introduced in [38]; however, the authors used real-number encoding, and converting real numbers to integers can cause a loss of accuracy. Li et al. [39] proposed a virtual machine consolidation method based on discrete Differential Evolution (DE), but they considered only energy consumption and host overloading risk and ignored the number of migrations, which is important in real data centers.
However, algorithms such as GA, PSO and DE were originally designed for continuous optimization, so discretization is required for combinatorial problems, which may cause a loss of accuracy. Ant colony optimization, a meta-heuristic designed for discrete optimization problems that needs no special encoding [40], has also been applied to virtual machine consolidation. Farahnakian et al. [23] proposed an ACS-based consolidation approach aimed at maximizing the number of dormant physical machines and minimizing the number of migrations; each ant traverses all possible tuples of the form (source host, virtual machine, destination host) to build an approximately optimal solution. Aryania et al. [21] extended [23] by taking the number of dormant physical machines into account and treating the memory size transferred during migration as an essential factor. Ashraf and Porres [41] also improved [23] by assigning different priorities to maximizing the number of dormant hosts and minimizing the number of migrations; two independent populations optimize these different goals. Moreover, they added neighborhood constraints, which require migrations to occur within a neighborhood, to reduce the search space.
However, although the scope of the source and destination hosts is narrowed by the constraints in [21], [23], the search space remains considerable. Since each ant traverses all predefined tuples to find a destination host for every virtual machine, a substantial number of tuples are constructed when there are many underloaded or overloaded hosts in the data center, which greatly enlarges the search space; moreover, traversing tuples is more expensive than traversing only the candidate destination hosts. In [41], the neighborhood constraints narrow the search space but may prevent the algorithm from reaching the global optimum. To narrow the search space, reduce the complexity and still allow the global optimum to be reached, we design a new ACS algorithm in which each ant dynamically traverses the possible destination hosts for every virtual machine instead of traversing all pre-constructed tuples or adding neighborhood constraints.
Problem Formulation
Some preparation is needed before consolidating virtual machines. A power model of the physical machine is established first, which is the basis for evaluating power consumption in the data center. We then formulate virtual machine consolidation as a multi-objective function of the number of migrations and the energy consumption. Finally, the overloaded and underloaded hosts whose virtual machines should be migrated must be identified, so a prediction module that classifies the hosts is also necessary. We detail these three sub-modules in this section.
A. Power Model
The energy consumption of a physical machine is composed of the CPU, memory, hard disk and network communication components, among which the CPU is the main energy consumer [2], [23], [42]. Therefore, we can use the energy consumption of the CPU to estimate that of the whole system. To calculate the current power consumption accurately, a suitable power model is needed. Because we use CPU utilization to estimate system power, we need to study the relationship between CPU utilization and the power consumption of the physical machine. Existing work [43], [44] indicates that this relationship is strongly linear, so we establish the following linear power model:\begin{equation*} P_{j}=P_{j}^{min}+u\left(P_{j}^{max}-P_{j}^{min}\right), \tag{1}\end{equation*}
where $u$ is the CPU utilization of host $j$, and $P_{j}^{min}$ and $P_{j}^{max}$ are its idle and peak power, respectively. Letting $k=P_{j}^{min}/P_{j}^{max}$ denote the fraction of peak power consumed by an idle host and substituting it into (1) gives \begin{equation*} P_{j}=P_{j}^{max}u+P_{j}^{max}k(1-u). \tag{2}\end{equation*}
From equation (1), we can conclude that even when the CPU utilization is 0, an idle host still consumes considerable power. Moreover, the paper [2] pointed out that idle power accounts for about 70% of peak power. Therefore, if idle physical machines can be shut down in time, significant energy will be saved.
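As a concrete illustration, the following minimal Python sketch evaluates the power model of (1) and (2); the function and parameter names (host_power, idle_power, peak_power) are illustrative and not taken from our implementation.

def host_power(cpu_utilization, idle_power, peak_power):
    """Estimate host power (watts) from CPU utilization in [0, 1] using (2)."""
    k = idle_power / peak_power              # fraction of peak power drawn when idle
    u = max(0.0, min(1.0, cpu_utilization))  # clamp utilization to [0, 1]
    return peak_power * u + peak_power * k * (1.0 - u)

# Example: a host drawing 175 W when idle and 250 W at peak, at 60% utilization.
print(host_power(0.6, idle_power=175.0, peak_power=250.0))  # about 220 W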
B. Objective Function
Reducing energy consumption is one of our primary motivations for virtual machine consolidation, while the number of migrations is another factor that cannot be ignored. Excessive migrations degrade the performance of the physical machines and thereby affect the user experience. If the virtual machines on underloaded physical machines are not migrated in time, hosts running at low load waste resources and increase energy consumption; if the virtual machines on overloaded physical machines are not removed in time, continuous SLA violations occur. Therefore, the number of migrations has a significant impact on both energy consumption and SLA violations. In this paper, we establish a multi-objective function that considers both energy consumption and the number of migrations. Whenever a migration plan is constructed, we calculate the ratio of the current power of each host to its maximum power and take the cumulative sum. The objective function is \begin{equation*} f=\max\left(\frac{1}{Mig}+\frac{1}{\sum_{j=1}^{M}\frac{P_{j}}{P_{j}^{max}}}\right), \tag{3}\end{equation*}
where $Mig$ is the number of migrations and $M$ is the number of active physical machines. Substituting the power model (2) into (3) yields \begin{equation*} f=\max\left(\frac{1}{Mig}+\frac{1}{\sum_{j=1}^{M}\left(k(1-u)+u\right)}\right). \tag{4}\end{equation*}
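The following minimal Python sketch computes the fitness of a candidate migration plan according to (4); it assumes, for simplicity, a single idle-power fraction k for all hosts (in practice k differs per host type), and the names num_migrations and host_utilizations are illustrative.

def fitness(num_migrations, host_utilizations, k):
    """Objective (4): higher is better (fewer migrations, lower normalized power)."""
    if num_migrations == 0:
        return float("inf")                      # an empty plan needs no migrations
    normalized_power = sum(k * (1.0 - u) + u     # P_j / P_j^max from (2)
                           for u in host_utilizations if u > 0.0)
    if normalized_power == 0.0:
        return float("inf")                      # no active hosts
    return 1.0 / num_migrations + 1.0 / normalized_power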
C. ELM Prediction Algorithm
ELM is a single-hidden-layer feed-forward neural network. Different from traditional neural networks, ELM randomly initializes the biases and weights between the input layer and the hidden layer [13], so it trains quickly and generalizes well. A data center contains a large number of physical machines that must be scheduled periodically, so the prediction algorithm must be both accurate and fast to train. Herein, we employ ELM to predict the utilization of each host in the next scheduling period. First, several ELM models are trained on one part of the historical utilization data, and another part is used to validate them. The ELM model with the least validation error is then used to predict the host utilization in the next period. In this paper, the number of output neurons is set to one, instead of the multiple output neurons used for classification problems.
We take an ELM with $K$ hidden-layer nodes as an example. Suppose there are $m$ training samples $(\mathbf{x}_{i}, y_{i})$, where $\mathbf{x}_{i}$ is the input vector and $y_{i}$ is the target output. The network output $z_{i}$ for sample $i$ is \begin{equation*}\sum_{j=1}^{K}\beta_{j}\,g\left(\mathbf{w}_{j}\cdot\mathbf{x}_{i}+b_{j}\right)=z_{i},\quad i=1,2,\ldots,m, \tag{5}\end{equation*} where $\mathbf{w}_{j}$ and $b_{j}$ are the randomly initialized input weights and bias of the $j$-th hidden node, $\beta_{j}$ is its output weight and $g(\cdot)$ is the activation function.
ELM can approximate the $m$ samples with zero error, i.e., \begin{equation*}\sum_{i=1}^{m}\left\|z_{i}-y_{i}\right\|=0, \tag{6}\end{equation*}
which means there exist $\beta_{j}$, $\mathbf{w}_{j}$ and $b_{j}$ such that \begin{equation*}\sum_{j=1}^{K} g\left(\mathbf{w}_{j}\cdot\mathbf{x}_{i}+b_{j}\right)\beta_{j}=y_{i},\quad i=1,2,\ldots,m. \tag{7}\end{equation*}
These $m$ equations can be written compactly as \begin{equation*}\mathbf{A}\boldsymbol{\beta}=\mathbf{Y}, \tag{8}\end{equation*} where
\begin{align*} \mathbf{A}=&\left[\begin{array}{ccc} g\left(\mathbf{w}_{1}\cdot\mathbf{x}_{1}+b_{1}\right) & \cdots & g\left(\mathbf{w}_{K}\cdot\mathbf{x}_{1}+b_{K}\right) \\ \vdots & \ddots & \vdots \\ g\left(\mathbf{w}_{1}\cdot\mathbf{x}_{m}+b_{1}\right) & \cdots & g\left(\mathbf{w}_{K}\cdot\mathbf{x}_{m}+b_{K}\right) \end{array}\right], \tag{9}\\ \boldsymbol{\beta}=&\left[\begin{array}{c}\beta_{1} \\ \vdots \\ \beta_{K}\end{array}\right],\quad \mathbf{Y}=\left[\begin{array}{c}y_{1} \\ \vdots \\ y_{m}\end{array}\right]. \tag{10}\end{align*}
The weights and biases between the input layer and the hidden layer are obtained by random initialization in ELM. Therefore, once the input data are given, only the output weight vector $\boldsymbol{\beta}$ from the hidden layer to the output layer is unknown, and it can be obtained by solving the above equation. The inverse of $\mathbf{A}$ does not exist when $\mathbf{A}$ is singular or non-square, but its Moore-Penrose pseudo-inverse $\mathbf{A}^{\dagger}$ can always be computed, which gives \begin{equation*} \boldsymbol{\beta}=\mathbf{A}^{\dagger}\mathbf{Y}. \tag{11}\end{equation*}
Based on extensive experiments, we use three ELMs in this paper, each with five hidden-layer nodes and the sine function as the activation function. Moreover, we take ten samples with three dimensions, of which 7/10 are used for training and 3/10 for validation. The structure of the ELM is shown in Fig. 1.
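A minimal numpy sketch of this ELM prediction module is given below; it follows (8)-(11) with five hidden nodes and a sine activation, trains several models and keeps the one with the lowest validation error. The class and function names (SimpleELM, best_elm) are illustrative, not the names used in our implementation.

import numpy as np

class SimpleELM:
    def __init__(self, n_inputs, n_hidden=5, seed=None):
        rng = np.random.default_rng(seed)
        self.W = rng.uniform(-1.0, 1.0, size=(n_hidden, n_inputs))  # random input weights
        self.b = rng.uniform(-1.0, 1.0, size=n_hidden)              # random biases
        self.beta = None                                            # output weights

    def _hidden(self, X):
        return np.sin(X @ self.W.T + self.b)               # hidden-layer output matrix A

    def fit(self, X, y):
        self.beta = np.linalg.pinv(self._hidden(X)) @ y    # beta = A^+ Y, as in (11)
        return self

    def predict(self, X):
        return self._hidden(X) @ self.beta

def best_elm(X_train, y_train, X_val, y_val, n_models=3):
    """Train several ELMs and keep the one with the lowest validation error."""
    models = [SimpleELM(X_train.shape[1]).fit(X_train, y_train) for _ in range(n_models)]
    return min(models, key=lambda m: np.mean((m.predict(X_val) - y_val) ** 2))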
We use the ELM to detect overloaded and underloaded hosts, as shown in Algorithm 1. The paper [23] set the thresholds to 0.5 and 1.0, and the paper [11] set them to 0.1 and 0.9. A large upper threshold leads to frequent overloading and severe SLA violations, while a small upper threshold wastes resources. If the lower threshold is too large, a large number of virtual machines are migrated; if it is too small, many hosts with low usage are not shut down in time. Through experiments we set the upper and lower thresholds to 0.8 and 0.3, respectively, which gives a good trade-off between energy consumption and SLA violations. We execute the scheduling algorithm every five minutes. During the first ten periods, the current utilization of the host is used to determine its state (line 1-3). Once enough data have been collected, we use the ELM to predict the host utilization (line 5). If the utilization is greater than 0 and does not exceed 0.3, the host is considered underloaded (line 7-8); if it is greater than 0.3 and does not exceed 0.8, the host is normal (line 9-10); if it exceeds 0.8, the host is considered overloaded (line 11-12).
Algorithm 1 Hosts State Detection
Input: hostList
Output: underloadedHostList, normalHostList, overloadedHostList
for host in hostList do
if utilizationHistory.size ≤ 10 then
utilization = host.getUsedMips/host.getTotalMips
else
utilization = ELM(host)
end if
if utilization > 0 and utilization ≤ 0.3 then
underloadedHostList.add(host)
else if utilization > 0.3 and utilization ≤ 0.8 then
normalHostList.add(host)
else if utilization > 0.8 then
overloadedHostList.add(host)
end if
end for
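For reference, a hedged Python sketch of Algorithm 1 is shown below; the host attributes and the predict_utilization helper are illustrative placeholders standing in for the simulator objects and the ELM predictor.

LOWER_THRESHOLD, UPPER_THRESHOLD = 0.3, 0.8

def detect_host_states(host_list, predict_utilization):
    underloaded, normal, overloaded = [], [], []
    for host in host_list:
        if len(host.utilization_history) <= 10:
            utilization = host.used_mips / host.total_mips   # measured utilization
        else:
            utilization = predict_utilization(host)          # ELM forecast for the next period
        if 0.0 < utilization <= LOWER_THRESHOLD:
            underloaded.append(host)
        elif LOWER_THRESHOLD < utilization <= UPPER_THRESHOLD:
            normal.append(host)
        elif utilization > UPPER_THRESHOLD:
            overloaded.append(host)
    return underloaded, normal, overloaded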
Multi-Population ACS Based on ELM (ELM_MPACS)
Based on the preliminary work in Section 3, we propose a multi-population ant colony system algorithm based on ELM (ELM_MPACS). The general idea is as follows: we first determine the underloaded, normal and overloaded hosts with the ELM prediction model, and then apply the multi-population ACS to assign a destination host to each virtual machine to be migrated. Each population's scheme is evaluated with the objective function (4), and the best solution is finally selected. In this section, we first introduce the individual components of the proposed consolidation algorithm, then present the complete ELM_MPACS algorithm, and finally analyze its complexity. Fig. 2 shows the relationship between these components.
A. Definition of Pheromone and Heuristic Information
In the real world, pheromone is a chemical substance through which ants communicate; ants find food sources by sensing the pheromone deposited by other ants [45]. In virtual machine consolidation, the pheromone on a combination of a virtual machine and a physical machine represents the ants' preference for that combination: ants are more likely to choose combinations with higher pheromone values, and the more pheromone a combination accumulates, the more often it has been selected in previous iterations, indicating that it is a good choice. The pheromone of every combination is initialized as \begin{equation*} \tau_{0}=\frac{1}{N}. \tag{12}\end{equation*}
In addition to the pheromone, heuristic information is another essential factor in ACS. The heuristic information reflects how desirable it is to place a given virtual machine on a candidate destination host, and it is defined differently for virtual machines coming from overloaded and underloaded source hosts.
The heuristic for the overloaded hosts is defined as \begin{equation*} \eta_{i,j}=\begin{cases} 1-\dfrac{PU_{j}+VR_{i}}{PC_{j}}, & \text{if } PU_{j}+VR_{i}\leq PC_{j} \\ 0, & \text{otherwise}, \end{cases} \tag{13}\end{equation*}
and the heuristic for the underloaded hosts is defined as \begin{equation*} \eta_{i,j}=\begin{cases} \dfrac{PU_{j}+VR_{i}}{PC_{j}}, & \text{if } PU_{j}+VR_{i}\leq PC_{j} \\ 0, & \text{otherwise}. \end{cases} \tag{14}\end{equation*}
In the above formulas, $PU_{j}$ is the resource currently used on physical machine $j$, $VR_{i}$ is the resource requested by virtual machine $i$, and $PC_{j}$ is the capacity of physical machine $j$. For a virtual machine from an overloaded source host, (13) favors destination hosts that remain lightly loaded after the migration, whereas for a virtual machine from an underloaded source host, (14) favors destination hosts whose utilization after the migration is higher.
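The two heuristics can be sketched in Python as follows; the arguments used, requested and capacity stand for PU_j, VR_i and PC_j and are illustrative names.

def heuristic_overloaded_source(used, requested, capacity):
    """Favor destination hosts that stay lightly loaded after the migration, as in (13)."""
    if used + requested > capacity:
        return 0.0
    return 1.0 - (used + requested) / capacity

def heuristic_underloaded_source(used, requested, capacity):
    """Favor destination hosts that end up with higher utilization, as in (14)."""
    if used + requested > capacity:
        return 0.0
    return (used + requested) / capacity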
B. Pseudo-Random-Proportional Rule and Pheromone Updating Rules
Each ant prefers the combination with the largest product of pheromone and heuristic information. However, to avoid falling into a local optimum, ants choose combinations according to the pseudo-random-proportional rule, which can be expressed as \begin{equation*} r=\begin{cases} \arg\max_{PM_{j}\in\Theta}\left\{\left[\tau_{i,j}\right]^{\alpha}\cdot\left[\eta_{i,j}\right]^{\beta}\right\}, & \text{if } q\leq q_{0} \\ R, & \text{otherwise}, \end{cases} \tag{15}\end{equation*} where $q$ is a random number uniformly distributed in $[0,1]$, $q_{0}$ is a parameter that balances exploitation and exploration, and $R$ is a destination host selected according to the probability distribution
\begin{equation*} p_{i,j}=\frac{\left[\tau_{i,j}\right]^{\alpha}\cdot\left[\eta_{i,j}\right]^{\beta}}{\sum_{PM_{j}\in\Theta}\left[\tau_{i,j}\right]^{\alpha}\cdot\left[\eta_{i,j}\right]^{\beta}}, \tag{16}\end{equation*} where $\Theta$ is the set of candidate destination hosts and $\alpha$ and $\beta$ weight the pheromone and the heuristic information, respectively.
We select the destination host for each virtual machine according to the pseudo-random-proportional rule. During the process of migrating virtual machines on underloaded hosts, a migration is accepted only if the utilization of the candidate destination host is higher than that of the source host, as described in Section 4-D.
After a destination host has been selected for the current virtual machine according to the pseudo-random-proportional rule, the pheromone of the chosen combination ($VM_{i}$, $PM_{j}$) is updated locally as \begin{equation*} \tau_{i,j}=(1-\rho)\cdot\tau_{i,j}+\rho\cdot\tau_{0}, \tag{17}\end{equation*} where $\rho$ is the pheromone evaporation rate.
After all the ants in a population have built their plans, the best one of each population is selected according to (4). The fitness values of the best schemes of all populations are then compared, and the global-best scheme $S^{+}$ is determined. The pheromone of every combination contained in $S^{+}$ is updated globally as \begin{equation*} \tau_{i,j}=(1-\rho)\cdot\tau_{i,j}+\rho\cdot f(S^{+}), \tag{18}\end{equation*} where $f(S^{+})$ is the fitness of $S^{+}$ computed by (4).
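A hedged Python sketch of the selection and pheromone-update rules in (15)-(18) is given below. The pheromone table is assumed to be a dictionary keyed by (vm, host) pairs, candidates is assumed to be a non-empty list of feasible destination hosts, and heuristic is one of the functions sketched above; all of these names are illustrative.

import random

def choose_destination(vm, candidates, pheromone, heuristic, alpha, beta, q0, tau0, rho):
    """Pick a destination host for vm with the pseudo-random-proportional rule."""
    scores = {h: (pheromone.get((vm, h), tau0) ** alpha) * (heuristic(vm, h) ** beta)
              for h in candidates}
    if random.random() <= q0:                      # exploitation: best combination, (15)
        chosen = max(scores, key=scores.get)
    else:                                          # exploration: roulette wheel, (16)
        total = sum(scores.values()) or 1.0
        r, acc, chosen = random.random() * total, 0.0, candidates[-1]
        for h, s in scores.items():
            acc += s
            if acc >= r:
                chosen = h
                break
    tau = pheromone.get((vm, chosen), tau0)        # local pheromone update, (17)
    pheromone[(vm, chosen)] = (1.0 - rho) * tau + rho * tau0
    return chosen

def global_update(best_scheme, best_fitness, pheromone, tau0, rho):
    """Reinforce the combinations of the global-best scheme, (18)."""
    for vm, host in best_scheme:
        tau = pheromone.get((vm, host), tau0)
        pheromone[(vm, host)] = (1.0 - rho) * tau + rho * best_fitness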
C. Local Search Strategy and Pheromone Exchange Rule
Although each population has built its own solution, there is no guarantee that every active host in the solution is load balanced. If a heavily loaded host exchanges some virtual machines with a lightly loaded host, overloading can be avoided. We therefore propose a local search strategy to achieve load balancing and reduce SLA violations. The basic idea is that after each population has built its migration plan, some exchange operations are performed within the respective schemes. The process is shown in Algorithm 2: until the iteration limit is reached, we randomly select two combinations ($VM_{a}$, $PM_{a}$) and ($VM_{b}$, $PM_{b}$) from the scheme, tentatively exchange their destination hosts, and keep the exchange only if the utilization difference between the two hosts decreases.
Algorithm 2 LocalSearch
for (i = 1; i < N*N; i++) do
randomly choose two combinations (VM_a, PM_a) and (VM_b, PM_b) from the scheme
compute utilization difference1 between hosts PM_a and PM_b
exchange the destination hosts of VM_a and VM_b
compute utilization difference2 between hosts PM_a and PM_b
if difference2 < difference1 then
keep the exchanged combinations (VM_a, PM_b) and (VM_b, PM_a)
else
restore the original combinations (VM_a, PM_a) and (VM_b, PM_b)
end if
end for
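The exchange step of Algorithm 2 can be sketched in Python as follows; the scheme is assumed to be a list of [vm, destination_host] pairs, and projected_utilization(host, scheme) is an assumed helper returning the host utilization implied by the current scheme.

import random

def local_search(scheme, projected_utilization, iterations):
    """Swap destination hosts of two random entries when it narrows the utilization gap."""
    if len(scheme) < 2:
        return scheme
    for _ in range(iterations):                    # roughly N*N iterations, as in Algorithm 2
        a, b = random.sample(range(len(scheme)), 2)
        (vm_a, host_a), (vm_b, host_b) = scheme[a], scheme[b]
        before = abs(projected_utilization(host_a, scheme)
                     - projected_utilization(host_b, scheme))
        scheme[a], scheme[b] = [vm_a, host_b], [vm_b, host_a]   # tentative exchange
        after = abs(projected_utilization(host_a, scheme)
                    - projected_utilization(host_b, scheme))
        if after >= before:                                     # no improvement: revert
            scheme[a], scheme[b] = [vm_a, host_a], [vm_b, host_b]
    return scheme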
Local search avoids SLA violations to a certain degree. However, it does not increase the diversity of solutions, and the populations introduced so far operate independently without cooperating, so they cannot learn useful information from each other. Therefore, we design a pheromone exchange rule that allows one population to learn the excellent combinations found by another and to retain these combinations as much as possible in future iterations. The pheromone exchange rule is shown in Algorithm 3: the combinations of the best schemes of two populations are compared, and if a combination ($VM_{i}$, $PM_{j}$) appears in both schemes, the pheromone values of its two copies are summed, so that the combination is more likely to be selected in later iterations.
The local search strategy achieves load balancing in the data center, and the pheromone exchange rule preserves excellent combinations, increasing their probability of being selected in subsequent iterations. Together they optimize the migration plan and avoid SLA violations.
D. Virtual Machine Consolidation Algorithm
We have described all the components of our algorithm in detail so far, and we will explain the complete process of our proposed ELM_MPACS algorithm below. Our ELM_MPACS algorithm uses different placement rules for virtual machines on the overloaded and underloaded hosts. The virtual machines on the overloaded hosts will be migrated to the normal hosts with more available resources, while the virtual machines on the underloaded hosts will be migrated to the underloaded hosts with higher usage.
Algorithm 4 shows the pseudo-code for virtual machine consolidation. The inputs are the host lists obtained by Algorithm 1 and the list of virtual machines to be migrated, which consists of all virtual machines on the underloaded hosts and the virtual machines selected from the overloaded hosts according to the minimum migration time principle. In each iteration, every population generates its scheme concurrently, and every ant in a population builds a complete scheme (line 1-3).
Algorithm 3 Exchange Pheromone
for combination1 in the best scheme of one population do
for combination2 in the best scheme of another population do
if combination1== combination2 then
temp = combination1.pheromone + combination2.pheromone;
combination1.pheromone = temp;
combination2.pheromone = temp;
end if
end for
end for
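A hedged Python sketch of this pheromone exchange is given below; the best schemes are assumed to be lists of [vm, destination_host] pairs and the pheromone tables dictionaries keyed by (vm, host), matching the earlier sketches.

def exchange_pheromone(pheromone_a, pheromone_b, best_scheme_a, best_scheme_b):
    """Sum the pheromone of combinations shared by the best schemes of two populations."""
    shared = set(map(tuple, best_scheme_a)) & set(map(tuple, best_scheme_b))
    for combination in shared:
        total = pheromone_a.get(combination, 0.0) + pheromone_b.get(combination, 0.0)
        pheromone_a[combination] = total
        pheromone_b[combination] = total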
Algorithm 4 Proposed ELM_MPACS Algorithm
Input: overloadedHostList, normalHostList, underloadedHostList, VMListToMigrate
Output: the best migration scheme S
for each iteration do
for each population do
for each of the T ants in the population do
while VMListToMigrate is not empty do
if VM.sourceHost is overloadedHost then
compute heuristic using (13)
else
compute heuristic using (14)
end if
generate a random number q uniformly in [0, 1]
if q ≤ q0 then
choose a destination host using (15)
else
choose a destination host using (16)
end if
if VM.sourceHost is overloadedHost then
update resources
if utilization of destination host < utilization of source host then
add (VM, destination host) to the ant's scheme
update local pheromone using (17)
else
restore resources
end if
else if utilization of sourceHost < utilization of destination host then
add (VM, destination host) to the ant's scheme
update local pheromone using (17)
update resources
end if
end while
compute the score of the ant's scheme using (4)
end for
choose the best scheme among the T ants' schemes
local search using Algorithm 2
end for
exchange pheromone using Algorithm 3
update global pheromone using (18)
end for
To prevent a migration from overloading the destination host, we require that the utilization of the destination host after the migration does not exceed that of the source host (line 16-23). By contrast, a virtual machine from an underloaded host can only be migrated to a destination host whose usage is higher than that of the source host (line 24-28). With this constraint, the underloaded hosts with lower usage can be shut down more quickly, while those with higher usage gradually return to the normal state. This avoids unnecessary migrations and speeds up the consolidation process.
As shown in Algorithm 4, the time complexity of the algorithm is determined by the nested loops over the iterations, the populations, the ants per population, the virtual machines to be migrated and the candidate destination hosts evaluated for each virtual machine, so the running time grows linearly with each of these factors.
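To make the overall structure of Algorithm 4 concrete, the following hedged Python sketch strings together the helpers sketched earlier (choose_destination, local_search, exchange_pheromone, global_update). It assumes candidate_hosts(vm) returns the feasible destination hosts for a virtual machine, fitness_of(scheme) evaluates (4) for a scheme, projected_utilization(host, scheme) is the helper used by local_search, and params carries heuristic, alpha, beta, q0, tau0 and rho; all of these names are illustrative, and the per-migration utilization acceptance checks of lines 16-28 are omitted for brevity.

def elm_mpacs(vms_to_migrate, candidate_hosts, fitness_of, projected_utilization,
              populations, ants, iterations, params):
    best_scheme, best_fitness = None, float("-inf")
    pheromones = [dict() for _ in range(populations)]    # one pheromone table per population
    for _ in range(iterations):
        population_best = []
        for p in range(populations):
            schemes = []
            for _ in range(ants):                        # every ant builds a full plan
                scheme = [[vm, choose_destination(vm, candidate_hosts(vm),
                                                  pheromones[p], **params)]
                          for vm in vms_to_migrate]
                schemes.append(scheme)
            best = max(schemes, key=fitness_of)          # best ant of this population
            local_search(best, projected_utilization, len(best) ** 2)
            population_best.append(best)
        for a in range(populations):                     # cross-population cooperation
            for b in range(a + 1, populations):
                exchange_pheromone(pheromones[a], pheromones[b],
                                   population_best[a], population_best[b])
        iteration_best = max(population_best, key=fitness_of)
        if fitness_of(iteration_best) > best_fitness:
            best_scheme, best_fitness = iteration_best, fitness_of(iteration_best)
        for p in range(populations):                     # global pheromone reinforcement, (18)
            global_update(best_scheme, best_fitness, pheromones[p],
                          params["tau0"], params["rho"])
    return best_scheme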
Experimental Setup and Results Analysis
A. Experiment Environment
We demonstrate the effectiveness of our proposed ELM_MPACS algorithm in the CloudSim toolkit [46]. CloudSim is a scalable cloud simulation platform that supports energy-aware modeling of computing resources and custom virtual machine consolidation methods. Our simulation uses two specifications of dual-core physical machines, with 400 machines of each type, and four kinds of virtual machines of varying frequency. Table 1 and Table 2 show the specifications of the physical machines and virtual machines, including CPU frequency, bandwidth and memory.
The dataset we use comes from the CoMon system, which monitors the PlanetLab infrastructure and collects data from each of its nodes [47]. Table 3 shows the details: the dataset records the CPU utilization of about 1000 virtual machines on ten days in March and April 2011. To show that our consolidation algorithm is suitable for large-scale data centers, we conduct experiments on the data of 20110322, the day with the largest number of virtual machines among the ten days. In our experiments, the scheduling period is set to five minutes, so the algorithm is invoked 288 times in 24 hours. The parameter settings of our algorithm are shown in Table 4; they were obtained through extensive experiments.
B. Evaluation Metrics
The total Energy Consumption (EC) of the data center is an essential metric for evaluating an algorithm. A host consumes different amounts of power at different CPU utilization levels, and different host types consume different power at the same utilization. We use the power data from SPECpower, shown in Table 5, from which we can see that the power consumption of an idle host still exceeds 70% of its peak power. The Number of Migrations (NM) is the total number of migrations over all scheduling periods during the entire run. In real-world scenarios, live migrations can degrade the user experience, so it is necessary to reduce their number. NM is expressed as \begin{equation*} \mathrm{NM}=\sum_{k=1}^{T} Mig_{k}, \tag{19}\end{equation*} where $Mig_{k}$ is the number of migrations in the $k$-th scheduling period and $T$ is the total number of scheduling periods.
Performance Degradation due to Migration (PDM) measures the impact of migration on host performance. Following [20], the performance degradation of a virtual machine caused by migration is estimated as 10% of its CPU utilization. PDM is calculated as \begin{equation*} \mathrm{PDM}=\frac{1}{N}\sum_{i=1}^{N}\frac{C_{d}^{i}}{C_{r}^{i}}, \tag{20}\end{equation*} where $N$ is the number of virtual machines, $C_{d}^{i}$ is the estimated performance degradation of virtual machine $i$ caused by migrations, and $C_{r}^{i}$ is the total CPU capacity requested by virtual machine $i$.
PDM per migration measures the average PDM over all migrations and is calculated as \begin{equation*} \mathrm{PDM~per~migration}=\frac{\mathrm{PDM}}{\mathrm{NM}}. \tag{21}\end{equation*}
SLA violation Time per Active Host (SLATAH) is the average ratio of the overload time to the total active time over all hosts, which indicates the overload situation of the entire data center. If the total resources requested by the virtual machines exceed what the host can provide, the host is deemed overloaded. SLATAH is formulated as \begin{equation*} \mathrm{SLATAH}=\frac{1}{M}\sum_{j=1}^{M}\frac{T_{o}^{j}}{T_{a}^{j}}, \tag{22}\end{equation*} where $M$ is the number of hosts, $T_{o}^{j}$ is the time during which host $j$ is overloaded, and $T_{a}^{j}$ is the total time during which host $j$ is active.
SLA violations (SLAV) occur when cloud service providers cannot guarantee the quality of service specified in the SLA; they degrade the user experience and lead to penalties for the provider. The authors of [20] proposed a measure of SLA violation composed of SLATAH and PDM, which are independent of each other and equally important. The smaller the SLAV value, the fewer the SLA violations and the better the QoS. The formula is \begin{equation*} \mathrm{SLAV}=\mathrm{SLATAH}*\mathrm{PDM}. \tag{23}\end{equation*}
EC and SLAV only show the performance of different algorithms from two aspects, and cannot comprehensively evaluate the performance of different algorithms. The ESV can weigh the energy consumption and SLAV, which is an objective evaluation of the pros and cons of different algorithms. ESV is defined as \begin{equation*} \mathrm {ESV}=\mathrm {EC} * \mathrm {SLAV}. \tag{24}\end{equation*}
ESV is proportional to both factors, so an increase in either metric increases ESV, and a smaller ESV value indicates a better trade-off between energy consumption and SLAV.
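The evaluation metrics of (19)-(24) can be computed with the following hedged Python sketch; the per-VM degradation values and the per-host overload and active times are assumed to be collected by the simulator, and the argument names are illustrative.

def evaluate(migrations_per_period, degradation_per_vm, requested_per_vm,
             overload_time_per_host, active_time_per_host, energy):
    nm = sum(migrations_per_period)                               # (19)
    pdm = sum(d / r for d, r in zip(degradation_per_vm, requested_per_vm)) \
          / len(degradation_per_vm)                               # (20)
    slatah = sum(o / a for o, a in zip(overload_time_per_host, active_time_per_host)) \
             / len(overload_time_per_host)                        # (22)
    slav = slatah * pdm                                           # (23)
    esv = energy * slav                                           # (24)
    return {"NM": nm, "PDM": pdm, "PDM per migration": pdm / nm if nm else 0.0,
            "SLATAH": slatah, "SLAV": slav, "ESV": esv}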
C. Analysis of Results
Our proposed ELM_MPACS algorithm is compared with four benchmark algorithms in CloudSim on the dataset "20110322". The host overload detection methods of the four benchmarks are the Inter Quartile Range (IQR), the static threshold (THR), the Median Absolute Deviation (MAD) and Local Regression (LR), while the virtual machine selection algorithm is the Minimum Migration Time (MMT) policy [20]. The parameter attached to each name is a safety parameter that determines the degree of consolidation. We also compare ELM_MPACS with the recent MSE_MMT_4.0 algorithm [11] and ACS_VMC [23] to demonstrate the advancement of our algorithm: the former is a heuristic algorithm that uses the Mean Square Error (MSE) to correct a linear regression prediction, and the latter is based on ACS and employs linear regression. In addition, to prove the validity of our ELM prediction module, we compare it with ST_MPACS, a variant without ELM whose upper and lower thresholds are set to 0.8 and 0.3, respectively. We run 20 independent experiments and report the mean as the final result. Table 6 shows the experimental results, and Fig. 3 to Fig. 9 compare the algorithms in detail on each metric introduced in the previous section.
Fig. 3 shows that the two algorithms we propose have clear advantages in terms of energy consumption. Compared with IQR_MMT_1.5, THR_MMT_0.8, MAD_MMT_2.5, LR_MMT_1.2, MSE_MMT_4.0 and ACS_VMC, ELM_MPACS reduces energy consumption on "20110322" by 31.45%, 31.12%, 29.91%, 21.28%, 15.81% and 4.09%, respectively. Because energy consumption is a major factor in our multi-objective function, we select the scheme that minimizes energy consumption from multiple populations. IQR_MMT_1.5, THR_MMT_0.8, MAD_MMT_2.5 and LR_MMT_1.2 only set an upper threshold, while our algorithm employs double thresholds to handle the virtual machines on underloaded hosts, so underloaded hosts are shut down quickly to reduce energy consumption. In addition, ACS_VMC handles underloaded and overloaded hosts in a unified manner, so it cannot shut down underloaded hosts in time, resulting in wasted resources. We also design different rules for underloaded and overloaded hosts: a virtual machine is migrated from an underloaded host with lower utilization to an underloaded host with higher utilization, so that the former can be shut down quickly while the latter gradually recovers to a normal host. Therefore, our algorithm greatly reduces energy consumption compared with the others. The consolidation algorithm ultimately determines the energy consumption, while the prediction algorithm mainly affects the number of migrations and SLAV, so ELM_MPACS with ELM consumes almost the same energy as ST_MPACS without ELM. As Fig. 3 shows, ELM_MPACS and ST_MPACS have the lowest energy consumption among all the algorithms.
As shown in Fig. 4, we have fewer migrations than MSE_MMT_4.0 because the ELM prediction is more accurate than robust linear regression with error correction, which avoids unnecessary migrations: although robust linear regression has error correction, it cannot capture the nonlinear load changes in the data center, whereas the ELM predicts the host load more flexibly. The number of migrations of ELM_MPACS is also smaller than that of ACS_VMC, because ACS_VMC treats overloaded and underloaded hosts indiscriminately and traverses all pre-constructed tuples, which leads to many unnecessary migrations. We migrate the virtual machines on overloaded hosts to normal hosts and the virtual machines on underloaded hosts to other underloaded hosts, which greatly reduces unnecessary migrations. Because ST_MPACS does not use the prediction algorithm, it cannot accurately anticipate the host state in the next period, so some virtual machines on hosts that are currently overloaded or underloaded but will return to the normal state in the next period are migrated unnecessarily. Therefore, ELM_MPACS has fewer migrations than ST_MPACS.
Fig. 5 shows that ELM_MPACS performs better than the other algorithms on PDM: compared with IQR_MMT_1.5, THR_MMT_0.8, MAD_MMT_2.5 and LR_MMT_1.2, it reduces PDM by 92.67%, 93.28%, 92.70% and 94.29%, respectively, and it reduces PDM by 38.24% compared with MSE_MMT_4.0. PDM measures the impact of migration on host performance, so the number of migrations directly affects its value; the analysis of Fig. 4 shows that our algorithm significantly reduces the number of migrations, which greatly reduces the performance degradation caused by migration, so the PDM of ELM_MPACS is minimal. Fig. 6 shows the PDM per migration, which reflects the average migration cost; the values for ST_MPACS and ELM_MPACS are lower than those of the other algorithms. We migrate a virtual machine from one underloaded host to another underloaded host with higher utilization, and from an overloaded host to a normal host, which efficiently avoids fierce competition for resources, so the destination hosts are not easily overloaded after migration. Therefore, our consolidation methods have the lowest migration costs and perform better than the others.
Fig. 7 depicts the performance of each method on SLATAH. In our algorithm, a virtual machine on an overloaded host preferentially selects a normal host with more available resources, and the algorithm requires that the utilization of the destination host stays below that of the source host after the migration. After the migration, the source host returns from the overloaded state to the normal state, and because the destination host has enough available resources, it does not become overloaded. In addition, the local search re-optimizes the placement scheme, rebalances the load between hosts and avoids, as far as possible, the risk of overloading the destination host, so ELM_MPACS obtains a smaller SLATAH value. Our prediction algorithm also migrates some virtual machines from hosts that are about to become overloaded in advance, avoiding SLA violations. The ST_MPACS algorithm does not predict the host state, so some virtual machines are migrated to hosts that may become overloaded in the next period, which aggravates the overloading. Therefore, ELM_MPACS performs better on SLATAH than any other algorithm.
SLAV is the product of PDM and SLATAH, which jointly determine it. According to the previous analysis, ELM_MPACS has the smallest PDM and SLATAH values, so its SLAV is also the smallest; from Fig. 8, the SLAV of ELM_MPACS is only 3.16% of that of ACS_VMC. Our algorithm effectively reduces the number of migrations, identifies overloaded hosts and migrates their virtual machines in time, so it is indeed the best among these algorithms. ESV combines EC and SLAV and is therefore a more objective and comprehensive metric. From Fig. 9, ELM_MPACS is far better than IQR_MMT_1.5, THR_MMT_0.8, MAD_MMT_2.5 and LR_MMT_1.2 when both energy consumption and SLAV are considered. ACS_VMC focuses on reducing power consumption and MSE_MMT_4.0 on reducing SLAV, but our algorithm consumes less energy than ACS_VMC and obtains a smaller SLAV than MSE_MMT_4.0. Moreover, our ESV is 96.97% and 68.43% lower than that of ACS_VMC and the latest MSE_MMT_4.0, respectively.
The experimental results show that our proposed ELM_MPACS effectively reduces energy consumption, the number of migrations and SLAV in the data center. The precise prediction of the ELM avoids invalid migrations and reduces SLA violations; the local search strategy optimizes the migration plan and further reduces SLA violations; and the pheromone exchange between populations retains excellent combinations to some extent. Multiple populations construct schemes concurrently, allowing us to choose the best solution. The comparison with heuristic and meta-heuristic baselines confirms the advancement and effectiveness of our algorithm.
Conclusion and Future Work
In this paper, we have proposed a virtual machine consolidation algorithm based on ELM and ACS. First, we build multiple ELM models to predict the host state for the next period. We then use a multi-population ACS algorithm with local search to select the destination host for each virtual machine one by one, which reduces the search space. Moreover, we analyze the features of underloaded and overloaded hosts and design different heuristic information and migration rules for them. To demonstrate the advancement of ELM_MPACS, we compared it with seven other algorithms, IQR_MMT_1.5, THR_MMT_0.8, MAD_MMT_2.5, LR_MMT_1.2, MSE_MMT_4.0, ACS_VMC and ST_MPACS, on the CloudSim platform. The results show that the ELM prediction accurately forecasts the host state in the next scheduling cycle, thereby avoiding unnecessary migrations and effectively reducing SLA violations. The local search further optimizes the solution, balances the load between hosts and reduces SLA violations, while the pheromone exchange between populations accumulates pheromone on excellent combinations, increasing the likelihood that they are selected again. Selecting the best solution from multiple populations in each iteration increases diversity and ensures convergence. Therefore, our algorithm effectively reduces energy consumption, the number of migrations and SLA violations. In future work, we plan to apply our scheduling algorithm to real-world data centers to reduce energy consumption.