Multi-objective Task Scheduling in Cloud Environment using Decision Tree Algorithm

In recent years, cloud computing has developed into the foundation of a wide range of applications. It allows users to access a catalog of standardized services and respond to their business needs flexibly and adaptively in the event of unforeseen demands, paying solely for what they consume. The task scheduling problem is considered one of the most critical cloud computing challenges. It refers to how to reasonably order the application tasks provided by users and allocate them for execution on virtual machines. Furthermore, the quality of scheduling performance has a direct effect on customer satisfaction. The task scheduling problem in cloud computing must be described more accurately in order to improve scheduling performance. In this paper, a multi-objective task scheduling algorithm based on the decision tree is proposed. We introduce a new Task Scheduling-Decision Tree (TS-DT) algorithm for allocating and executing an application's tasks. To evaluate the performance of the proposed TS-DT algorithm, a comparative study was conducted against three existing algorithms: Heterogeneous Earliest Finish Time (HEFT), the Technique for Order of Preference by Similarity to Ideal Solution incorporating the Entropy Weight Method (TOPSIS-EWM), and Q-Learning combined with Heterogeneous Earliest Finish Time (QL-HEFT). Our results show that the proposed TS-DT algorithm outperforms the existing HEFT, TOPSIS-EWM, and QL-HEFT algorithms by reducing makespan by 5.21%, 2.54%, and 3.32%, respectively, improving resource utilization by 4.69%, 6.81%, and 8.27%, respectively, and improving load balancing by 33.36%, 19.69%, and 59.06%, respectively, on average.


I. INTRODUCTION
Cloud computing is a modern computing technology that employs virtualized infrastructure to provide secure and reliable services to end-users in a complex environment. Because it provides essential information technology (IT) services such as computing resources in the form of virtual machines (VMs), cloud computing has attracted considerable attention as a computing model [1,2].
Unfortunately, cloud computing faces challenges such as performance, resource management, and cost [3]. Task scheduling in cloud computing is the allocation of users' tasks to the available resources to optimize the execution time, enhance load balancing, and increase resource efficiency. Task scheduling depends on the existence of dependencies among the tasks. The problem of scheduling dependent tasks in a heterogeneous environment has drawn much attention from researchers in this area. The Directed Acyclic Graph (DAG) is the most common graph for representing the dependencies among an application's tasks, and dependent task scheduling is therefore also called DAG scheduling.
In DAG scheduling, the workflow is represented by a graph G = (T, E), where T = {t1, t2, ..., tn} is the set of tasks and E = {e1, e2, ..., em} is the set of edges. Each task ti ∈ T denotes an application task, and each edge e(i, j) ∈ E represents the communication cost between dependent tasks, where task ti must be executed before task tj. The structure of the DAG of some workflow types is shown in Figure 1 [7]. Scheduling and allocating an application's tasks are Non-deterministic Polynomial (NP)-Complete problems [8]. Therefore, optimization approaches are used to solve them by considering performance parameters such as makespan, load balance, resource usage, cost, and power consumption [9].
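To make the notation concrete, such a workflow DAG can be sketched as an edge map; the task names and communication costs below are illustrative, not taken from the paper's benchmark workflows:

```python
# A minimal sketch of a workflow DAG as {(src, dst): communication cost};
# the communication cost is paid only when src and dst run on different VMs.
def successors(edges, task):
    """Return the successors of `task` in the DAG."""
    return [dst for (src, dst) in edges if src == task]

def is_entry_task(edges, task):
    """An entry task has no predecessors, so it can be scheduled first."""
    return all(dst != task for (_, dst) in edges)

# Illustrative four-task workflow: t1 -> {t2, t3} -> t4
edges = {("t1", "t2"): 12, ("t1", "t3"): 9, ("t2", "t4"): 7, ("t3", "t4"): 11}
```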
Machine learning is currently being used in various fields such as speech recognition, data classification, and face recognition [10]. It has played a significant role in task scheduling and several other areas in computer science technology. Among the popular machine learning techniques used for developing and visualizing predictive models and algorithms is the decision tree.
In this paper, a Task Scheduling-Decision Tree (TS-DT) algorithm, a task scheduling algorithm based on the decision tree, is introduced. The performance of the proposed algorithm is evaluated using parameters such as makespan, load balance, resource utilization, and power consumption in a heterogeneous environment. The work in this paper makes the following significant contributions:
§ A new task scheduling algorithm based on the decision tree method for scientific workflows is proposed.
§ Extensive simulation tests on various performance metrics are used to evaluate the proposed algorithm.
The rest of this paper is organized as follows. Section 2 discusses the related work. Section 3 gives an overview of the existing algorithms. Section 4 presents the problem statement. Section 5 describes the proposed TS-DT algorithm in detail. The CloudSim simulator's configuration and the experimental environment are discussed in Section 6. The comparative study between the proposed algorithm and the existing HEFT, TOPSIS-EWM, and QL-HEFT algorithms is discussed in Section 7. Finally, the conclusion and future work are given in Section 8.

II. RELATED WORK
Task scheduling in distributed, parallel, and cloud computing environments has become an important research topic. Its purpose is to ensure an effective distribution of computing resources to provide high performance. For traditional distributed and parallel computing environments, many researchers have proposed a variety of scheduling strategies.
K. Naik et al. [11] have described a new hybrid multi-objective heuristic method, called NSGA-II & GSA, that combines the Non-dominated Sorting Genetic Algorithm-II (NSGA-II) and the Gravitational Search Algorithm (GSA) to assist with VM selection for application scheduling. NSGA-II can expand the search space through exploration, while GSA can exploit good solutions to discover the best solution and thus avoid getting trapped in local optima. This hybrid algorithm is designed to achieve the fastest response time and lowest cost for scheduling a large number of tasks with the minimum total energy consumption. Unfortunately, load balancing between VMs is not considered.
S. Pang and colleagues [12] have developed a hybrid scheduling algorithm based on the Estimation of Distribution Algorithm (EDA) and the Genetic Algorithm (GA). The algorithm initially generates some feasible solutions using EDA operations, then uses GA operations to generate new solutions based on the best solutions selected in the previous phase to expand the search range, and finally selects the best solution. The purpose of this technique is to reduce the task completion time and enhance load balancing. However, this work does not consider the dynamics and uncertainties of the cloud computing environment.
S.H.H. Madni et al. [13] have presented a novel Multi-objective Cuckoo Search Optimization (MOCSO) technique for dealing with the cloud computing resource scheduling problem. The goal of this technique is to address the multi-objective resource scheduling problem in an Infrastructure as a Service (IaaS) cloud computing environment by maximizing resource utilization. The lack of load balancing across VMs is considered a major flaw of this technique.
Y.Q. Han and Q. Li Jun [14] have addressed the flexible task scheduling problem in a cloud system by proposing a hybrid discrete Artificial Bee Colony (ABC) algorithm. The proposed algorithm includes three categories of artificial bees, as in the classical ABC algorithm: employed, onlooker, and scout bees. The proposed ABC algorithm reduces completion time and improves machine load balancing. One of this algorithm's major flaws is its poor resource utilization.
Avinash Kaur et al. [15] have proposed a new workflow scheduling scheme, called DQ-HEFT, that integrates the Deep Q-learning mechanism with the HEFT algorithm, which is among the most common heuristic scheduling approaches in the literature. The algorithm consists of two phases: obtaining the tasks' execution order at each stage and allocating each task to a processor, and it can handle a significantly higher volume of data. It is worth noting that the DQ-HEFT algorithm can achieve better makespan and speed. However, its main drawback is that excessive value updates of the Q-table are performed in large-scale task optimization problems, which slows down the scheduling process.
A. Al-maamari and F. Omara [16] have proposed a task scheduling algorithm for the cloud computing environment. The algorithm is a fusion of the Cuckoo Search (CS) algorithm and the Dynamic PSO (DAPSO) algorithm, which has been modified to increase population diversity. According to this algorithm, tasks are assigned to virtual machines (VMs) to minimize makespan and maximize resource utilization. However, load balancing between VMs is not considered.
Arabi Keshk et al. [17] have introduced Modified Ant Colony Optimization for Load Balancing (MACOLB) algorithm to allocate the incoming jobs to the virtual machines (VMs). The tasks are allocated to the VMs based on the processing powers (i.e., tasks are allocated in descending order, starting from the most powerful VM, and so on) by considering balancing VMs' loads. The MACOLB is used to find the proper resource allocation for batch tasks in the cloud system, minimize the makespan, and achieve better system load balance. However, the resource utilization between VMs is considered a crucial flaw of this algorithm.
Jae-Min Yu and colleagues [18] have proposed a decision tree-based method for scheduling flexible job shops with multiple process plans. Two decision tree-based scheduling mechanisms were created for static and dynamic flexible job shops. In the static case, all jobs are provided in advance, and the decision tree is used to select a priority dispatching rule to process all of them. In the dynamic scenario, jobs arrive over time, and the decision tree, which is modified regularly, is used to select a priority rule in real time according to a rescheduling strategy. The objectives considered in this method are makespan, total flow time, and total delay, but load balancing between VMs was not considered.
Liu Yuan, Dong Yinggang, et al. [19] have proposed a static HEFT task scheduling algorithm, called ST-HEFT. The algorithm consists of two key steps: task sorting and task mapping. In the sorting step, tasks are sorted based on the maximum communication cost between them and their direct VMs. In the mapping step, each task is assigned to the VM that provides the earliest execution time. The proposed algorithm has achieved better performance by reducing the development threshold for parallel computing programs and increasing the utilization of various computing devices' capabilities in a heterogeneous computing environment. On the other hand, load balancing and slack time are the critical weaknesses of this algorithm.
Sambit Kumar Mishra et al. [20] have suggested an Adaptive Task Allocation (ATAA) algorithm for the cloud environment. This algorithm uses the Expected Time to Completion (ETC) matrix to address the heterogeneous environment problem, including completing all tasks on VMs. The authors use a technique that reduces energy consumption and minimizes the makespan of the system. The major weakness of this algorithm is load balancing between VMs.
Atyaf Dhari et al. [21] have proposed a load balancing decision algorithm for the cloud computing environment, called LBDA, to enhance load balancing among virtual machines and reduce makespan. The algorithm consists of three steps. The first step calculates each VM's capacity and load (under-full VM, balanced VM, high-balance VM, and overloaded VM). In the second step, the time required to execute the task is determined for each VM. In the third step, a decision is made to distribute tasks based on the VM state and the task time. Unfortunately, resource utilization between VMs is a critical weakness of this algorithm.
Zeshan Iqbal et al. [22] have proposed an algorithm called Parental Prioritization Earliest Finish Time (PPEFT) for a heterogeneous distributed environment. The algorithm consists of two phases: task prioritization and processor assignment. First, in the task prioritization phase, the tasks are scheduled in the Parental Priority Queue (PPQ) based on descending rank and parental priority. Then, the processor assignment phase assigns each task in the PPQ to a processor that guarantees fast execution (i.e., minimum computation cost). Experimentally, the PPEFT scheduling algorithm performs substantially better than other algorithms with respect to cost and schedule makespan. Unfortunately, load balancing between VMs is a critical weakness of this algorithm.
S.C. Sharma et al. [23] have modified the HEFT algorithm to distribute the workload between processors effectively and reduce completion time. Their work analyzes various task scheduling algorithms, parameters, tools, improvements, and limitations. The modified algorithm reduces makespan and improves load balancing compared to the existing HEFT and Critical Path on a Processor (CPOP) [24] algorithms. The critical weakness of this algorithm is the slack time.
In this paper, the decision tree has been used to optimize the multi-objective task scheduling problem by minimizing makespan, satisfying load balancing among virtual machines, and maximizing resource utilization.

III. Overview
In this paper, a task scheduling algorithm based on the decision tree is proposed for a heterogeneous cloud environment. Also, the performance of the proposed algorithm is evaluated through a comparative study with the HEFT, TOPSIS-EWM, and QL-HEFT algorithms, which are widely used for task scheduling in the cloud computing environment. Based on the above-mentioned goals, the principles of these existing algorithms are discussed in the following sections.

A. The Heterogeneous Earliest Finish Time Algorithm
In the HEFT algorithm [25], the tasks presented in the DAG are scheduled on a set of heterogeneous machines. This algorithm consists of two phases: the ranking phase and the processor selection phase. The goal of the ranking phase is to assign a priority to each task. The processor selection phase concerns allocating each task to a suitable processor. This phase is repeated until all tasks have been scheduled on the available processors [26]. In the ranking phase, the upward rank function is used to define the priority of each task, which is defined recursively using Equation (1) [27]:

Rank(ti) = Wi + max over tj ∈ succ(ti) of (ci,j + Rank(tj))    (1)

where Wi is the average computation cost of the task ti, ci,j is the average communication cost of the edge from ti to tj, and succ(ti) is the set of successors of the task ti. It is important to note that Rank(ti) is determined by first computing the Rank(tj) of all its children.
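The upward rank recursion can be sketched as follows, assuming the standard HEFT form in which an exit task's rank is simply its average computation cost; the dictionaries in the usage are illustrative:

```python
def upward_rank(task, w, c, succ):
    """HEFT upward rank: Rank(ti) = Wi + max over tj in succ(ti)
    of (ci,j + Rank(tj)); exit tasks have Rank(ti) = Wi.
    w: average computation costs, c: average communication costs,
    succ: successor lists of the DAG."""
    children = succ.get(task, [])
    if not children:
        return w[task]
    return w[task] + max(c[(task, tj)] + upward_rank(tj, w, c, succ)
                         for tj in children)
```

For example, with w = {t1: 10, t2: 5, t3: 8}, c(t1, t2) = 4, and c(t1, t3) = 2, the rank of t1 is 10 + max(4 + 5, 2 + 8) = 20.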
In the processor selection phase, tasks are sorted in descending order according to their rank values. Then, each task is assigned to the processor that provides the shortest finish time for it. However, while the HEFT algorithm always chooses the processor with the earliest finish time when allocating tasks, it does not consider load balancing among the processors [28].
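The processor selection phase can be sketched as below; this is a simplified, insertion-free version that ignores communication costs, so it only illustrates the earliest-finish-time idea rather than the full HEFT procedure:

```python
def schedule_eft(ranked_tasks, exec_cost, n_vms):
    """Assign each task (already sorted by descending rank) to the VM
    that yields the earliest finish time for it.
    exec_cost[t][v] = execution time of task t on VM v."""
    ready = [0.0] * n_vms            # time at which each VM becomes free
    placement = {}
    for t in ranked_tasks:
        finish = [ready[v] + exec_cost[t][v] for v in range(n_vms)]
        best = min(range(n_vms), key=finish.__getitem__)
        ready[best] = finish[best]
        placement[t] = best
    return placement, max(ready)     # makespan = latest VM finish time
```

Note how this greedy rule can leave some VMs nearly idle, which is exactly the load-balancing weakness discussed above.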

B. A Multi-Criteria Decision-Making Approach
A Technique for Order of Preference by Similarity to Ideal Solution (TOPSIS) workflow scheduling algorithm for the cloud environment, called TOPSIS-EWM, has been introduced that incorporates the Entropy Weight Method (EWM) [29]. The algorithm aims to reduce makespan, cost, and energy consumption while increasing reliability. According to the TOPSIS-EWM algorithm, EWM is used to determine the input weights of the attributes: schedule length (EFT), cost, reliability, and energy consumption. The TOPSIS approach is then used to choose the optimal virtual machine for each task. The work considers a cloud environment with dynamic voltage scaling (DVS) and pay-per-use heterogeneous VM instances. In the simulations, the MIPS rating of the VM instances is directly proportional to the VM pricing. The TOPSIS-EWM algorithm does not consider any user preferences, such as deadlines, load balance, and resource utilization.
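The entropy weighting step of TOPSIS-EWM can be sketched as follows; the decision matrix (rows = VMs, columns = criteria such as EFT, cost, reliability, and energy) is illustrative, and a real implementation would also handle benefit vs. cost criteria:

```python
import math

def entropy_weights(matrix):
    """Entropy Weight Method: normalise each criterion column into a
    distribution, compute its entropy, and weight each criterion by
    (1 - entropy), renormalised so the weights sum to 1.
    Criteria whose values barely vary get weight near zero."""
    n = len(matrix)                      # number of alternatives (VMs)
    m = len(matrix[0])                   # number of criteria
    k = 1.0 / math.log(n)
    raw = []
    for j in range(m):
        col = [row[j] for row in matrix]
        total = sum(col)
        p = [v / total for v in col]
        e = -k * sum(pi * math.log(pi) for pi in p if pi > 0)
        raw.append(1.0 - e)              # degree of divergence
    s = sum(raw)
    return [w / s for w in raw]
```

TOPSIS then scores each VM by its weighted distance to the ideal and anti-ideal solutions; only the weighting step is shown here.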

C. QL-HEFT Algorithm
The authors in [30] proposed a novel task scheduling algorithm, called QL-HEFT, that reduces the makespan by combining Q-Learning with the HEFT algorithm. The algorithm uses the upward rank value of HEFT as the immediate reward. In the Q-learning framework, the agent obtains better learning results through the self-learning process of updating the Q-table. The QL-HEFT algorithm is divided into two phases: the task sorting phase and the processor allocation phase. The task sorting phase uses Q-learning to find the best order of the tasks, while the processor allocation phase uses the earliest-finish-time strategy. However, the authors found that using the QL-HEFT algorithm to solve large-scale task optimization problems has some drawbacks, such as an excessively large Q-table that causes long update times.
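The Q-table update at the core of QL-HEFT follows the standard Q-learning rule; the sketch below assumes a dictionary-of-dictionaries Q-table, with illustrative learning rate and discount factor:

```python
def q_update(q, state, action, reward, next_state, alpha=0.1, gamma=0.9):
    """One Q-learning step:
    Q(s,a) += alpha * (reward + gamma * max_a' Q(s',a') - Q(s,a)).
    In QL-HEFT the immediate reward is the task's HEFT upward rank."""
    best_next = max(q[next_state].values(), default=0.0) if next_state in q else 0.0
    q.setdefault(state, {}).setdefault(action, 0.0)
    q[state][action] += alpha * (reward + gamma * best_next - q[state][action])
    return q[state][action]
```

Because every (task, task) pair gets a Q-entry, the table grows quadratically with the number of tasks, which is the update-time drawback noted above.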

D. A Decision Tree Definition
A decision tree is a hierarchical data structure that uses a divide-and-conquer technique to represent data [31]. A decision tree is a rooted tree with leaf and non-leaf nodes. The leaf nodes represent the classification or decision-making, whereas the non-leaf nodes represent the selection options, dividing the instance space into two or more subspaces based on a discrete function of the input attribute values (See Figure 2) [32]. The decision criteria of classification and regression trees depend on the specific tree. The decision tree is a popular method for creating and visualizing predictive models and algorithms [34,35]. As explained earlier, the static approach uses the decision tree to select a priority rule combination to process the set of given tasks, i.e., no rule changes over the scheduling horizon. Hence, it can be used for planning purposes. The static decision tree-based mechanism used in this study is described in the next section.

IV. Problem Statement
Multi-objective optimization has a significant impact on scheduling issues in the cloud computing environment. We covered three related approaches for task scheduling, each of which has its own set of restrictions:
• When allocating tasks, the HEFT method always considers the processor with the earliest finish time, but it ignores load balancing among the processors.
• The TOPSIS-EWM algorithm considers the input weights of attributes such as schedule length (EFT), cost, reliability, and energy consumption. The TOPSIS technique is then used to select the best virtual machine for each task without considering any user preferences, such as deadlines, load balance, and resource utilization.
• The QL-HEFT approach for solving large-scale task optimization problems has some limitations, such as a large Q-table that causes long update times, which affects the makespan of the task schedule.

V. The Proposed Task Scheduling Decision Tree (TS-DT) Algorithm
In this section, we introduce the TS-DT algorithm to reduce the makespan, enhance load balance, and maximize resource utilization. The algorithm consists of three phases: the task priority phase, the resource matrix phase, and the resource allocation phase. First, the task priority phase assigns a rank to each task. The resource matrix phase collects the tasks' features in the form of a matrix, while the resource allocation phase schedules tasks on the proper VMs using the decision tree. The principles of each phase are explained in the following sections.

A. Task Priority Phase
In the task priority phase, a rank is assigned to each task in the given workflow (i.e., DAG). Consider the task set T = {T1, T2, ..., Tn}; if Ti < Tn, then Ti is a parent of Tn. Equation (1) has been extended by adding the task length (TL), which indicates the length of the instructions of a cloudlet (i.e., task) to be processed on the virtual machine (VM), and the number of children (Nc) (See Equation (2)). Therefore, the rank of each task in the given workflow is defined using Equation (2). After assigning a priority to each task, the tasks are sorted in descending order according to their rank values and stored in the Rank[T] list. As a result, the most important task is executed first. The pseudo-code of the Task Priority phase is shown in Table 1.
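Since the extracted text does not reproduce Equation (2), the sketch below assumes one plausible reading in which TL and Nc are simply added to the HEFT upward rank; the exact form is given by Equation (2) in the paper:

```python
def ts_dt_rank(task, w, c, succ, tl, nc):
    """Hypothetical reading of Eq. (2): the HEFT upward rank augmented
    with the task length (tl) and the number of children (nc)."""
    children = succ.get(task, [])
    base = w[task] + tl[task] + nc[task]
    if not children:
        return base
    return base + max(c[(task, tj)] + ts_dt_rank(tj, w, c, succ, tl, nc)
                      for tj in children)

def priority_list(tasks, rank):
    """Rank[T]: tasks sorted in descending order of their rank values."""
    return sorted(tasks, key=rank.get, reverse=True)
```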

B. Resource Matrix Phase
This phase collects the features of the selected task from the Rank[T] list and stores them in the task's matrix (T). In the T matrix, columns represent the available resources (VMs), while rows represent four features for each task, as follows:
Feature 1: Computation cost (CP) of the task on each VM. This refers to the length of the task's instructions on the VM.
Feature 2: Earliest Finish Time (EFT) of the task on each VM.
Feature 3: Total Task Length (TTL), i.e., the total length of the tasks already assigned to each VM.
Feature 4: The VM-based task parent indicator (TP) (i.e., parent location (0/1)), where one is entered if the task's parent is on that VM; otherwise, zero is entered. This feature accounts for the communication cost between tasks.
For example, the structure of a task with its features is summarized in a matrix (T) considering five VMs and the four features, as shown in Figure 3.
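A sketch of the matrix construction, assuming the four rows described in this section (CP, EFT, TTL, and the parent-location indicator TP); all feature values in the usage are illustrative:

```python
def build_task_matrix(n_vms, cp, eft, ttl, parent_vm):
    """Build the 4 x n_vms feature matrix T for one task.
    Rows: CP (computation cost), EFT (earliest finish time),
    TTL (total length already assigned to each VM), and
    TP (1 if the task's parent sits on that VM, else 0).
    parent_vm is None for an entry task, making TP all zeros."""
    tp = [1 if parent_vm == v else 0 for v in range(n_vms)]
    return [list(cp), list(eft), list(ttl), tp]
```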

C. Resource Allocation Phase
In this phase, the proper VM is selected to execute each task in the Rank[T] list, which contains the tasks ordered in descending order according to their priorities.
For the task at the head of the Rank[T] list (i.e., the task with the highest priority), a decision tree is constructed to represent the features from its task matrix. At the leaf nodes, a test checks whether the task's parent is on the same VM or not, according to Feature 4. If the answer is "yes", the communication cost remains zero. If the answer is "no", the communication cost between the parent and the successor is taken into account.
The output at each leaf node is the summation of the task's features in the task's matrix defined by Feature 1 to Feature 3 (i.e., CP, EFT, and TTL), plus the communication cost from the DAG workflow if the task's parent is not on the same VM. If the parent task is on the same VM, the output is computed using Equation (3); when the parent is not on the same VM, it is calculated using Equation (4). TTL is the total length of the tasks assigned to each VM.
Finally, the task is assigned to the VM with the lowest value among the tree's leaf nodes (See Figure 4). The pseudo-code for the Resource Allocation phase is shown in Table 2. To explain how the proposed TS-DT algorithm works, a sample task graph and the computational costs of the tasks on each of 3 VMs are depicted in Figures 5a and 5b.
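The leaf-node computation described above can be sketched as follows, assuming the matrix layout from the Resource Matrix phase (rows CP, EFT, TTL, TP) and a single parent communication cost; the numbers in the usage are illustrative:

```python
def allocate(matrix, comm_cost):
    """Resource Allocation sketch: for each VM, sum CP + EFT + TTL
    (rows 0..2), add the communication cost only when the parent is
    NOT on that VM (TP row == 0), and pick the VM with the lowest
    total -- the cheapest leaf of the task's decision tree."""
    n_vms = len(matrix[0])
    totals = []
    for v in range(n_vms):
        total = sum(matrix[r][v] for r in range(3))  # CP + EFT + TTL
        if matrix[3][v] == 0:                        # parent elsewhere
            total += comm_cost
        totals.append(total)
    best = min(range(n_vms), key=totals.__getitem__)
    return best, totals
```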

Phase 1: Task Priority Phase
By applying this phase, a rank is assigned to each task using Equation (2), and the tasks are sorted in descending order in Rank [T] (See Table 3).

Phase 2: Resource Matrix Phase
In this phase, the features of the selected task from the Rank[T] list are collected and stored in the task's matrix. Considering T1 in Figure 5, Feature 2 (i.e., EFT), Feature 3 (i.e., TTL), and Feature 4 (i.e., TP) are zero on all VMs because T1 is the entry task. Therefore, the features of T1 in its matrix are presented in Figure 6.

Phase 3: Resource Allocation Phase
In this phase, each task is assigned to a suitable VM based on its decision tree. The communication cost between each task and its parent and successor is assessed based on the matrix constructed in Phase 2. According to the decision tree of task T1, the summation value of VM3 is the lowest (i.e., 0 + 10 = 10) (See Figure 7). Therefore, VM3 is assigned to execute task T1.
Phases two and three (i.e., Resource Matrix Phase and Resource Allocation Phase) are repeated for each task in the Rank[T] list until all tasks are assigned to VMs.
As a result, the makespan of the entire workflow is 413 milliseconds after applying the task priority and resource allocation phases. The deviation of the load balance is zero because the Resource Matrix phase considers the number of tasks on each VM. The resource utilization rate is 94.67%, which shows that the Resource Matrix phase performs well (See Figure 8).

VI. The Performance Evaluation
In this section, we discuss the performance metrics, the experimental environment, and the benchmarks used in evaluating the performance of the proposed TS-DT algorithm.

A. Performance Metrics
The performance metrics used to measure the efficiency of task scheduling algorithms in the cloud computing environment include the following:

Makespan:
It denotes the maximum completion time of the schedule. This parameter is defined by Equation (5) [36]:

Makespan = max over ti ∈ T of (Cti)    (5)

where Cti is the completion time of task ti, and T is the set of tasks in the workflow of an application.
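Read literally, Equation (5) reduces to a maximum over the tasks' completion times:

```python
def makespan(completion_times):
    """Equation (5): the makespan is the maximum completion time
    over all tasks in the workflow (a dict task -> completion time)."""
    return max(completion_times.values())
```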

Resource Utilization Rate (RU):
We define RU as the ratio of the total consumed time by the VMs to the makespan of the parallel application. It is calculated as a percentage using Equation (6) [37]:

RU = (Σi BusyTimei / (m × Makespan)) × 100    (6)

where m is the number of VMs.

Load Balancing (LB):
This is the ratio of the total number of tasks to the number of VMs. We calculate LB using Equation (7) [38]:

LB = |T| / m    (7)
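Read literally from the definitions above, Equations (6) and (7) can be sketched as follows; normalising RU by the VM count m is an assumption made so that the rate stays below 100%:

```python
def resource_utilization(busy_times, mksp):
    """Eq. (6), as read here: total consumed (busy) time across VMs
    over (number of VMs x makespan), expressed as a percentage."""
    m = len(busy_times)
    return 100.0 * sum(busy_times) / (m * mksp)

def load_balance_ratio(n_tasks, n_vms):
    """Eq. (7), read literally: total number of tasks over number of VMs."""
    return n_tasks / n_vms
```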
Busy Time:
It represents the total task length, i.e., the length of the instructions of the cloudlets processed on each VMi. It is defined using Equation (8) [39]:

BusyTimei = Σ length    (8)

where Σ length is the summation of the lengths of the tasks assigned to VMi.

Idle Time:
This is the difference between the total execution time and the busy time for each VMi. It is determined using Equation (9) [40]:

IdleTimei = TETi − BusyTimei    (9)

Total Execution Time (TET):
It represents the sum of the busy time and the idle time for each VMi. It is defined using Equation (10) [41]:

TETi = BusyTimei + IdleTimei    (10)
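Equations (8) to (10) can be sketched directly from their definitions; scaling the busy time by the VM's MIPS rating is an assumption, since the text defines busy time only as a summation of task lengths:

```python
def busy_time(task_lengths, mips=1.0):
    """Eq. (8): sum of the instruction lengths of the tasks on a VM,
    converted to time by the VM speed (MIPS) -- an assumption here."""
    return sum(task_lengths) / mips

def idle_time(total_exec_time, busy):
    """Eq. (9): idle time = total execution time - busy time."""
    return total_exec_time - busy

def total_execution_time(busy, idle):
    """Eq. (10): TET = busy time + idle time."""
    return busy + idle
```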

Power Consumption:
The power consumption of VMi consists of two parts: busy power consumption and idle power consumption. It is calculated using Equation (11) [42]:

Pi = Pbusy × Tbusy + Pidle × Tidle    (11)

where Pbusy is the busy power of the machine (220 W), Pidle is the idle power of the machine (95 W), Tbusy is the busy time, and Tidle is the idle time of VMi.
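Assuming Equation (11) is the usual weighted sum of busy and idle energy, it can be sketched as:

```python
def power_consumption(t_busy, t_idle, p_busy=220.0, p_idle=95.0):
    """Assumed form of Eq. (11): busy power (220 W) times busy time
    plus idle power (95 W) times idle time, for one VM."""
    return p_busy * t_busy + p_idle * t_idle
```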

Improvement Rate (IRx):
Makespan, resource utilization, and load balance are the factors (x) considered to determine the performance improvement of the proposed TS-DT algorithm relative to the existing HEFT, TOPSIS-EWM, and QL-HEFT algorithms. We calculate IRx using Equation (12) [43]:

IRx = ((x_existing − x_TS-DT) / x_existing) × 100    (12)

Average Deviation:
The average deviation from the ideal rate, in percentage, is the ratio of the total summation of the IRx values over their number. It is calculated using Equation (13) [44]:

AvgDev = (Σi IRx,i) / n    (13)
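Assuming Equation (12) takes the usual percentage-reduction form and Equation (13) is a simple mean, the two metrics can be sketched as:

```python
def improvement_rate(existing, proposed):
    """Assumed form of Eq. (12): percentage reduction of metric x
    achieved by TS-DT relative to an existing algorithm."""
    return 100.0 * (existing - proposed) / existing

def average_deviation(rates):
    """Eq. (13): the mean of the per-case improvement rates."""
    return sum(rates) / len(rates)
```

For metrics where larger values are better (resource utilization, load balance), the sign of the numerator would flip accordingly.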

B. The Experimental Environment
The proposed TS-DT algorithm is implemented and evaluated using the open-source CloudSim 3.0.2 toolkit extended with WorkflowSim 1.0, running on Windows 7 with a Core i7 2.70 GHz processor [45]. The Eclipse IDE 4.12.0 is used to run CloudSim 3.0.2. The benchmarks used are the Montage_25, SIPHT_30, CyberShake_30, and Epigenomics_24 workflows [46].

VII. Performance Analysis
To evaluate the performance of the proposed TS-DT algorithm, a comparative study was conducted among HEFT, TOPSIS-EWM, QL-HEFT algorithms, and the proposed TS-DT algorithm in terms of makespan, load balance, resource utilization, and power consumption. The tasks are considered dependent, and each task has different characteristics, such as length, id, start time, and finish time. The implementation was carried out with the consideration of 5, 10, 20, and 40 VMs using Montage_25, SIPHT_30, Cyber-Shake_30, and Epigenomics_24 workflows. We focus on the hypervolume indicator to measure the quality of a set of trade-off solutions [47].
A. Makespan
According to the implementation results in Figure 9, the proposed TS-DT algorithm outperforms the HEFT, TOPSIS-EWM, and QL-HEFT algorithms for the following reasons:
• During the Task Priority phase, the proposed TS-DT algorithm determines the proper task to be executed, increasing its priority using the task length and the number of children.
• During the Resource Allocation phase, the decision tree and the summation of the features in the task's matrix are used to select the VM with the lowest value.
• Some features are used to enhance the makespan (i.e., computation cost, Earliest Finish Time (EFT), and parent location).
The average improvement rate of makespan, in percent, of the proposed TS-DT algorithm compared to the existing HEFT, TOPSIS-EWM, and QL-HEFT algorithms was determined using Equation (12) and is presented in Table 4. According to the comparative results in Table 4, the proposed TS-DT algorithm outperforms, on average, the default HEFT, TOPSIS-EWM, and QL-HEFT algorithms by approximately 5.21%, 2.54%, and 3.32%, respectively.

B. Resource Utilization
The implementation results of the TS-DT, HEFT, TOPSIS-EWM, and QL-HEFT algorithms with respect to resource utilization using the Montage_25, SIPHT_30, CyberShake_30, and Epigenomics_24 workflows on 5, 10, 20, and 40 VMs are presented in Figure 10. Based on the comparison results in Figures 10(a)-(d) (See Equation (6)), the proposed TS-DT algorithm outperforms the existing algorithms in terms of resource utilization for any number of VMs. This is possible because, during the Resource Matrix phase, the proposed TS-DT algorithm selects the minimum total length of tasks assigned to each VM. The total instruction length of the tasks on a VM reflects the device's consumption rate.
The average improvement of the proposed TS-DT algorithm in terms of resource utilization, in percentage, relative to the existing HEFT, TOPSIS-EWM, and QL-HEFT algorithms was determined using Equation (12) and is presented in Table 5. The comparative results in Table 5 show that the proposed TS-DT algorithm outperforms, on average, the default HEFT, TOPSIS-EWM, and QL-HEFT algorithms by approximately 4.69%, 6.81%, and 8.27%, respectively.

C. Load Balancing
The load balancing of each VM is calculated as the ratio of the number of tasks to the total execution time of the VMs (See Equation (7)). The implementation results of the proposed TS-DT, HEFT, TOPSIS-EWM, and QL-HEFT algorithms with respect to load balancing using the Montage_25, SIPHT_30, CyberShake_30, and Epigenomics_24 workflows on 5, 10, 20, and 40 VMs are presented in Figures 11(a) to 11(d).
According to the comparison results in Figure 11, the proposed TS-DT algorithm outperforms the other existing algorithms in terms of load balancing. This is because, during the Resource Matrix phase (Feature 3), the proposed TS-DT algorithm considers the total length of the tasks on each VM when assigning a new task to a VM and finally selects the minimum value. Therefore, this phase spreads the device load at similar rates.
The average load balance improvement, in percentage, of the proposed TS-DT algorithm relative to the existing HEFT, TOPSIS-EWM, and QL-HEFT algorithms was determined using Equation (13) and is illustrated in Table 6. According to the comparative results in Table 6, the proposed TS-DT algorithm outperforms, on average, the default HEFT, TOPSIS-EWM, and QL-HEFT algorithms by approximately 33.36%, 19.69%, and 59.06%, respectively.

D. Power Consumption
In the heterogeneous cloud platform, the power consumption consists of two components: busy power consumption and idle power consumption. The busy power consumption is the energy consumed during processing operations, while the idle power consumption is the energy consumed while the VM is in the idle state [42]. In order to calculate the power consumption accurately, the work in this paper estimates the busy power consumption and the idle power consumption of each VM separately. Then, the total power consumption of the VM is obtained using Equations (8) to (13). According to the implementation results, the proposed TS-DT algorithm increases power consumption relative to the HEFT, TOPSIS-EWM, and QL-HEFT algorithms for the following reason:
• The proposed TS-DT algorithm selects the minimum CP and the VM with the best EFT for the assigned task during the Resource Matrix phase. This means that the proposed algorithm relies on the VMs with the highest available specifications, which increases the power consumption rate of the devices.
According to the implementation results in Table 7, the proposed TS-DT algorithm increases the power consumption relative to the default HEFT, TOPSIS-EWM, and QL-HEFT algorithms by approximately 12.89%, 43.52%, and 28.38%, respectively. This is considered the main limitation of the proposed TS-DT algorithm.
With respect to makespan, resource utilization, and load balance, the performance of the TS-DT algorithm is better than that of the HEFT, TOPSIS-EWM, and QL-HEFT algorithms in most cases, which is also confirmed by the hypervolume shown in Figures 12 and 13 for the Montage and Epigenomics workflows. This is because the HEFT algorithm is a single-objective scheduling algorithm that does not consider other objectives such as load balance and resource utilization. The TOPSIS-EWM algorithm is designed to select the best virtual machine for each task in multi-objective workflow scheduling, but it does not take into consideration any user preferences, such as deadlines, load balance, and resource utilization. In contrast, the QL-HEFT algorithm has limitations when solving large-scale task optimization problems, such as long Q-table update times that degrade the makespan of the task schedule.

VIII. Conclusions and Future Work
In this paper, a new multi-objective task scheduling algorithm based on a decision tree, called the TS-DT algorithm, is proposed for the cloud computing environment. The proposed TS-DT algorithm targets minimizing the makespan, enhancing load balance, and maximizing resource utilization. A comparative study was conducted to evaluate the performance of the proposed TS-DT algorithm relative to the HEFT, TOPSIS-EWM, and QL-HEFT algorithms. According to the comparative results, the proposed TS-DT algorithm outperforms the HEFT, TOPSIS-EWM, and QL-HEFT algorithms by reducing the average makespan by 5.21%, 2.54%, and 3.32%, respectively, improving the average resource utilization by 4.69%, 6.81%, and 8.27%, respectively, and improving the average load balancing by 33.36%, 19.69%, and 59.06%, respectively. The main limitation of the proposed TS-DT algorithm is that its power consumption increases by around 12.89%, 43.52%, and 28.38%, respectively, relative to the existing HEFT, TOPSIS-EWM, and QL-HEFT algorithms.
More performance parameters could be considered in future work, such as power consumption optimization, fault tolerance, and scalability.