Energy consumption in large-scale distributed systems, such as computational grids and clouds, has gained considerable attention recently due to its significant performance, environmental, and economic implications. These systems consume a massive amount of energy not only for powering them but also for cooling them. More importantly, the explosive increase in energy consumption is not proportional to resource utilization, as only a marginal fraction of the energy goes to actual computational work. The energy problem becomes more challenging with the uncertainty and variability of workloads and the heterogeneity of resources in these systems. This paper presents a dynamic scheduling algorithm that incorporates reinforcement learning to achieve good performance and energy efficiency. This incorporation helps the scheduler observe and adapt to varying processing requirements (tasks) and different processing capacities (resources). The learning process of our scheduling algorithm develops an association between the best action (schedule) and the current state of the environment (parallel system). We have also devised a task-grouping technique to aid the decision-making process of our algorithm. The grouping technique is adaptive in nature, since it incorporates the current workload and energy consumption when selecting the best action. Results from our extensive simulations with varying processing capacities and a diverse set of tasks demonstrate the effectiveness of this learning approach.
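The abstract does not specify the algorithm's state, action, or reward design. As one plausible reading, the state-to-schedule association it describes could be realized with tabular Q-learning, where states are coarse (workload, energy) buckets and actions are resource assignments. The sketch below illustrates that idea only; the class name, discretized states, and reward convention are assumptions, not the paper's actual method.

```python
import random
from collections import defaultdict


class QSchedulerSketch:
    """Hypothetical Q-learning scheduler sketch.

    States are discretized system conditions (e.g. "low-load"), actions are
    resource indices. The reward would typically combine negative energy
    consumption and negative task completion time, so higher is better.
    """

    def __init__(self, n_resources, alpha=0.1, gamma=0.9, epsilon=0.2):
        self.n_resources = n_resources
        self.alpha = alpha        # learning rate
        self.gamma = gamma        # discount factor
        self.epsilon = epsilon    # exploration probability
        self.q = defaultdict(float)  # (state, action) -> estimated value

    def choose(self, state):
        # Epsilon-greedy: explore a random resource occasionally,
        # otherwise pick the resource with the highest learned value.
        if random.random() < self.epsilon:
            return random.randrange(self.n_resources)
        return max(range(self.n_resources), key=lambda a: self.q[(state, a)])

    def update(self, state, action, reward, next_state):
        # Standard one-step Q-learning update rule.
        best_next = max(self.q[(next_state, a)]
                        for a in range(self.n_resources))
        self.q[(state, action)] += self.alpha * (
            reward + self.gamma * best_next - self.q[(state, action)])
```

In such a setup, the scheduler would call `choose` when a task (or task group) arrives, dispatch it to the returned resource, and call `update` once the observed energy and completion time yield a reward, gradually associating each system state with its best schedule.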