Cluster management has become a multi-objective task that spans disciplines such as power optimization, fault tolerance, dependability, and online analysis of system operation. Efficient and secure operation of these clusters is a key objective of any data center policy. In addition, the service provided by these servers must deliver an agreed level of quality of service (QoS) to customers. Applying self-management techniques to these clusters would simplify and automate their operation. Current self-management techniques that take service level agreements (SLAs) into account do not cover, at the same time, the two most important sides of cluster operation: self-optimization, for efficient and profitable operation, and self-healing, for secure operation and a high level of quality of service as perceived by users. This work integrates a self-optimization strategy for Internet server clusters, which optimizes power consumption through dynamic provisioning of servers, with a self-healing strategy that improves the cluster's reaction to a server failure by using the cluster's spare capacity intelligently. The self-management technique is based on empirical models of server response time and power consumption, which simplify its operation. Additionally, the technique presented in this paper guarantees fulfillment of the SLA.
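
The interplay between dynamic provisioning and spare capacity can be illustrated with a minimal sketch. It is not the method of this paper: the linear response time and power models, their coefficients, and the names `response_time`, `min_servers`, and `cluster_power` are all hypothetical, chosen only to show how a provisioning rule might keep the fewest servers powered on while still meeting an SLA response time after a server failure.

```python
def response_time(load_per_server, base_s=0.05, slope_s=0.002):
    """Hypothetical empirical model: response time in seconds grows
    linearly with the load handled by one server (requests/s)."""
    return base_s + slope_s * load_per_server


def min_servers(total_load, sla_s, total_servers, tolerate_failures=1):
    """Smallest number of active servers that still meets the SLA
    response time after `tolerate_failures` of them fail.
    Returns None if the cluster is too small (assumed policy)."""
    for n in range(1 + tolerate_failures, total_servers + 1):
        surviving = n - tolerate_failures
        if response_time(total_load / surviving) <= sla_s:
            return n
    return None


def cluster_power(n_active, total_load, idle_w=100.0, watts_per_rps=0.5):
    """Hypothetical power model: fixed idle cost per active server
    plus a cost proportional to the total load served."""
    return n_active * idle_w + watts_per_rps * total_load


if __name__ == "__main__":
    load, sla = 250.0, 0.25  # requests/s and seconds, illustrative values
    n = min_servers(load, sla, total_servers=8)
    print(n, cluster_power(n, load))
```

Under these assumed models, the spare server kept for self-healing is the difference between `min_servers(..., tolerate_failures=1)` and `min_servers(..., tolerate_failures=0)`, and its idle power is the cost paid for a faster reaction to a failure.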