In cloud-based Web application hosting environments, virtualization offers the potential to exploit dynamic resource provisioning and scaling to maintain service level agreements while minimizing resource utilization for a given workload. However, optimal proactive resource provisioning and scaling for a specific Web application require, at the least, a profile of the application's current workload and a model of the application's capacity under various resource configurations. Here we focus on multi-tier Web applications. The capacity of a multi-tier Web application varies substantially as the pattern of requests in the workload changes. In this paper, we propose and evaluate a black-box method for capacity prediction that first identifies workload patterns for a multi-tier Web application from access logs using unsupervised machine learning and then, based on those patterns, builds a model capable of predicting the application's capacity for any specific workload pattern. In an experimental evaluation, we compare a baseline method that predicts capacity without a model of the application-specific workload patterns to several regression models using the proposed workload identification method. All of the models based on workload pattern identification outperform the baseline method. The best model, a Gaussian process regression model, gives only 6.42% error. Cloud providers utilizing our method can proactively perform dynamic allocation of resources to multi-tier Web applications, meeting service level agreements at minimal cost.
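The two-stage idea described above (identify workload patterns without supervision, then regress capacity on those patterns) can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration, not the method evaluated in the paper: the two per-request features, the use of k-means for the clustering step, the kernel settings, and the synthetic capacity figures are all hypothetical. Only the choice of a Gaussian process regression model comes from the abstract.

```python
import math
import random


def nearest(point, centers):
    """Index of the closest center by squared Euclidean distance."""
    return min(range(len(centers)),
               key=lambda c: sum((a - b) ** 2 for a, b in zip(point, centers[c])))


def kmeans(points, k, iters=25, seed=0):
    """Plain k-means; an assumed stand-in for the unsupervised pattern step."""
    centers = random.Random(seed).sample(points, k)
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for p in points:
            groups[nearest(p, centers)].append(p)
        centers = [[sum(col) / len(g) for col in zip(*g)] if g else centers[i]
                   for i, g in enumerate(groups)]
    return centers


def pattern_mix(window, centers):
    """Fraction of a window's requests that fall into each workload pattern."""
    counts = [0] * len(centers)
    for p in window:
        counts[nearest(p, centers)] += 1
    return [c / len(window) for c in counts]


def rbf(x, y, length=0.5):
    """Squared-exponential kernel; the length scale is an arbitrary choice."""
    return math.exp(-sum((a - b) ** 2 for a, b in zip(x, y)) / (2 * length ** 2))


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        x[r] = (M[r][n] - sum(M[r][c] * x[c] for c in range(r + 1, n))) / M[r][r]
    return x


def gp_fit(X, y, noise=1e-3):
    """Return a function giving the GP posterior mean (targets are centered)."""
    mean_y = sum(y) / len(y)
    K = [[rbf(a, b) + (noise if i == j else 0.0) for j, b in enumerate(X)]
         for i, a in enumerate(X)]
    alpha = solve(K, [v - mean_y for v in y])
    return lambda x: mean_y + sum(a * rbf(x, xi) for a, xi in zip(alpha, X))


# Hypothetical per-request features, e.g. (relative CPU cost, relative DB cost).
light = [[0.10 + 0.01 * i, 0.10] for i in range(20)]   # cheap, browse-like
heavy = [[0.90, 0.80 + 0.01 * i] for i in range(20)]   # costly, checkout-like

# Stage 1: discover workload patterns from the request log.
centers = kmeans(light + heavy, k=2)

# Stage 2: learn capacity as a function of the pattern mix per time window.
# Capacities (requests/s) are synthetic: 1000 - 600 * heavy fraction.
fractions = [0.0, 0.25, 0.5, 0.75, 1.0]
windows = [heavy[:int(20 * h)] + light[:20 - int(20 * h)] for h in fractions]
X = [pattern_mix(w, centers) for w in windows]
y = [1000.0 - 600.0 * h for h in fractions]
predict = gp_fit(X, y)

# Estimate capacity for an unseen mix (40% heavy requests).
test_mix = pattern_mix(heavy[:8] + light[:12], centers)
estimate = predict(test_mix)
print(round(estimate, 1))
```

In this sketch the regression input is the share of requests belonging to each discovered pattern, which is one plausible way to turn pattern membership into a feature vector. A provisioning controller could compare the predicted capacity with the incoming request rate to decide when to scale, though that step is outside what the sketch covers.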