Application classification techniques based on monitoring and learning of resource usage (e.g., CPU, memory, disk, and network) have been proposed to aid resource scheduling decisions. An important problem for such classifiers is deciding which subset of the many performance metrics collected by monitoring tools should be used for classification. This paper presents an approach based on a probabilistic model (a Bayesian network) that systematically selects representative performance features, which can provide optimal classification accuracy and adapt to changing workloads. Virtual machines (VMs) host the application execution, and the system-level performance metrics of a VM summarize the resource usage of the application and its host. The approach requires neither modification of application source code nor intervention in execution. Experimental results show that the proposed scheme can effectively select a performance metric subset that provides above 90% classification accuracy for a set of benchmark applications.
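To make the idea of metric subset selection concrete, the following is a minimal sketch and not the paper's actual algorithm. It assumes a Gaussian naive Bayes classifier (the simplest form of Bayesian network) wrapped in a greedy forward search, which is one common way to pick features by classification accuracy. The metric names (`cpu_user`, `io_bi`, `pkts_in`), the class labels, the `min_gain` threshold, and the synthetic samples are all hypothetical and chosen only for illustration.

```python
import math
from statistics import mean, pvariance


def fit_gaussian_nb(rows, labels, features):
    """Estimate a class prior plus per-feature mean and variance for each class."""
    model = {}
    for cls in set(labels):
        cls_rows = [r for r, y in zip(rows, labels) if y == cls]
        stats = {}
        for f in features:
            values = [r[f] for r in cls_rows]
            # Small constant keeps the variance strictly positive.
            stats[f] = (mean(values), pvariance(values) + 1e-6)
        model[cls] = (len(cls_rows) / len(rows), stats)
    return model


def predict(model, row):
    """Return the class with the highest log posterior for one sample."""
    def log_posterior(cls):
        prior, stats = model[cls]
        score = math.log(prior)
        for f, (mu, var) in stats.items():
            score += -0.5 * math.log(2 * math.pi * var) - (row[f] - mu) ** 2 / (2 * var)
        return score
    return max(model, key=log_posterior)


def accuracy(train, test, features):
    """Train on one sample set and report the hit rate on another."""
    (train_rows, train_labels), (test_rows, test_labels) = train, test
    model = fit_gaussian_nb(train_rows, train_labels, features)
    hits = sum(predict(model, r) == y for r, y in zip(test_rows, test_labels))
    return hits / len(test_rows)


def select_features(train, test, candidates, min_gain=0.01):
    """Greedy forward selection: add the metric that most improves accuracy."""
    selected, best = [], 0.0
    remaining = list(candidates)
    while remaining:
        scored = [(accuracy(train, test, selected + [f]), f) for f in remaining]
        score, feature = max(scored)
        if score - best < min_gain:
            break  # no remaining metric helps enough
        selected.append(feature)
        remaining.remove(feature)
        best = score
    return selected, best


def make_samples(offset):
    """Hypothetical VM metric samples for a CPU-bound and an I/O-bound class."""
    rows, labels = [], []
    for i in range(20):
        jitter = (i * 7 + offset) % 5  # deterministic stand-in for noise
        rows.append({"cpu_user": 80 + jitter, "io_bi": 5 + jitter, "pkts_in": 10 + jitter})
        labels.append("cpu")
        rows.append({"cpu_user": 10 + jitter, "io_bi": 60 + jitter, "pkts_in": 10 + jitter})
        labels.append("io")
    return rows, labels


train, test = make_samples(0), make_samples(3)
selected, best = select_features(train, test, ["cpu_user", "io_bi", "pkts_in"])
print(selected, best)
```

In this toy data the `pkts_in` metric is distributed identically in both classes, so the search never selects it, while a single discriminating metric is enough to separate the two classes. A real deployment would use measured VM metrics and held-out validation data rather than the synthetic samples assumed here.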