Shared computing utilities allocate compute, network and storage resources to competing applications on demand. An awareness of the demands and behaviors of the hosted applications can help the system to manage its resources more effectively. This paper proposes an active learning approach that analyzes performance histories to build predictive models of frequently used applications; the histories consist of measures gathered from noninvasive instrumentation on previous runs with varying assignments of compute, network and storage resources. An initial prototype uses linear regression to predict application interactions with candidate resources and combines them to forecast completion time for a candidate resource assignment. Experimental results from the prototype show that the mean forecasting errors range from 1% to 11% for a set of batch tasks captured from a production cluster. Examples illustrate how a system can use the learned models to guide task placement and data staging.
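As a rough illustration of the modeling idea described above, the following minimal sketch fits one linear model per resource from a performance history and combines the per-resource predictions into a completion-time forecast for each candidate assignment. It is not the prototype's implementation. The function names (`fit_line`, `forecast`), the choice of predictors (inverse CPU speed, network latency), the additive combination of the two predictions, and every number in the history are assumptions made for illustration only.

```python
from statistics import mean


def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x; returns (a, b)."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("need at least two distinct x values")
    b = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / sxx
    return y_bar - b * x_bar, b


def predict(model, x):
    """Evaluate a fitted (intercept, slope) model at x."""
    a, b = model
    return a + b * x


# Hypothetical performance history from previous runs (made-up values).
# Compute time (s) observed against inverse CPU speed (1/GHz).
inv_cpu_speed = [1.0, 0.5, 0.25]
compute_secs = [110.0, 60.0, 35.0]
# Network stall time (s) observed against round-trip latency (ms).
latency_ms = [1.0, 5.0, 10.0]
stall_secs = [5.0, 17.0, 32.0]

compute_model = fit_line(inv_cpu_speed, compute_secs)
network_model = fit_line(latency_ms, stall_secs)


def forecast(inv_speed, latency):
    """Assumed combination rule: add the per-resource predictions."""
    return predict(compute_model, inv_speed) + predict(network_model, latency)


# Hypothetical candidate assignments: (inverse CPU speed, latency to the data).
candidates = {"A": (0.4, 2.0), "B": (0.25, 10.0)}
forecasts = {name: forecast(*attrs) for name, attrs in candidates.items()}
best = min(forecasts, key=forecasts.get)
```

Under these assumed numbers, candidate B has the faster processor but sits farther from the data, so the forecast favors candidate A. This is the kind of trade-off a placement or data-staging decision would weigh using the learned models.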