The resources in parallel computer systems (including heterogeneous clusters) should be allocated to computational applications in a way that maximizes some system performance measure. However, allocation decisions and the associated performance predictions are often based on estimated values of application and system parameters. The actual values of these parameters may differ from the estimates; for example, the estimates may represent only average values, the models used to generate the estimates may have limited accuracy, and the environment may change. Thus, an important research problem is the development of resource management strategies that can guarantee a particular system performance given such uncertainties. To address this problem, we have designed a model for deriving the degree of robustness of a resource allocation, defined as the maximum amount of collective uncertainty in system parameters within which a user-specified level of system performance (QoS) can be guaranteed. We present the model and demonstrate its ability to select the most robust resource allocation from among those that otherwise perform similarly (based on the primary performance criterion). The model's use in allocation heuristics is also demonstrated. This model is applicable to different types of computing and communication environments, including parallel, distributed, cluster, grid, Internet, embedded, and wireless.