This paper gives an overview of the material to be discussed in the invited keynote presentation by H. J. Siegel. Performing computing and communication tasks on parallel and distributed systems involves the coordinated use of different types of machines, networks, interfaces, and other resources. Because the system environment is uncertain, decisions about how best to allocate resources are often based on estimated values of task and system parameters. An important research problem is the development of resource management strategies that can guarantee a particular system performance given such uncertainties. We have designed a methodology for deriving the degree of robustness of a resource allocation, defined as the maximum amount of collective uncertainty in system parameters within which a user-specified level of system performance (quality of service, QoS) can be guaranteed. Our four-step procedure for deriving a robustness metric for an arbitrary system will be presented. We will illustrate this procedure and its usefulness by deriving robustness metrics for some example distributed systems.
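
To make the definition of robustness concrete, the following minimal Python sketch computes one possible robustness value for a toy system. It is an illustration under stated assumptions, not the four-step procedure itself, which the abstract does not spell out. The assumptions are: independent tasks are statically assigned to machines; the uncertain parameters are the task execution times; the QoS requirement is that no machine finishes later than `tolerance` times the estimated makespan; and collective uncertainty is measured as the Euclidean norm of the error in the execution-time estimates. The function name `robustness_radius` and all parameter names are hypothetical.

```python
import math


def robustness_radius(est_times, assignment, num_machines, tolerance):
    """Largest Euclidean error in the execution-time estimates that is
    guaranteed not to push any machine past the allowed finishing time.

    est_times[i]  -- estimated execution time of task i (assumed input)
    assignment[i] -- index of the machine that task i is mapped to
    tolerance     -- allowed finishing time as a multiple of the estimated
                     makespan (for example 1.2 allows a 20% increase)
    """
    finish = [0.0] * num_machines  # estimated finishing time per machine
    count = [0] * num_machines     # number of tasks on each machine
    for time, machine in zip(est_times, assignment):
        finish[machine] += time
        count[machine] += 1

    bound = tolerance * max(finish)  # QoS limit on every finishing time

    # A machine's finishing time is the sum of its tasks' execution times,
    # so the boundary "finishing time == bound" is a hyperplane. The
    # distance from the estimate to that hyperplane is slack / sqrt(n).
    radii = [
        (bound - finish[m]) / math.sqrt(count[m])
        for m in range(num_machines)
        if count[m] > 0
    ]
    # The allocation is only as robust as its most fragile machine.
    return min(radii)


# Three tasks on two machines, with a 20% tolerance on the makespan.
radius = robustness_radius([3.0, 4.0, 5.0], [0, 0, 1], 2, 1.2)
print(radius)
```

A larger value means the allocation tolerates larger collective estimation errors before the performance requirement can be violated, so two candidate allocations can be compared by this number under the same assumptions.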