We present mechanisms and models for building autonomically scalable and resilient services on wide-area shared computing platforms, in which the resources at a node are allocated to competing users on a fair-share basis and there is no platform-wide resource manager for placing users on different nodes. Building scalable services in such environments poses unique challenges because the available resource capacities fluctuate and nodes crash. In addition, the service load may surge within a short time due to flash crowds. We present models for estimating the service capacity under varying operating conditions. Autonomic scaling of service capacity is performed by dynamically controlling the degree of service replication based on the estimated service capacity and the observed load. Furthermore, adaptive load distribution mechanisms are needed because the service capacities of the individual replicas vary. We present the results of our evaluations of these mechanisms on PlanetLab.
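As a rough illustration of the two control decisions described above, the following minimal sketch picks a replication degree from an estimated per-replica capacity and the observed load, then splits the load across replicas in proportion to their estimated capacities. It is not the method evaluated in this work: the function names, the `headroom` safety margin, and the proportional splitting rule are all assumptions made for the example.

```python
import math


def replicas_needed(observed_load, per_replica_capacity, headroom=0.25, min_replicas=1):
    """Hypothetical scaling rule: enough replicas to cover the load plus a margin.

    observed_load and per_replica_capacity share one unit (e.g. requests/second).
    headroom is an assumed safety margin, not a value taken from the paper.
    """
    if per_replica_capacity <= 0:
        raise ValueError("per_replica_capacity must be positive")
    target_capacity = observed_load * (1.0 + headroom)
    return max(min_replicas, math.ceil(target_capacity / per_replica_capacity))


def distribute_load(observed_load, capacities):
    """Hypothetical load distribution: share load in proportion to capacity."""
    total = sum(capacities)
    if total <= 0:
        raise ValueError("total capacity must be positive")
    return [observed_load * c / total for c in capacities]


if __name__ == "__main__":
    # 100 req/s observed, each replica estimated to handle 50 req/s.
    print(replicas_needed(100, 50))
    # Three replicas with unequal estimated capacities.
    print(distribute_load(90, [10, 20, 30]))
```

In a real deployment both inputs would be re-estimated periodically, since the capacity of each replica changes with the fair-share allocation on its node.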