In this paper, we examine the problem of balancing load in a large-scale distributed system when information about server loads may be stale. It is well known that sending each request to the machine with the apparently lowest load can behave badly in such systems, yet this technique is common in practice. Other systems use round-robin or random selection algorithms that either ignore load information entirely or use only a small subset of it. Rather than risk extremely bad performance on the one hand or forgo the chance to use load information to improve performance on the other, we develop strategies that interpret load information based on its age. Through simulation, we examine several simple algorithms that use such load interpretation strategies under a range of workloads. Our experiments suggest that by properly interpreting load information, systems can: 1) match the performance of the most aggressive algorithms when load information is fresh relative to the job arrival rate, 2) outperform the best of the other algorithms we examine by as much as 60 percent when information is moderately old, 3) significantly outperform random load distribution when information is older still, and 4) avoid pathological behavior even when information is extremely old.
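To make the idea of interpreting load by its age concrete, the following is a minimal sketch and not necessarily one of the algorithms evaluated in the paper. It assumes a hypothetical "water-filling" rule: given a stale load snapshot and an estimate of how many jobs have arrived since it was taken (`expected_arrivals`, an assumed parameter), it spreads those jobs so that the least-loaded servers are raised toward a common level. The function names and the rule itself are illustrative assumptions.

```python
def least_loaded(stale_loads):
    """Aggressive policy: pick the lowest reported load, ignoring the report's age."""
    return min(range(len(stale_loads)), key=lambda i: stale_loads[i])


def age_aware_weights(stale_loads, expected_arrivals):
    """Hypothetical age-aware policy (illustrative assumption).

    Returns one selection probability per server. The jobs expected to have
    arrived since the snapshot are 'poured' onto the least-loaded servers
    until they reach a common level (water-filling).
    """
    n = len(stale_loads)
    order = sorted(range(n), key=lambda i: stale_loads[i])

    remaining = float(expected_arrivals)
    level = float(stale_loads[order[0]])
    filled = 1  # number of servers currently at the water level
    while filled < n:
        # Jobs needed to raise the filled servers to the next server's load.
        gap = (stale_loads[order[filled]] - level) * filled
        if gap >= remaining:
            break
        remaining -= gap
        level = float(stale_loads[order[filled]])
        filled += 1
    level += remaining / filled

    shares = [max(0.0, level - load) for load in stale_loads]
    total = sum(shares)
    if total == 0:
        # Fresh information: behave like the aggressive least-loaded policy.
        weights = [0.0] * n
        weights[order[0]] = 1.0
        return weights
    return [s / total for s in shares]


if __name__ == "__main__":
    snapshot = [0, 2, 10]
    print(least_loaded(snapshot))             # index of the apparent minimum
    print(age_aware_weights(snapshot, 0))     # fresh information
    print(age_aware_weights(snapshot, 4))     # moderately old information
    print(age_aware_weights(snapshot, 1000))  # very old information
```

Under these assumptions the sketch mirrors the qualitative behavior described above: with fresh information all weight goes to the apparently least-loaded server, with moderately old information the weight is shared among the lightly loaded servers, and with very old information the weights approach a uniform random choice.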