Skip to Main Content
We study the long term (steady state) performance of a simple, randomized, local load balancing technique. We assume a system of n processors connected by an arbitrary network topology. Jobs are placed in the processors by a deterministic or randomized adversary. The adversary knows the current and past load distribution in the network and can use this information to place the new tasks in the processors. The adversary can put a number of new jobs in each processor, in each step, as long as the (expected) total number of new jobs arriving at a given step is bounded by λn. A node can execute one job per step, and also participate in one load balancing operation in which it can move tasks to a direct neighbor in the network. In the protocol we analyze here, a node equalizes its load with a random neighbor in the graph. We first study the stability of a system running our load balancing protocol. Clearly, if λ > 1 the system cannot be stable. We show that for any λ < 1, and any connected network topology, the system is stable. When the system is stable, the next performance parameter of interest is the waiting time of jobs. We develop high probability bounds and bounds on the expectation of the waiting time of jobs in terms of the network topology. In particular, if the network is an expander graph the expected wait of a task is O(log n), and the waiting time of a task that enters the network at an arbitrary time is O(log n) with high probability. We contrast these results with the work stealing load balancing protocol, where we show that, in sparse networks, the load in the system and the waiting time can be exponential in the network size.