Temporal locality in workloads creates conditions in which a server, in order to remain available, must quickly process bursts of requests with large service requirements. In this paper, we show how to counteract the resulting peak congestion and maintain high availability by delaying selected requests that contribute to the temporal locality. We propose and evaluate SWAP, a measurement-based scheduling policy that approximates shortest job first (SJF) scheduling without requiring any knowledge of job service times. We show that good service time estimates can be obtained from the temporal dependence structure of the workload and that these estimates allow SWAP to closely approximate the behavior of SJF. Experimental results indicate that SWAP significantly improves system performability. In particular, we show that system capacity under SWAP is substantially higher than under first-come first-served (FCFS) scheduling and is highly competitive with SJF, but without requiring a priori knowledge of job service times.
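As a rough illustration of the two ideas the abstract relies on, the following Python sketch measures the lag-1 autocorrelation of a sequence of service times (one simple indicator of temporal dependence) and compares mean response time under FCFS and SJF ordering for a burst of jobs that are all queued at time zero. It is not the SWAP policy: the function names, the sample traces, and the single-server, all-jobs-present setting are assumptions made only for this example.

```python
from statistics import mean


def lag1_autocorrelation(xs):
    """Lag-1 sample autocorrelation of a sequence of service times.

    Values well above zero suggest that a large job tends to be
    followed by another large job (temporal locality).
    """
    m = mean(xs)
    variance_sum = sum((x - m) ** 2 for x in xs)
    if variance_sum == 0:
        return 0.0  # constant sequence: no dependence to measure
    covariance_sum = sum((a - m) * (b - m) for a, b in zip(xs, xs[1:]))
    return covariance_sum / variance_sum


def mean_response_time(service_times):
    """Mean response time on one server when every job is already
    queued at time zero and jobs are served in the given order."""
    clock = 0.0
    total = 0.0
    for s in service_times:
        clock += s      # this job completes at the current clock value
        total += clock  # its response time equals its completion time
    return total / len(service_times)


# Hypothetical bursty trace: large jobs arrive back to back.
trace = [1, 1, 1, 1, 10, 10, 10, 10, 1, 1, 1, 1]
rho = lag1_autocorrelation(trace)

# Hypothetical burst in which the large jobs are at the head of the queue.
burst = [10, 10, 10, 10, 1, 1, 1, 1]
fcfs = mean_response_time(burst)          # serve in arrival order
sjf = mean_response_time(sorted(burst))   # serve shortest first

print(f"lag-1 autocorrelation: {rho:.3f}")
print(f"mean response time, FCFS: {fcfs}, SJF: {sjf}")
```

In this toy setting, SJF needs the true service times in order to sort the queue. The abstract's point is that a measurement-based policy can approximate that ordering by inferring likely job sizes from the dependence structure instead.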