Modern computer systems are complex enough that minor variations in the performance evaluation procedure may actually determine the outcome. Our case study compares two parallel job schedulers using different workloads and metrics. It shows that metrics may be sensitive to different job classes, and so may not measure the performance of the whole workload impartially. Workload models may implicitly assume that some workload attribute is unimportant and does not warrant modeling; this too can turn out to be wrong. As such effects are hard to predict, a careful experimental methodology is needed to find and verify them.
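
The sensitivity of metrics to job classes can be illustrated with a small sketch. Everything in it is an assumption made for illustration and is not taken from the case study: the two metrics (mean response time and mean slowdown), the four-job workload, the wait times, and the scheduler labels "A" and "B" are all hypothetical. Mean response time is dominated by long jobs, whereas mean slowdown is dominated by short jobs, so the two metrics can rank the same pair of schedulers in opposite order.

```python
from statistics import mean

# Hypothetical workload: two short jobs and two long jobs (runtimes in seconds).
RUNTIMES = [10, 10, 10_000, 10_000]

# Hypothetical wait times (seconds) for the same jobs under two schedulers.
# "A" favors short jobs; "B" favors long jobs. Illustrative values only.
WAITS = {
    "A": [10, 10, 5_000, 5_000],
    "B": [500, 500, 1_000, 1_000],
}


def mean_response_time(runtimes, waits):
    """Average of (wait + runtime); long jobs dominate the sum."""
    return mean(w + r for r, w in zip(runtimes, waits))


def mean_slowdown(runtimes, waits):
    """Average of (wait + runtime) / runtime; short jobs dominate the sum."""
    return mean((w + r) / r for r, w in zip(runtimes, waits))


def preferred(metric):
    """Return the scheduler with the lower (better) value of the given metric."""
    return min(WAITS, key=lambda name: metric(RUNTIMES, WAITS[name]))


if __name__ == "__main__":
    for metric in (mean_response_time, mean_slowdown):
        print(metric.__name__, "prefers scheduler", preferred(metric))
```

With these made-up numbers, the response-time metric prefers the scheduler that favors long jobs and the slowdown metric prefers the one that favors short jobs, so the conclusion depends on the metric rather than on the schedulers alone.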