Performance evaluations of large-scale systems require representative workloads with certifiably similar or dissimilar characteristics. To quantify the similarity of these characteristics, we describe a novel measure comprising two efficient methods that are suitable for large-scale workloads. The first method uses the discrete wavelet transform to assess the periodic time and frequency characteristics of the workload. The second method evaluates dependencies among descriptive attributes via association rule learning. Both methods are evaluated to find the limits of their similarity spaces. Additionally, the wavelet method is evaluated against existing similarity methods and tested for noise robustness and random bias. An empirical study using workloads from seven operational large-scale systems evaluates the measure's accuracy. The results show that our measure is highly resistant to noise, is well suited for large-scale workloads, covers 87% of the possible similarity space, and improves accuracy by 24.5% and standard deviation by 10.8% when compared to existing work.
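To make the wavelet idea concrete, the following is a minimal sketch of how two workload time series might be compared through a discrete wavelet transform. It is not the measure described above: the choice of the Haar wavelet, the per-level energy profile, and the function names `haar_dwt`, `energy_profile`, and `wavelet_similarity` are all assumptions made for illustration.

```python
import math


def haar_dwt(signal):
    """Full Haar decomposition of a power-of-two-length sequence.

    Returns (approximation, details), where details[0] holds the finest
    (highest-frequency) coefficients and details[-1] the coarsest.
    """
    n = len(signal)
    if n == 0 or n & (n - 1):
        raise ValueError("signal length must be a power of two")
    approx = [float(v) for v in signal]
    details = []
    scale = math.sqrt(2.0)
    while len(approx) > 1:
        pairs = list(zip(approx[0::2], approx[1::2]))
        details.append([(a - b) / scale for a, b in pairs])
        approx = [(a + b) / scale for a, b in pairs]
    return approx, details


def energy_profile(signal):
    """Fraction of detail energy at each decomposition level (assumed feature)."""
    _, details = haar_dwt(signal)
    energies = [sum(c * c for c in level) for level in details]
    total = sum(energies)
    if total == 0.0:
        return [0.0] * len(energies)
    return [e / total for e in energies]


def wavelet_similarity(x, y):
    """Illustrative similarity score: 1 minus half the L1 distance
    between the two per-level energy profiles."""
    if len(x) != len(y):
        raise ValueError("signals must have the same length")
    px, py = energy_profile(x), energy_profile(y)
    return 1.0 - 0.5 * sum(abs(a - b) for a, b in zip(px, py))


# Hypothetical arrival counts per time slot for two workloads.
fast = [1, 0, 1, 0, 1, 0, 1, 0]   # repeats every 2 slots
slow = [1, 1, 0, 0, 1, 1, 0, 0]   # repeats every 4 slots
same = wavelet_similarity(fast, fast)
diff = wavelet_similarity(fast, slow)
```

In this sketch, two series whose energy is concentrated at the same scales score close to 1, while series that are periodic at different scales score lower. A real workload comparison would also need to handle series whose lengths are not powers of two, and the source does not say which wavelet or distance it uses.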