Skip to Main Content
Benchmarks are essential for computer architecture research and performance evaluation. Constructing a good benchmark suite is, however, non-trivial: it must be representative, show different types of behavior and the benchmarks should not be easily tweaked. This paper uses principal components analysis, a statistical data analysis technique, to detect differences in behavior between benchmarks. Two specific types of benchmarks are identified. Eccentric benchmarks have a behavior that differs significantly from the other benchmarks. They are useful to incorporate different types of behavior in a suite. Fragile benchmarks are weak benchmarks: their execution time is determined almost entirely by a single bottleneck. Removing that bottleneck reduces their execution time excessively. This paper argues that fragile benchmarks are not useful and shows how they can be detected by means of workload characterization techniques. These techniques are applied to the SPEC CPU95 and CPU2000 benchmark suites. It is shown that these suites contain both eccentric and fragile benchmarks. The notions of eccentric and fragile benchmarks are important when composing a benchmark suite and to guide the sub-setting of a benchmark suite.