Skip to Main Content
Several commercial processors have architectures that include support for simultaneous multithreading (SMT), yet there is still not a validated methodology for estimating the performance of an SMT machine that does not rely on full program simulation. To create an efficient sampling approach for SMT we must determine how far to fast-forward each individual thread between samples. The fast-forwarding distance for each thread will vary according to execution phases, thread interactions and changes to the architectural configuration. We examine using individual program phase information to guide SMT simulation. This is accomplished by creating what we call a co-phase matrix. The co-phase matrix is populated by collecting samples of the programs' phase combinations, and is used to guide fastforwarding between samples. We show for 28 pairs of SPEC programs that using the co-phase matrix provides an average error rate of 4% while requiring that only 1% of the full simulation be performed. The methods are also validated using a variety of architectural configurations and four-threaded workloads.
Date of Conference: 2004