We present a framework for analyzing the performance of multithreaded programs using a model called a constraint graph. We review previous constraint graph definitions for sequentially consistent systems and extend these definitions for use in analyzing other memory consistency models. Using this framework, we present two constraint graph analysis case studies based on several commercial and scientific workloads running on a full-system simulator. The first case study illustrates how a constraint graph can be used to determine the necessary conditions for implementing a memory consistency model, rather than conservative sufficient conditions. Using this method, we classify coherence misses as either required or unnecessary. We determine that, on average, one half of all load instructions that suffer cache misses due to coherence activity are stalled unnecessarily, because the original copy of the cache line could have been used without violating the memory consistency model. The second case study demonstrates the effects of memory consistency constraints on the fundamental limits of instruction-level parallelism, compared to previous estimates, which did not include multiprocessor constraints. Using this method, we determine the fundamental performance differences among various memory consistency models for processors that do not perform consistency-related speculation.
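As a rough illustration of the constraint graph idea for the sequentially consistent case, the sketch below treats each dynamic memory operation as a node, adds directed edges for program order and for read-after-write and write-after-read dependences, and checks the graph for a cycle; in the usual formulation an execution is sequentially consistent only if this graph is acyclic. The node names, the edge lists, and the `has_cycle` helper are assumptions made for this example and are not taken from the paper's tooling or workloads.

```python
from collections import defaultdict


def has_cycle(edges):
    """Return True if the directed graph given as (src, dst) pairs has a cycle."""
    graph = defaultdict(list)
    nodes = set()
    for src, dst in edges:
        graph[src].append(dst)
        nodes.update((src, dst))

    WHITE, GRAY, BLACK = 0, 1, 2  # unvisited, on the DFS stack, finished
    color = {n: WHITE for n in nodes}

    for start in nodes:
        if color[start] != WHITE:
            continue
        color[start] = GRAY
        stack = [(start, iter(graph[start]))]
        while stack:
            node, successors = stack[-1]
            for nxt in successors:
                if color[nxt] == GRAY:
                    return True  # back edge: the constraints form a cycle
                if color[nxt] == WHITE:
                    color[nxt] = GRAY
                    stack.append((nxt, iter(graph[nxt])))
                    break
            else:
                color[node] = BLACK
                stack.pop()
    return False


# Hypothetical two-processor execution (all names are illustrative):
#   P0: store x; load y        P1: store y; load x
program_order = [("P0.st_x", "P0.ld_y"), ("P1.st_y", "P1.ld_x")]

# Case 1: both loads observe the initial values, so each load must be
# ordered before the other processor's store (write-after-read edges).
both_loads_stale = program_order + [
    ("P0.ld_y", "P1.st_y"),
    ("P1.ld_x", "P0.st_x"),
]

# Case 2: P1's load observes P0's store (a read-after-write edge instead).
one_load_fresh = program_order + [
    ("P0.ld_y", "P1.st_y"),
    ("P0.st_x", "P1.ld_x"),
]

stale_is_cyclic = has_cycle(both_loads_stale)
fresh_is_cyclic = has_cycle(one_load_fresh)
```

In this sketch the first edge set is cyclic, so that outcome could not arise under sequential consistency, while the second is acyclic. Under the same assumptions, a load that missed because of coherence activity could be judged unnecessarily stalled if adding the edges implied by reading the original cache line copy would leave the graph acyclic.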