We present a novel hardware mechanism for dynamic program phase detection in distributed shared-memory (DSM) multiprocessors. We show that successful hardware mechanisms for phase detection in uniprocessors do not necessarily work well in DSM systems, since they lack the ability to incorporate the parallel application's global execution information and memory access behavior based on data distribution. We then propose a hardware extension to a well-known uniprocessor mechanism that significantly improves phase detection in the context of DSM multiprocessors. The resulting mechanism is modest in size and complexity, and is transparent to the parallel application
Published in:
Parallel and Distributed Processing Symposium, 2006. IPDPS 2006. 20th International
Date of Conference: 25-29 April 2006